boothead.s


Name: Christos Karayiannis
Email: christos@kar.forthnet.gr
Date: Dec 05 2005 17:56:24 GMT
Subject: line 3074
Response: The comment on line 3074 asks about the "! Round down to even" code comment. This memory address should be even because the stack pointer is placed there. It is better to align sp on an even address because the values pushed onto the stack are usually 16-bit register values, word-sized variables, and so on.
 
 
 
3000   !       Boothead.s - BIOS support for boot.c            Author: Kees J. Bot
3001   !
3002   !
3003   ! This file contains the startup and low level support for the secondary
3004   ! boot program.  It contains functions for disk, tty and keyboard I/O,
3005   ! copying memory to arbitrary locations, etc.
3006   !
3007   ! The primary bootstrap code supplies the following parameters in registers:
These are the same values that were passed into bootblock.s.
3008   !       dl      = Boot-device.
3009   !       es:si   = Partition table entry if hard disk.
3010   !
3011
3012   .define begtext, begdata, begbss
3013   .data
3014   begdata:
3015           .ascii  "(null)\0"      ! Just in case someone follows a null pointer
3016   .bss
3017   begbss:
3018
3019           o32         =     0x66  ! This assembler doesn't know 386 extensions
In the Makefile, boothead.s is assembled with the -mi86 option (LD86 contains the mi86 option).  This option selects the machine instructions (mi) of the 8086, which does not have 32-bit registers (like eax, ebx, etc.).  If an instruction needs a 32-bit operand, it must be prefixed with the byte 0x66, which is the operand-size override prefix on the 80386 and later processors.

Look at line 3933.  If the -mi86 option is used and the retf instruction has no prefix, the instruction jumps to the address specified by the last 2 bytes on the stack (the offset) and the next-to-last 2 bytes on the stack (the segment).  However, if the last 4 bytes on the stack are the offset and the next-to-last 4 bytes on the stack are the segment and the -mi86 option is used, the instruction must be prefixed with 0x66.  On lines 3922-3925, these 8 bytes are pushed on the stack.

3020           BOOTOFF     =   0x7C00  ! 0x0000:BOOTOFF load a bootstrap here
3021           LOADSEG     =   0x1000  ! Where this code is loaded.
3022           BUFFER      =   0x0600  ! First free memory
The bootstrap (which is bootblock.s) loaded this code (the secondary boot loader) at address 0x1000:0x0000.  If the user wishes to boot a different partition, the bootstrap from that partition is loaded at address 0x0000:0x7C00 and the boot process repeats itself (the bootstrap loads the secondary boot loader which loads the kernel). masterboot.s and bootblock.s describe this process in greater detail.
3023           PENTRYSIZE  =       16  ! Partition table entry size.
3024           a_flags     =        2  ! From a.out.h, struct exec
In your book, look at line 01400.  This is the header file a.out.h.  The first thing declared in this file is the struct exec.  All minix executables (with a few exceptions like bootblock and masterboot - these 2 files must begin with executable code) begin with headers.

a_flags is at an offset of 2 bytes, a_text is at an offset of 8 bytes, and so on.  a_flags describes the kernel (with the options shown on lines 3029-3033) and a_text, a_data, a_bss, and a_total are sizes.

Note that the A_SEP flag describes this executable (the secondary boot loader) whereas the K_I386, K_RET, K_INT86, and K_MEML flags describe the kernel.
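As a rough illustration of these offsets, the following Python sketch pulls a_flags, a_text, a_data, a_bss and a_total out of a raw header.  The offsets come from lines 3024-3028.  The field widths (one byte for a_flags, 32-bit little-endian for the sizes) are assumptions based on how the code accesses them (movb, +0 and +2), and the function name parse_exec_header and the sample header are made up for the example.

```python
import struct

# Offsets taken from lines 3024-3028 of boothead.s.
A_FLAGS, A_TEXT, A_DATA, A_BSS, A_TOTAL = 2, 8, 12, 16, 24
A_SEP = 0x20  # separate I&D flag (line 3029)


def parse_exec_header(header):
    """Extract the header fields that boothead.s reads.

    Assumes a_flags is one byte and the size fields are 32-bit
    little-endian values.
    """
    fields = {"a_flags": header[A_FLAGS]}
    for name, offset in (("a_text", A_TEXT), ("a_data", A_DATA),
                         ("a_bss", A_BSS), ("a_total", A_TOTAL)):
        (fields[name],) = struct.unpack_from("<L", header, offset)
    fields["separate_id"] = bool(fields["a_flags"] & A_SEP)
    return fields


# Made-up 32-byte header, for illustration only.
sample = bytearray(32)
sample[A_FLAGS] = A_SEP
struct.pack_into("<L", sample, A_TEXT, 0x3000)
struct.pack_into("<L", sample, A_TOTAL, 0x8000)
print(parse_exec_header(bytes(sample)))
```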

3025           a_text      =        8
3026           a_data      =       12
3027           a_bss       =       16
3028           a_total     =       24
3029           A_SEP       =     0x20  ! Separate I&D flag
Read section 4.7.1 and the first 10 paragraphs of section 4.7.3 of Operating Systems and try to understand as much as you can.  Some of the terminology may be unfamiliar so I will give a short description of the concepts involved.

This executable (the secondary boot) is compiled with the -mi86 option and runs in real mode and not in protected mode.  For this reason, the secondary boot is not able to take advantage of the protection features of protected mode.  However, since this is the first time we've run into the A_SEP flag, it's a good place to discuss shared vs. separate segments.

In protected mode, the text (code) and the data+bss+heap+stack (I will refer to this as the total data - see the next paragraph for a description of each of these) in an executable with separate text and total data segments are protected from one another.  For example, if the code tries to jump to a memory address that's within the total data segment, the hardware triggers a segment violation.  If they're not separate (A_SEP in a_flags is not set), chaos results.  Another advantage of separating the text and total data is that the text can be shared among multiple instances of the same program.  The total data will differ between two instances of the same program but the text will be the same.

Data contains initialized global variables, bss contains uninitialized global variables and must be initialized to zero (see lines 3091-3098), and the heap is the memory that malloc() allocates at run-time.

It's best to also keep the data+bss+heap and the stack separate - although Minix doesn't separate the two for the reasons given in section 4.7.3.  This means that if the heap or the stack grows too large, one can overwrite the other.  If the stack overwrites the heap and the overwritten data is not accessed immediately, identifying the problem is difficult.

On disk, the a_text field in the header holds the size of the text and the a_data field holds the size of the data.  If the executable doesn't have separate text and total data segments, a_text is added to a_data and a_text is set to zero (see lines 3069-3071).  Note that these changes are made to the copy of the header in memory; they do not affect the values on disk. a_bss is the size of the bss.  a_total is the size of the data+bss+heap+stack (separate) or the text+data+bss+heap+stack (shared).  Unlike a_text, it doesn't need to be modified if the text and total data are shared. a_total determines the top of the stack (see lines 3075-3077) and is also used (with a_text) to determine the global variable _runsize (see lines 3127-3135) which is needed by boot.c in initialize().

3030           K_I386      =   0x0001  ! Call Minix in 386 mode
If the K_I386 flag is set for the kernel, this code must switch to protected mode.
3031           K_RET       =   0x0020  ! Returns to the monitor on reboot
Look at lines 3936 and 3942.  The Minix kernel returns there on a halt or reboot if the K_RET flag is set for the kernel.  If the K_RET flag is not set, the system simply halts.

3032           K_INT86     =   0x0040  ! Requires generic INT support

3033           K_MEML      =   0x0080  ! Pass a list of free memory
The variable _mem (see line 3048) is used to pass this memory list.  The int 0x12 (see line 3141) and int 0x15 (see lines 3152 and 3157) bios calls are used to determine the low memory and high memory size.

3034

3035           DS_SELECTOR =      3*8  ! Kernel data selector
3036           ES_SELECTOR =      4*8  ! Flat 4 Gb
3037           SS_SELECTOR =      5*8  ! Monitor stack
3038           CS_SELECTOR =      6*8  ! Kernel code
3039           MCS_SELECTOR=      7*8  ! Monitor code
To support protected-mode segmentation (and with it multitasking), the 80286 and up use global descriptor tables (GDTs).  p_gdt (line 4242) is the descriptor table.  Anything that is labeled UNSET must be filled in before the global descriptor table is loaded using the lgdt instruction (see line 4133).  These values are filled in on lines 3871-3897.

The selector values on lines 3035-3039 are the offsets of the entries within the global descriptor table.  For example, since the entry for the kernel code is the 7th entry (see line 4267) and the size of each entry is 8 bytes, its offset is 6*8 (remember that the first entry has a 0 offset).  The MCS_SELECTOR is pushed onto the stack (if the K_RET flag is set for the kernel) before jumping to the kernel (look at lines 3918-3920).  Also before the jump is made to the kernel, the ds and es registers are loaded with DS_SELECTOR and ES_SELECTOR, respectively.
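The index-times-8 rule can be checked with a few lines of Python.  This is only a sketch of the arithmetic; the function name selector is made up for the example, and the indices are the zero-based ones used on lines 3035-3039.

```python
DESCRIPTOR_SIZE = 8  # bytes per global descriptor table entry


def selector(index):
    """Byte offset of the GDT entry with the given zero-based index."""
    return index * DESCRIPTOR_SIZE


# Zero-based indices matching lines 3035-3039.
selectors = {
    "DS_SELECTOR": selector(3),
    "ES_SELECTOR": selector(4),
    "SS_SELECTOR": selector(5),
    "CS_SELECTOR": selector(6),   # 7th entry, so index 6
    "MCS_SELECTOR": selector(7),
}
for name, value in selectors.items():
    print(f"{name:13s} = 0x{value:02X}")
```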

3040

3041           ESC         =     0x1B  ! Escape character
0x1B is the ASCII representation of ESC.

3042

3043   ! Imported variables and functions:
Memory for a variable can be allocated in only one file (i.e. the variable is "defined"), and the variable must be declared as extern in every other file that accesses it.  To accomplish this, the macro EXTERN is #defined as the empty string in boot.c.  Because EXTERN is already defined when boot.h is #included in boot.c, boot.h does not #define it as extern, so the declarations in boot.h become definitions in boot.c.  boot.h is also #included in bootimage.c.  Since EXTERN is not #defined there, boot.h #defines it as extern, and the same declarations become extern declarations in bootimage.c.  This mechanism ensures that memory for a variable is allocated only once.

A similar trick is used in the kernel.  Read the 5th paragraph of section 2.6.3 of Operating Systems for details.

Variables that are shared between assembler and C code are prefixed with an underscore ( _ ) in the assembler code but are not prefixed with an underscore in the C code.

3044   .extern _caddr, _daddr, _runsize, _edata, _end  ! Runtime environment
_caddr is the absolute address of the first byte of the text. _daddr is the absolute address of the first byte of the data. _runsize is the size of the entire executable (text+data+bss+heap+stack). I believe that _edata and _end are variables that are generated by the compiler. _edata is the offset address of the end of the data and _end is the offset address of the end of the bss.  These two variables are used on lines 3091-3098.  See the comment for line 3145 for further discussion of _edata and _end.
3045   .extern _device                                 ! BIOS device number
3046   .extern _rem_part                               ! To pass partition info
3047   .extern _k_flags                                ! Special kernel flags
_k_flags contains the K_I386, K_RET, K_INT86, and K_MEML flags (lines 3030-3033).  _k_flags is set in bootimage.c.
3048   .extern _mem                                    ! Free memory list
3049
3050   .text
3051   begtext:
3052   .extern _boot, _printk                          ! Boot Minix, kernel printf
These functions are defined in boot.c. boot is called on line 3180.
3053
3054   ! Set segment registers and stack pointer using the programs own header!
3055   ! The header is either 32 bytes (short form) or 48 bytes (long form).  The
3056   ! bootblock will jump to address 0x10030 in both cases, calling one of the
3057   ! two jmpf instructions below.
3058
3059           jmpf    boot, LOADSEG+3 ! Set cs right (skipping long a.out header)
3060           .space  11              ! jmpf + 11 = 16 bytes
3061           jmpf    boot, LOADSEG+2 ! Set cs right (skipping short a.out header)
3062   boot:
Whether this code has a short header or a long header, the second instruction executed (after the first jump) is at address boot.
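The arithmetic behind the two jmpf instructions on lines 3059-3061 can be sketched in Python.  This is only a model built from the numbers in the code and its comments (LOADSEG, the 32- and 48-byte header sizes, and the entry address 0x10030); the function name text_start is made up for the example.

```python
LOADSEG = 0x1000   # line 3021: segment where this code is loaded
ENTRY = 0x10030    # absolute address the bootblock jumps to


def text_start(header_len):
    """Absolute address of the first text byte after the a.out header."""
    return (LOADSEG << 4) + header_len


# (header size in bytes, value added to LOADSEG by the matching jmpf)
for header_len, seg_adjust in ((48, 3), (32, 2)):
    start = text_start(header_len)
    entry_offset = ENTRY - start            # text offset where execution begins
    cs_base = (LOADSEG + seg_adjust) << 4   # base of the cs loaded by jmpf
    print(f"header={header_len} text=0x{start:05X} "
          f"entry offset={entry_offset} cs base=0x{cs_base:05X}")
```

With the long header, execution begins at the first jmpf (text offset 0); with the short header, it begins at the second jmpf (text offset 16).  In both cases the new cs base equals the start of the text, so the label boot has the same offset either way.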

Before boot is called on line 3180, a few things are done.  (Don't confuse the two boot's; one's an address (line 3062) and the other's a function defined in boot.c (line 3180).)

Lines 3062-3080: The ds, ss, and sp registers are loaded.  The values loaded depend on whether this executable has a separate text and total data (A_SEP in a_flags is set) or this executable has a shared text and total data (A_SEP in a_flags is not set).

Lines 3091-3098:  Clear the bss.  The bss contains uninitialized global variables and needs to be zeroed.

Lines 3100-3135:  Initialize various global variables so that when boot (line 3180) is called, the C code can access their values.

Lines 3137-3177:  Initialize the array mem[].

3063           mov     ax, #LOADSEG
3064           mov     ds, ax          ! ds = header
What's the pound (#) sign all about?  The pound sign indicates that the value of LOADSEG is moved into the register rather than the contents of the memory location LOADSEG.

Why can't the instruction mov ds, #LOADSEG be used instead of using ax as an intermediate register?  The 80x86 processors forbid immediate data to segment register transfers.  (Immediate data is data that is within the instruction itself, as opposed to data that is at a memory location or data that is in a register.)  A segment register can be loaded only from a general register or a memory operand, or popped from the stack.  The cs register is even more restrictive: it cannot be the destination of a mov at all.  It changes only through control transfers such as a far jump (jmpf), a far call, a far return (retf), or an interrupt.

3065
3066           movb    al, a_flags
3067           testb   al, #A_SEP      ! Separate I&D?
3068           jnz     sepID
testb sets the zero flag if A_SEP is not set in a_flags.  If the zero flag is not set (A_SEP is set), then jnz jumps to sepID.
3069   comID:  xor     ax, ax
This instruction zeroes the ax register (any number xor'ed with itself is zero).  This is a pretty common practice.  The instruction

mov ax, #0

is slower and is 3 bytes compared with xor's 2 bytes.

3070           xchg    ax, a_text      ! No text
3071           add     a_data, ax      ! Treat all text as data
3072   sepID:
3073           mov     ax, a_total     ! Total nontext memory usage

3074           and     ax, #0xFFFE     ! Round down to even
I'm not sure why we do this.  However, since the size of the stack is arbitrary and there should be plenty of room to spare, rounding down to an even value shouldn't be a problem.  The efficiency of transferring 2 bytes from an even memory address may be greater than transferring 2 bytes from an odd memory address.
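The effect of the mask itself is easy to see in a short Python sketch (the function name round_down_to_even is made up for the example): anding with 0xFFFE clears bit 0, so an odd value drops by one and an even value is unchanged.

```python
def round_down_to_even(ax):
    """Clear bit 0 of a 16-bit value, as `and ax, #0xFFFE` does."""
    return ax & 0xFFFE


for value in (0x7FFF, 0x8000, 0x0001):
    print(hex(value), "->", hex(round_down_to_even(value)))
```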

3075           mov     a_total, ax     ! total - text = data + bss + heap + stack

3076           cli                     ! Ignore interrupts while stack in limbo
Whenever a value is moved into the stack segment register (ss) or the stack pointer (sp), interrupts must first be disabled.  When an interrupt occurs, the processor pushes the flags and the return address onto the stack at ss:sp.  If the ss and sp registers are in flux, those values are written to an unpredictable location and one can't predict where the code will return.

Interrupts are disabled with the cli (clear interrupts) instruction and reenabled with the sti (set interrupts) instruction.

3077           mov     sp, ax          ! Set sp at the top of all that
3078
3079           mov     ax, a_text      ! Determine offset of ds above cs
3080           movb    cl, #4
3081           shr     ax, cl
3082           mov     cx, cs
3083           add     ax, cx
3084           mov     ds, ax          ! ds = cs + text / 16
Each segment register (cs, ds, es, ss, etc.) is internally shifted left by 4 bits (in effect, a hexadecimal 0 digit is appended) before being added to an offset register (like ip or bx) to form an address.  For example, if the cs register holds the value 0x1000 and the ip register holds the value 0x1000, then together these registers point to address 0x11000.  So if we wish to add an offset (in bytes) to a segment register, we must first shift the offset 4 bits to the right (lines 3080-3081).
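A small Python model of this address arithmetic may help.  The function names linear and ds_above_cs and the sample values for cs and a_text are made up for the example; the model is exact only when the text size is a multiple of 16, since the shift discards the low 4 bits.

```python
def linear(segment, offset):
    """Real-mode address: segment shifted left 4 bits, plus the offset."""
    return (segment << 4) + offset


def ds_above_cs(cs, a_text):
    """Model of lines 3079-3084: ds = cs + text / 16, as a 16-bit value."""
    return (cs + (a_text >> 4)) & 0xFFFF


print(hex(linear(0x1000, 0x1000)))   # the example from the text

cs, a_text = 0x1003, 0x3000          # made-up values
ds = ds_above_cs(cs, a_text)
# ds:0 and cs:a_text name the same byte: the first byte after the text.
print(hex(ds), hex(linear(ds, 0)), hex(linear(cs, a_text)))
```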
3085           mov     ss, ax
3086           sti                     ! Stack ok now
3087           push    es              ! Save es, we need it for the partition table
This value is popped into the upper 2 bytes of _rem_part on line 3105.
3088           mov     es, ax
3089           cld                     ! C compiler wants UP
3090
3091   ! Clear bss
3092           xor     ax, ax          ! Zero
3093           mov     di, #_edata     ! Start of bss is at end of data
3094           mov     cx, #_end       ! End of bss (begin of heap)
_edata and _end are variables that are set by the compiler.  _edata is the offset address of the end of the data and _end is the offset address of the end of the bss.

3095           sub     cx, di          ! Number of bss bytes
3096           shr     cx, #1          ! Number of words

3097           rep
3098           stos                    ! Clear bss
The instruction prefix rep repeats the instruction (in this case stos) cx times.  stos stores ax at the memory address es:di.  Since stos stores words and not bytes, cx must be shifted to the right by 1 (in other words, divided by 2).
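The loop on lines 3092-3098 can be modeled in Python as follows.  This is a sketch under simple assumptions: memory is a plain bytearray standing in for the segment addressed by es, and the function name clear_bss is made up for the example.

```python
def clear_bss(memory, edata, end):
    """Zero memory[edata:end] one 16-bit word at a time, like rep stos."""
    di = edata
    cx = (end - di) >> 1           # number of words (line 3096)
    while cx:                      # rep: repeat cx times
        memory[di:di + 2] = b"\x00\x00"   # stos: store ax (zero) at es:di
        di += 2                    # direction flag is clear, so di moves up
        cx -= 1


memory = bytearray(b"\xff" * 16)   # made-up memory contents
clear_bss(memory, 4, 10)
print(memory.hex())
```

Because the byte count is halved with a shift, an odd number of bss bytes would leave the last byte untouched in this model.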
3099
3100   ! Copy primary boot parameters to variables.  (Can do this now that bss is
3101   ! cleared and may be written into).
Since _device and _rem_part are uninitialized global variables, they are stored in the bss.

3102           xorb    dh, dh
3103           mov     _device, dx     ! Boot device (probably 0x00 or 0x80)
3104           mov     _rem_part+0, si ! Remote partition table offset
3105           pop     _rem_part+2     ! and segment (saved es)
3106
3107   ! Remember the current video mode for restoration on exit.
3108           movb    ah, #0x0F       ! Get current video mode

3109           int     0x10
int 0x10, ah=0x0F returns the current video mode in al.  Some examples of return values are al=0x13 (VGA, 320x200 resolution, 256 colors), al=0x12 (VGA, 640x480, 16 colors), and al=0x0E (EGA, 640x200, 16 colors).

I don't know what "blanking" is.  If you know, please submit a comment to the site which will be displayed below.

Name: Christos Basil Karayiannis - Karditsa GR
Email: christos@kar.forthnet.gr
Date: Feb 01 2004 16:42:09 GMT
Subject: 'Blanking', what is?
Response: On line 3110, 0x7F is binary 01111111, and the andb instruction masks off bit 7 of the video mode value returned in al by the interrupt.  When the video mode is set (int 0x10, ah=0x0), this bit leaves the video buffer as-is if it is set and clears the display if it is zero.  When read back with the call on lines 3108-3109 (get video mode), it shows whether or not the last mode set cleared the video buffer.  Hence the term 'blanking'.
 
 
 
3110           andb    al, #0x7F       ! Mask off bit 7 (no blanking)
3111           movb    old_vid_mode, al
3112           movb    cur_vid_mode, al
3113
3114   ! Give C code access to the code segment, data segment and the size of this
3115   ! process.
3116           xor     ax, ax
3117           mov     dx, cs
3118           call    seg2abs
seg2abs (line 3222) converts a segment:offset address in dx:ax to an absolute address in dx-ax.  Note that the notation dx-ax does not mean dx minus ax.  It means that the lower 2 bytes are in ax and the upper 2 bytes are in dx.  This notation is used in other places in the code (for example, lines 3227-3243).
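The conversion can be modeled in Python.  This is a model of what the annotation describes, not a transcription of the code at line 3222, and the sample segment value is made up.

```python
def seg2abs(dx, ax):
    """Convert segment dx and offset ax to an absolute address.

    Returns (dx, ax) with the upper 16 bits of the address in dx and
    the lower 16 bits in ax (the "dx-ax" notation).
    """
    absolute = (dx << 4) + ax
    return (absolute >> 16) & 0xFFFF, absolute & 0xFFFF


# As on lines 3116-3118: offset 0 in the code segment (made-up cs value).
dx, ax = seg2abs(0x1003, 0x0000)
print(f"dx=0x{dx:04X} ax=0x{ax:04X}")
```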
3119           mov     _caddr+0, ax
3120           mov     _caddr+2, dx
3121           xor     ax, ax
3122           mov     dx, ds
3123           call    seg2abs
3124           mov     _daddr+0, ax
3125           mov     _daddr+2, dx
3126           push    ds
3127           mov     ax, #LOADSEG
3128           mov     ds, ax          ! Back to the header once more
3129           mov     ax, a_total+0
3130           mov     dx, a_total+2   ! dx:ax = data + bss + heap + stack
3131           add     ax, a_text+0
3132           adc     dx, a_text+2    ! dx:ax = text + data + bss + heap + stack
If this executable has a separate text and total data segment, a_text must be added to a_total to get the total size of the executable.  If it has a shared text and total data segment, a_total is the size of the text and the total data.  However, a_text was set to zero on line 3070 and can be added anyway and it won't matter.
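The add/adc pair on lines 3131-3132 is a 32-bit addition carried out in two 16-bit halves.  A Python sketch of the idea, with a made-up function name and made-up sizes:

```python
def add32(lo_a, hi_a, lo_b, hi_b):
    """Add two 32-bit values held as (low word, high word) pairs."""
    lo = lo_a + lo_b
    carry = lo >> 16               # carry flag left by `add`
    hi = hi_a + hi_b + carry       # `adc` adds the carry to the high words
    return lo & 0xFFFF, hi & 0xFFFF


# a_total = 0x0000F000 and a_text = 0x00002000 (made-up sizes)
ax, dx = add32(0xF000, 0x0000, 0x2000, 0x0000)
print(f"runsize = 0x{dx:04X}{ax:04X}")
```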
3133           pop     ds
3134           mov     _runsize+0, ax
3135           mov     _runsize+2, dx  ! 32 bit size of this process
3136
3137   ! Determine available memory as a list of (base,size) pairs as follows:
3138   ! mem[0] = low memory, mem[1] = memory between 1M and 16M, mem[2] = memory
3139   ! above 16M.  Last two coalesced into mem[1] if adjacent.
The memory (base, size) pairs will look something like this:
mem[0]=(0x00000000, size of low memory)
mem[1]=(0x00100000, size of memory between 1M and 16M)
mem[2]=(0x01000000, size of memory greater than 16M)

If the mem[1] and mem[2] memory areas are contiguous, then mem[1] and mem[2] are combined.

Since mem[] is an uninitialized variable, it is found in the bss space, which was zeroed on lines 3091-3098.  The following instructions are not needed since these 4 bytes are already zero.

mov 0(di), #0
mov 2(di), #0

Also, since the lower 2 bytes of the base of both mem[1] and mem[2] are also zero, the following instructions are also not needed:

mov 8(di), #0
mov 16(di), #0

The lower 2 bytes of the low memory size are stored in 4(di) and the upper 2 bytes are stored in 6(di) (lines 3143-3144).  Likewise, 12(di) and 14(di) hold the size of the memory between 1M and 16M.  20(di) and 22(di) hold the size of the memory above 16M.  Since int 0x15, ax=0xE801 returns the number of 64K (not 1K) blocks of memory in bx (see line 3152), the size in bytes is a multiple of 0x10000 and its lower 2 bytes, 20(di), will equal 0.
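The shape of the list and the coalescing rule can be sketched in Python.  This is a hypothetical model, not the code on lines 3140-3177: the function name build_mem and its three arguments are made up, the arguments stand in for the values the BIOS calls report, and clearing mem[2] after a merge is an assumption.

```python
def build_mem(low_kb, mid_kb, high_64k_blocks):
    """Build the (base, size) list described on lines 3137-3139."""
    mem = [
        (0x00000000, low_kb * 1024),               # low memory
        (0x00100000, mid_kb * 1024),               # between 1M and 16M
        (0x01000000, high_64k_blocks * 0x10000),   # above 16M
    ]
    base1, size1 = mem[1]
    base2, size2 = mem[2]
    if size2 and base1 + size1 == base2:   # adjacent: coalesce into mem[1]
        mem[1] = (base1, size1 + size2)
        mem[2] = (0, 0)                    # assumption: entry left empty
    return mem


# Made-up machine: 640K low, a full 15M below 16M, 1M above 16M.
for base, size in build_mem(640, 15 * 1024, 16):
    print(f"base=0x{base:08X} size=0x{size:08X}")
```

Because the size above 16M is a count of 64K blocks times 0x10000, its lower 16 bits are always zero, which is the point made about 20(di).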

3140           mov     di, #_mem       ! di = memory list
3141           int     0x12            ! Returns low memory size (in K) in ax
3142           mul     c1024
c1024 is a memory address (see line 4207).  "c" stands for constant.  mul multiplies ax by the operand (in this case the value at the address specified by the operand) and puts the lower 2 bytes of the result in ax and the upper 2 bytes in dx.
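A Python sketch of the 16-bit multiply, using 640K as a made-up low memory size and a made-up function name mul16:

```python
def mul16(ax, operand):
    """Model of the 16-bit `mul`: returns (ax, dx) with dx:ax = ax * operand."""
    product = (ax & 0xFFFF) * (operand & 0xFFFF)
    return product & 0xFFFF, product >> 16


# 640K of low memory times 1024 gives 0x000A0000 bytes.
ax, dx = mul16(640, 1024)
print(f"dx=0x{dx:04X} ax=0x{ax:04X}")
```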
3143           mov     4(di), ax       ! mem[0].size = low memory size in bytes
3144           mov     6(di), dx
3145           call    _getprocessor
It's pretty obvious what _getprocessor does, but I can't find where it's defined.  It returns 86 in ax for an 8086, 286 for an 80286, 386 for an 80386 and so on.  It's possible that _getprocessor is a function that's supplied by the compiler (like I believe that _edata and _end are variables supplied by the compiler) but I'm not sure.  What leads me to believe that it's a function supplied by the compiler is that this code calls two other functions that are not defined in this file (boot is defined in boot.c and printk is declared in minix/minlib.h, which is #included in boot.c, and is part of the standard library) and both of these are declared as .extern on line 3052.  _getprocessor, _edata, and _end are neither defined nor declared in this file, suggesti