boothead.s
These are the same values that were passed into bootblock.s.
|
|
In Makefile, boothead.s is compiled with the -mi86 option (LD86 contains the mi86 option). This option selects the machine instructions (mi) of the 8086, which does not have 32-bit registers (like eax, ebx, etc.). If an instruction needs a 32-bit operand, the 8086 instruction must be prefixed with 0x66.
Look at line 3933. If the -mi86 option is used and the retf instruction has no prefix, the instruction jumps to the address given by the last 2 bytes pushed on the stack (the offset) and the 2 bytes pushed before them (the segment). However, if the offset and the segment each occupy 4 bytes on the stack and the -mi86 option is used, the instruction must be prefixed with 0x66. On lines 3922-3925, these 8 bytes are pushed on the stack.
The bootstrap (which is bootblock.s) loaded this code (the secondary boot loader) at address 0x1000:0x0000. If the user wishes to boot a different partition, the bootstrap from that partition is loaded at address 0x0000:0x7C00 and the boot process repeats itself (the bootstrap loads the secondary boot loader, which loads the kernel). masterboot.s and bootblock.s describe this process in greater detail.
In your book, look at line 01400. This is the header file a.out.h. The first thing declared in this file is the struct exec. All Minix executables begin with a header (with a few exceptions like bootblock and masterboot, which must begin with executable code).
a_flags is at an offset of 2 bytes, a_text is at an offset of 8 bytes, and so on. a_flags describes the kernel (with the options shown on lines 3029-3033) and a_text, a_data, a_bss, and a_total are sizes. Note that the A_SEP flag describes this executable (the secondary boot loader) whereas the K_I386, K_RET, K_INT86, and K_MEML flags describe the kernel.
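The offsets quoted above can be checked with a small C sketch. The field order and widths below are an assumption reconstructed from a.out.h (only a_flags at offset 2 and a_text at offset 8 come from the text), and the name exec_sketch is hypothetical, so compare it against the listing in the book before relying on it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Sketch of the short Minix a.out header. The layout is an assumption;
 * only the offsets of a_flags (2) and a_text (8) come from the text. */
struct exec_sketch {
    uint8_t  a_magic[2];  /* magic number */
    uint8_t  a_flags;     /* flags such as A_SEP */
    uint8_t  a_cpu;       /* cpu id */
    uint8_t  a_hdrlen;    /* length of the header in bytes */
    uint8_t  a_unused;    /* reserved */
    uint16_t a_version;   /* version stamp */
    uint32_t a_text;      /* size of the text */
    uint32_t a_data;      /* size of the initialized data */
    uint32_t a_bss;       /* size of the bss */
    uint32_t a_entry;     /* entry point */
    uint32_t a_total;     /* total memory allocated */
    uint32_t a_syms;      /* size of the symbol table */
};

/* Abort if the layout disagrees with the offsets given in the text. */
void check_header_layout(void)
{
    assert(offsetof(struct exec_sketch, a_flags) == 2);
    assert(offsetof(struct exec_sketch, a_text) == 8);
}
```

With this layout the short header is 32 bytes long, which is two 16-byte paragraphs.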
|
|
|
Read section 4.7.1 and the first 10 paragraphs of section 4.7.3 of Operating Systems and try to understand as much as you can. Some of the terminology may be unfamiliar so I will give a short description of the concepts involved.
This executable (the secondary boot) is compiled with the -mi86 option and runs in real mode, not in protected mode. For this reason, the secondary boot is not able to take advantage of the protection features of protected mode. However, since this is the first time we've run into the A_SEP flag, it's a good place to discuss shared vs. separate segments.
In protected mode, the text (code) and the data+bss+heap+stack (I will refer to this as the total data; each part is described below) in an executable with separate text and total data segments are protected from one another. For example, if the code tries to jump to a memory address that's within the total data segment, the hardware triggers a segment violation. If they're not separate (A_SEP in a_flags is not set), nothing stops such a jump and chaos results. Another advantage of separating the text and total data is that the text can be shared among multiple instances of the same program. The total data will differ between two instances of the same program but the text will be the same.
Data contains initialized global variables, bss contains uninitialized global variables and must be initialized to zero (see lines 3091-3098), and the heap is the memory that malloc() allocates at run-time. It's best to also keep the data+bss+heap and the stack separate, although Minix doesn't separate the two for the reasons given in section 4.7.3. This means that if the heap or the stack grows too large, one can overwrite the other. If the stack overwrites the heap and the overwritten data is not accessed immediately, identifying the problem is difficult.
On disk, the a_text field in the header holds the size of the text and the a_data field holds the size of the data. If this executable doesn't have separate text and total data segments, a_text is added to a_data and a_text is then set to zero (see lines 3069-3071). Note that these changes are made to the copy of the header in memory; they do not affect the values on disk. a_bss is the size of the bss. a_total is the size of the data+bss+heap+stack (separate) or the text+data+bss+heap+stack (shared). Unlike a_text, it doesn't need to be modified if the text and total data are shared. a_total determines the top of the stack (see lines 3075-3077) and is also used (with a_text) to determine the global variable _runsize (see lines 3127-3135), which is needed by boot.c in initialize().
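The header adjustment and the _runsize calculation described above can be sketched in C. This is only an illustration of the arithmetic: the struct sizes type and both function names are hypothetical, and the value 0x20 for A_SEP is an assumption that is not given in this section.

```c
#include <assert.h>
#include <stdint.h>

#define A_SEP 0x20  /* assumed value of the separate text/data flag */

/* Hypothetical in-memory copy of the size fields of the header. */
struct sizes { uint32_t a_text, a_data, a_bss, a_total; };

/* Mirrors lines 3069-3071: with shared text and data, treat text as data. */
void fold_shared_text(uint8_t a_flags, struct sizes *h)
{
    assert(h != 0);
    if (!(a_flags & A_SEP)) {
        h->a_data += h->a_text;  /* all text now counts as data */
        h->a_text = 0;           /* no separate text remains */
    }
}

/* Size of the whole executable: a_text + a_total after the adjustment. */
uint32_t run_size(const struct sizes *h)
{
    return h->a_text + h->a_total;
}
```

Because a_text is zero in the shared case, the same addition gives the right size in both cases.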
|
|
If the K_I386 flag is set for the kernel, this code must switch to protected mode.
|
|
Look at lines 3936 and 3942. The Minix kernel returns there on a halt or reboot if the K_RET flag is set for the kernel. If the K_RET flag is not set, the system simply halts.
|
3032 K_INT86 = 0x0040 ! Requires generic INT support
|
|
The variable _mem (see line 3048) is used to pass this memory list. The int 0x12 (see line 3141) and int 0x15 (see lines 3152 and 3157) BIOS calls are used to determine the low memory and high memory size.
|
3034 |
To support multiprocessing, the 80286 and up use global descriptor tables (GDTs). p_gdt (line 4242) is the descriptor table. Anything that is labeled UNSET must be filled in before the global descriptor table is loaded using the lgdt instruction (see line 4133). These values are filled in on lines 3871-3897.
The following values are the offsets of the entries within the global descriptor table. For example, since the entry for the kernel code is the 7th entry (see line 4267) and the size of each entry is 8 bytes, its offset is 6*8 (remember that the first entry has a 0 offset). The MCS_SELECTOR is pushed onto the stack (if the K_RET flag is set for the kernel) before jumping to the kernel (look at lines 3918-3920). Also, before the jump is made to the kernel, the ds and es registers are loaded with DS_SELECTOR and ES_SELECTOR, respectively.
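The offset arithmetic for the descriptor table entries can be written out as a short C sketch. The function name gdt_offset is hypothetical; the 8-byte entry size and the "7th entry is at 6*8" example come from the text above.

```c
#include <assert.h>
#include <stdint.h>

#define GDT_ENTRY_SIZE 8u  /* each descriptor occupies 8 bytes */

/* Byte offset of a GDT entry, counting entries from 1 as the text does. */
uint16_t gdt_offset(unsigned nth_entry)
{
    assert(nth_entry >= 1);
    return (uint16_t)((nth_entry - 1u) * GDT_ENTRY_SIZE);
}
```

For the kernel code entry, gdt_offset(7) is 48, which is the 6*8 mentioned above.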
|
3040 |
|
|
0x1B is the ASCII code for ESC.
|
3042 |
|
|
Memory for a variable can be allocated in only one file (i.e. the variable is "defined" there) but the variable must be declared as extern in every other file that accesses it. To accomplish this, the macro EXTERN is #defined as the empty string in boot.c before boot.h is #included. This prevents boot.h from #defining EXTERN as extern, so in boot.c each EXTERN declaration becomes a definition. boot.h is also #included in bootimage.c. Since EXTERN is not #defined there beforehand, boot.h #defines it as extern, and every EXTERN in bootimage.c is replaced by extern. This mechanism ensures that memory for a variable is allocated only once.
A similar trick is used in the kernel. Read the 5th paragraph of section 2.6.3 of Operating Systems for details. Variables that are shared between assembler and C code are prefixed with an underscore ( _ ) in the assembler code but are not prefixed with an underscore in the C code.
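Here is a minimal single-file sketch of the EXTERN mechanism. It plays the part of boot.c with the header pasted inline; the variable and function names (demo_device, demo_rem_part, set_demo_device) are hypothetical and are not the names used in boot.h.

```c
#include <assert.h>

/* boot.c would begin like this: EXTERN expands to nothing. */
#define EXTERN

/* What a boot.h-style header would contain. */
#ifndef EXTERN
#define EXTERN extern       /* files that did not define EXTERN get this */
#endif
EXTERN int demo_device;     /* here this reads "int demo_device;" */
EXTERN long demo_rem_part;  /* here this reads "long demo_rem_part;" */

/* Storage for demo_device exists only in this translation unit. */
void set_demo_device(int d)
{
    assert(d >= 0);
    demo_device = d;
}
```

A second file that includes the same header without defining EXTERN first would see "extern int demo_device;" and would refer to the storage allocated here.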
_caddr is the absolute address of the first byte of the text. _daddr is the absolute address of the first byte of the data. _runsize is the size of the entire executable (text+data+bss+heap+stack). I believe that _edata and _end are variables that are generated by the compiler. _edata is the offset address of the end of the data and _end is the offset address of the end of the bss. These two variables are used on lines 3091-3098. See the comment for line 3145 for further discussion of _edata and _end.
|
|
_k_flags contains the K_I386, K_RET, K_INT86, and K_MEML flags (lines 3030-3033). _k_flags is set in bootimage.c.
|
3048 .extern _mem ! Free memory list
3049
3050 .text
3051 begtext:
|
|
These functions are defined in boot.c. boot is called on line 3180.
| 3053 |
3058
3059 jmpf boot, LOADSEG+3 ! Set cs right (skipping long a.out header)
3060 .space 11 ! jmpf + 11 = 16 bytes
3061 jmpf boot, LOADSEG+2 ! Set cs right (skipping short a.out header)
|
|
|
Whether this code has a short header or a long header, the second instruction executed (after the first jump) is at address boot.
Before boot is called on line 3180, a few things are done. (Don't confuse the two boot's; one's an address (line 3062) and the other's a function defined in boot.c (line 3180).)
Lines 3062-3080: The ds, ss, and sp registers are loaded. The values loaded depend on whether this executable has a separate text and total data (A_SEP in a_flags is set) or a shared text and total data (A_SEP in a_flags is not set).
Lines 3092-3097: Clear the bss. The bss contains uninitialized global variables and needs to be zeroed.
Lines 3100-3135: Initialize various global variables so that when boot (line 3180) is called, the C code can access their values.
Lines 3137-3177: Initialize the array mem[].
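The LOADSEG+2 and LOADSEG+3 values on lines 3059 and 3061 follow from the header length: a segment value counts 16-byte paragraphs, so adding hdrlen/16 to LOADSEG makes offset 0 of cs the first byte of the text. The sketch below assumes a short header of 32 bytes and a long header of 48 bytes (inferred from the +2 and +3, not stated in this section); the function name text_segment is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define LOADSEG 0x1000u  /* segment where the bootstrap loaded this code */

/* Segment to load into cs so that offset 0 is the first byte of text.
 * hdrlen is the a.out header length in bytes (a multiple of 16). */
uint16_t text_segment(uint16_t hdrlen)
{
    assert(hdrlen % 16 == 0);
    return (uint16_t)(LOADSEG + (hdrlen >> 4));  /* bytes / 16 = paragraphs */
}
```

Under those assumptions, text_segment(32) gives LOADSEG+2 and text_segment(48) gives LOADSEG+3.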
|
|
|
What's the pound (#) sign all about? The pound sign indicates that the value of LOADSEG itself is moved into the register rather than the contents of the memory location LOADSEG.
Why can't the instruction mov ds, #LOADSEG be used instead of using ax as an intermediate register? The 80x86 processors forbid immediate data to segment register transfers. (Immediate data is data that is within the instruction itself, as opposed to data that is at a memory location or data that is in a register.) The one exception to loading segment registers with mov is the cs register, which is even more restrictive. It can't be the destination of a mov at all; it changes only through far control transfers such as the jmpf (far jump) and retf (far return) instructions.
|
3065
3066 movb al, a_flags
|
|
testb sets the zero flag if A_SEP is not set in a_flags. If the zero flag is not set (A_SEP is set), then jnz jumps to sepID.
|
|
This instruction zeroes the ax register (any number xor'ed with itself is zero). This is a pretty common practice. The instruction mov ax, #0 is slower and is 3 bytes compared with xor's 2 bytes.
|
3070 xchg ax, a_text ! No text
|
|
|
I'm not sure why we do this. However, since the size of the stack is arbitrary and there should be plenty of room to spare, rounding down to an even value shouldn't be a problem. The efficiency of transferring 2 bytes from an even memory address may be greater than transferring 2 bytes from an odd memory address.
|
3075 mov a_total, ax ! total - text = data + bss + heap + stack
|
|
|
Whenever a value is moved into the stack segment register (ss) or the stack pointer (sp), interrupts must first be disabled. When an interrupt occurs, the processor pushes the return address onto the stack at ss:sp. If the ss and sp registers are in flux, one can't predict where that address is written or where the code will return.
Interrupts are disabled with the cli (clear interrupt flag) instruction and reenabled with the sti (set interrupt flag) instruction.
|
3077 mov sp, ax ! Set sp at the top of all that
3078
Each segment register (cs, ds, es, ss, etc.) is internally shifted left by 4 bits (in other words, a hex digit 0 is appended) before being added to an offset register (like ip or ax) to form an address. For example, if the cs register holds the value 0x1000 and the ip register holds the value 0x1000, then together these registers point to address 0x11000. So if we wish to add an offset (in bytes) to a segment register, we must first shift the offset 4 bits to the right (line 3081).
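The address arithmetic above can be sketched in C. The function names are hypothetical; the first function reproduces the 0x1000:0x1000 example and the second shows the shift-right-by-4 step used to place ds above cs by the size of the text.

```c
#include <stdint.h>

/* Real-mode address formed from a segment register and an offset. */
uint32_t linear_address(uint16_t seg, uint16_t off)
{
    return ((uint32_t)seg << 4) + off;  /* segment * 16 + offset */
}

/* Segment value that lies text_bytes above cs: the byte count is
 * shifted right by 4 to convert it to 16-byte paragraphs first. */
uint16_t data_segment(uint16_t cs, uint16_t text_bytes)
{
    return (uint16_t)(cs + (text_bytes >> 4));
}
```

For example, linear_address(0x1000, 0x1000) is 0x11000, matching the example in the text.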
|
3085 mov ss, ax
3086 sti ! Stack ok now
|
|
This value is popped into the upper 2 bytes of _rem_part on line 3105.
|
3088 mov es, ax
3089 cld ! C compiler wants UP
3090
3091 ! Clear bss
3092 xor ax, ax ! Zero
|
|
|
|
_edata and _end are variables that are set by the compiler. _edata is the offset address of the end of the data and _end is the offset address of the end of the bss.
|
3095 sub cx, di ! Number of bss bytes
|
|
|
|
|
The instruction prefix rep repeats the instruction (in this case stos) cx times. stos stores ax at the memory address es:di. Since stos stores words and not bytes, cx must be shifted to the right by 1 (in other words, divided by 2).
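The bss-clearing loop can be modeled in C as follows. This is a sketch of what the rep stos sequence does, not a translation of the actual code: the segment is modeled as a byte array and the function name clear_bss is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Zero the region [edata, end) of a segment one 16-bit word at a time.
 * Returns the word count, which is the value cx holds before rep stos. */
uint16_t clear_bss(uint8_t *segment, uint16_t edata, uint16_t end)
{
    assert(end >= edata);
    uint16_t words = (uint16_t)((end - edata) >> 1);  /* bytes / 2 */
    uint16_t di = edata;
    for (uint16_t cx = words; cx != 0; cx--) {
        segment[di] = 0;      /* low byte of ax (ax is zero) */
        segment[di + 1] = 0;  /* high byte of ax */
        di += 2;              /* direction flag is clear, so di moves up */
    }
    return words;
}
```

Note that the shift discards a remainder, so in this model an odd byte count would leave the last byte untouched.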
|
|
|
|
Since _device and _rem_part are uninitialized global variables, they are stored in the bss.
|
3102 xorb dh, dh
|
|
|
int 0x10, ah=0x0F returns the current video mode in al. Some examples of return values are al=0x13 (VGA, 320x200 resolution, 256 colors), al=0x12 (VGA, 640x480, 16 colors), and al=0x0E (EGA, 640x200, 16 colors).
I don't know what "blanking" is.
3110 andb al, #0x7F ! Mask off bit 7 (no blanking)
3111 movb old_vid_mode, al
3112 movb cur_vid_mode, al
3113
3114 ! Give C code access to the code segment, data segment and the size of this
3115 ! process.
3116 xor ax, ax
3117 mov dx, cs
Line 3222 converts a segment:offset address in dx:ax to an absolute address in dx-ax. Note that the notation dx-ax does not mean dx minus ax. It means that the lower 2 bytes are in ax and the upper 2 bytes are in dx. This notation is used in other places in the code (for example, lines 3227-3243).
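The dx-ax notation can be illustrated with a C sketch of the conversion. The struct and function names are hypothetical, and the body shows only the arithmetic result described above, not how the assembler routine computes it.

```c
#include <stdint.h>

/* A 32-bit value held in two 16-bit registers. */
struct dx_ax {
    uint16_t dx;  /* upper 2 bytes */
    uint16_t ax;  /* lower 2 bytes */
};

/* Convert segment:offset to an absolute address split across dx-ax. */
struct dx_ax seg2abs_sketch(uint16_t seg, uint16_t off)
{
    uint32_t abs = ((uint32_t)seg << 4) + off;
    struct dx_ax r;
    r.ax = (uint16_t)(abs & 0xFFFFu);
    r.dx = (uint16_t)(abs >> 16);
    return r;
}

/* Reassemble the 32-bit value from the register pair. */
uint32_t join_dx_ax(struct dx_ax r)
{
    return ((uint32_t)r.dx << 16) | r.ax;
}
```

With cs = 0x1000 and a zero offset, the result is dx = 0x0001 and ax = 0x0000, which is the absolute address 0x00010000.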
|
3119 mov _caddr+0, ax
3120 mov _caddr+2, dx
3121 xor ax, ax
3122 mov dx, ds
3123 call seg2abs
3124 mov _daddr+0, ax
3125 mov _daddr+2, dx
3126 push ds
3127 mov ax, #LOADSEG
3128 mov ds, ax ! Back to the header once more
3129 mov ax, a_total+0
3130 mov dx, a_total+2 ! dx:ax = data + bss + heap + stack
If this executable has a separate text and total data segment, a_text must be added to a_total to get the total size of the executable. If it has a shared text and total data segment, a_total is already the size of the text and the total data. In that case a_text was set to zero on line 3070, so adding it anyway does no harm.
The memory (base, size) pairs will look something like this:
mem[0]=(0x00000000, size of low memory)
mem[1]=(0x00100000, size of memory between 1M and 16M)
mem[2]=(0x01000000, size of memory greater than 16M)
If the mem[1] and mem[2] memory areas are contiguous, then mem[1] and mem[2] are combined. Since mem[] is an uninitialized variable, it is found in the bss space, which was zeroed on lines 3091-3098. The following instructions are not needed since these 4 bytes are already zero. mov 0(di), #0
Also, since the lower 2 bytes of the base of both mem[1] and mem[2] are also zero, the following instructions are also not needed: mov 8(di), #0
The lower 2 bytes of the low memory size are stored in 4(di) and the upper 2 bytes are stored in 6(di) (lines 3143-3144). Likewise, 12(di) and 14(di) hold the size of the memory between 1M and 16M. 20(di) and 22(di) hold the size of the memory above 16M. Since int 0x15, ax=0xE801 returns the number of 64K (not 1K) blocks of memory in bx (see line 3152), 20(di) will equal 0.
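The memory list described above can be modeled in C. This is a sketch under stated assumptions: the struct memory type, the function name, and the parameter names are hypothetical, and the test for contiguity (exactly 15M reported at 1M) is my reading of what "contiguous" means here rather than something quoted from the listing.

```c
#include <assert.h>
#include <stdint.h>

/* One (base, size) pair; base is at byte 0 and size at byte 4. */
struct memory { uint32_t base; uint32_t size; };

/* Fill mem[0..2] from the BIOS results. mem[] must start out zeroed,
 * as the bss is. low_kb: K of low memory (int 0x12); ext_kb: K of
 * memory at 1M; high_64k: number of 64K blocks above 16M. */
void build_mem_list(struct memory mem[3], uint16_t low_kb,
                    uint16_t ext_kb, uint16_t high_64k)
{
    assert(mem != 0);
    mem[0].size = (uint32_t)low_kb * 1024u;      /* base stays 0 */
    if (ext_kb == 0) return;                     /* no extended memory */
    mem[1].base = 0x00100000u;                   /* 1M */
    mem[1].size = (uint32_t)ext_kb * 1024u;
    if (high_64k == 0) return;                   /* nothing above 16M */
    if (ext_kb == 15u * 1024u) {                 /* areas touch: combine */
        mem[1].size += (uint32_t)high_64k << 16;
    } else {
        mem[2].base = 0x01000000u;               /* 16M */
        mem[2].size = (uint32_t)high_64k << 16;  /* lower 2 bytes are 0 */
    }
}
```

Shifting the 64K block count left by 16 is why the lower 2 bytes of the size above 16M are always zero.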
|
3140 mov di, #_mem ! di = memory list
3141 int 0x12 ! Returns low memory size (in K) in ax
c1024 is a memory address (see line 4207). "c" stands for constant. mul multiplies ax by the operand (in this case the value at the address specified by the operand) and puts the lower 2 bytes of the result in ax and the upper 2 bytes in dx.
|
3143 mov 4(di), ax ! mem[0].size = low memory size in bytes
3144 mov 6(di), dx
|
|
It's pretty obvious what _getprocessor does, but I can't find where it's defined. It returns 86 in ax for an 8086, 286 for an 80286, 386 for an 80386 and so on. It's possible that _getprocessor is a function that's supplied by the compiler (like I believe that _edata and _end are variables supplied by the compiler) but I'm not sure. What leads me to believe that it's a function supplied by the compiler is that this code calls two other functions that are not defined in this file (boot is defined in boot.c and printk is declared in minix/minlib.h, which is #included in boot.c, and is part of the standard library) and both of these are declared as .extern on line 3052. _getprocessor, _edata, and _end are neither defined nor declared in this file, suggesti