boothead.s
3000   !       Boothead.s - BIOS support for boot.c            Author: Kees J. Bot
3001   !
3002   !
3003   ! This file contains the startup and low level support for the secondary
3004   ! boot program.  It contains functions for disk, tty and keyboard I/O,
3005   ! copying memory to arbitrary locations, etc.
3006   !
3007   ! The primary bootstrap code supplies the following parameters in registers:
These are the same values that were passed into bootblock.s.
3008   !       dl      = Boot-device.
3009   !       es:si   = Partition table entry if hard disk.
3010   !
3011
3012   .define begtext, begdata, begbss
3013   .data
3014   begdata:
3015           .ascii  "(null)\0"      ! Just in case someone follows a null pointer
3016   .bss
3017   begbss:
3018
3019           o32         =     0x66  ! This assembler doesn't know 386 extensions
In Makefile, boothead.s is compiled with the -mi86 option (LD86 contains the mi86 option).  This option uses the machine instructions (mi) of the 8086 system which does not have 32-bit registers (like eax, ebx, etc.).  If an instruction is needed that uses a 32-bit value, the 8086 instruction must be prefixed with 0x66.

Look at line 3933.  If the -mi86 option is used and the retf instruction has no prefix, the instruction jumps to the address specified by the last 2 bytes on the stack (the offset) and the next-to-last 2 bytes on the stack (the segment).  However, if the last 4 bytes on the stack are the offset and the next-to-last 4 bytes on the stack are the segment and the -mi86 option is used, the instruction must be prefixed with 0x66.  On lines 3922-3925, these 8 bytes are pushed on the stack.

3020           BOOTOFF     =   0x7C00  ! 0x0000:BOOTOFF load a bootstrap here
3021           LOADSEG     =   0x1000  ! Where this code is loaded.
3022           BUFFER      =   0x0600  ! First free memory
The bootstrap (which is bootblock.s) loaded this code (the secondary boot loader) at address 0x1000:0x0000.  If the user wishes to boot a different partition, the bootstrap from that partition is loaded at address 0x0000:0x7C00 and the boot process repeats itself (the bootstrap loads the secondary boot loader which loads the kernel). masterboot.s and bootblock.s describe this process in greater detail.
3023           PENTRYSIZE  =       16  ! Partition table entry size.
3024           a_flags     =        2  ! From a.out.h, struct exec
In your book, look at line 01400.  This is the header file a.out.h.  The first thing declared in this file is the struct exec.  All minix executables (with a few exceptions like bootblock and masterboot - these 2 files must begin with executable code) begin with headers.

a_flags is at an offset of 2 bytes, a_text is at an offset of 8 bytes, and so on.  a_flags describes the kernel (with the options shown on lines 3029-3033) and a_text, a_data, a_bss, and a_total are sizes.

Note that the A_SEP flag describes this executable (the secondary boot loader) whereas the K_I386, K_RET, K_INT86, and K_MEML flags describe the kernel.
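
The offsets on lines 3024-3028 can be checked against a C rendering of the header. The sketch below is reconstructed from the a.out.h layout as I understand it. The struct name exec_hdr, the fixed-width types, and every field other than a_flags, a_text, a_data, a_bss, and a_total are assumptions made for illustration, not a copy of the header file.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the short (32-byte) form of struct exec. */
struct exec_hdr {
    uint8_t  a_magic[2];  /* magic number (assumed field) */
    uint8_t  a_flags;     /* flags such as A_SEP */
    uint8_t  a_cpu;       /* cpu id (assumed field) */
    uint8_t  a_hdrlen;    /* header length (assumed field) */
    uint8_t  a_unused;    /* padding (assumed field) */
    uint16_t a_version;   /* version stamp (assumed field) */
    int32_t  a_text;      /* size of the text */
    int32_t  a_data;      /* size of the initialized data */
    int32_t  a_bss;       /* size of the bss */
    int32_t  a_entry;     /* entry point (assumed field) */
    int32_t  a_total;     /* total memory allocated */
    int32_t  a_syms;      /* size of the symbol table (assumed field) */
};

/* Check that the struct agrees with the constants on lines 3024-3028. */
void check_header_layout(void)
{
    assert(offsetof(struct exec_hdr, a_flags) == 2);
    assert(offsetof(struct exec_hdr, a_text)  == 8);
    assert(offsetof(struct exec_hdr, a_data)  == 12);
    assert(offsetof(struct exec_hdr, a_bss)   == 16);
    assert(offsetof(struct exec_hdr, a_total) == 24);
}
```

With this layout the struct is 32 bytes long, which agrees with the short header size mentioned on line 3055.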

3025           a_text      =        8
3026           a_data      =       12
3027           a_bss       =       16
3028           a_total     =       24
3029           A_SEP       =     0x20  ! Separate I&D flag
Read section 4.7.1 and the first 10 paragraphs of section 4.7.3 of Operating Systems and try to understand as much as you can.  Some of the terminology may be unfamiliar so I will give a short description of the concepts involved.

This executable (the secondary boot) is compiled with the -mi86 option and runs in real mode and not in protected mode.  For this reason, the secondary boot is not able to take advantage of the protection features of protected mode.  However, since this is the first time we've run into the A_SEP flag, it's a good place to discuss shared vs. separate segments.

In protected mode, the text (code) and the data+bss+heap+stack (I will refer to this as the total data - see the next paragraph for a description of each of these) in an executable with separate text and total data segments are protected from one another.  For example, if the code tries to jump to a memory address that's within the total data segment, the hardware triggers a segment violation.  If they're not separate (A_SEP in a_flags is not set), chaos results.  Another advantage of separating the text and total data is that the text can be shared among multiple instances of the same program.  The total data will differ between two instances of the same program but the text will be the same.

Data contains initialized global variables, bss contains uninitialized global variables and must be initialized to zero (see lines 3091-3098), and the heap is the memory that malloc() allocates at run-time.

It's best to also keep the data+bss+heap and the stack separate - although Minix doesn't separate the two for the reasons given in section 4.7.3.  This means that if the heap or the stack grows too large, one can overwrite the other.  If the stack overwrites the heap and the overwritten data is not accessed immediately, identifying the problem is difficult.

On disk, the a_text field in the header holds the size of the text and the a_data field holds the size of the data.  If the kernel doesn't have separate text and total data segments, the variables a_data and a_text are combined into a_data and the variable a_text is set to zero (see lines 3069-3071).  Note that even though the values are changed in memory, they do not affect the values on disk. a_bss is the size of the bss.  a_total is the size of the data+bss+heap+stack (separate) or the text+data+bss+heap+stack (shared).  Unlike a_text, it doesn't need to be modified if the text and total data are shared. a_total determines the top of the stack (see lines 3075-3077) and is also used (with a_text) to determine the global variable _runsize (see lines 3127-3135) which is needed by boot.c in initialize().
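
The bookkeeping described above can be sketched in C. This is only an illustration: the struct sizes and both function names are hypothetical, and the sketch works on full 32-bit values, whereas lines 3069-3071 touch only the low 16 bits of a_text and a_data.

```c
#include <assert.h>
#include <stdint.h>

#define A_SEP 0x20  /* separate I&D flag, line 3029 */

/* Hypothetical in-memory copy of the header fields used here. */
struct sizes {
    uint8_t  a_flags;
    uint32_t a_text, a_data, a_total;
};

/* Lines 3066-3071: for a common I&D image, treat all text as data. */
void fold_common_id(struct sizes *h)
{
    assert(h != 0);
    if (!(h->a_flags & A_SEP)) {
        h->a_data += h->a_text;
        h->a_text = 0;
    }
}

/* Lines 3129-3132: text + data + bss + heap + stack. */
uint32_t run_size(const struct sizes *h)
{
    assert(h != 0);
    return h->a_total + h->a_text;
}
```

For a shared image a_text is already zero when run_size() is called, so the addition is harmless. That is the same observation made in the comment for lines 3131-3132.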

3030           K_I386      =   0x0001  ! Call Minix in 386 mode
If the K_I386 flag is set for the kernel, this code must switch to protected mode.
3031           K_RET       =   0x0020  ! Returns to the monitor on reboot
Look at lines 3936 and 3942.  The minix kernel returns there on a halt or reboot if the K_RET flag is set for the kernel.  If the K_RET flag is not set, the system simply halts.

3032           K_INT86     =   0x0040  ! Requires generic INT support

3033           K_MEML      =   0x0080  ! Pass a list of free memory
The variable _mem (see line 3048) is used to pass this memory list.  The int 0x12 (see line 3141) and int 0x15 (see lines 3152 and 3157) bios calls are used to determine the low memory and high memory size.

3034

3035           DS_SELECTOR =      3*8  ! Kernel data selector
3036           ES_SELECTOR =      4*8  ! Flat 4 Gb
3037           SS_SELECTOR =      5*8  ! Monitor stack
3038           CS_SELECTOR =      6*8  ! Kernel code
3039           MCS_SELECTOR=      7*8  ! Monitor code
To support multiprocessing, the 80286 and up use global descriptor tables (GDT's).  p_gdt (line 4242) is the descriptor table.  Anything that is labeled UNSET must be filled in before the global descriptor table is loaded using the lgdt instruction (see line 4133).  These values are filled in on lines 3871-3897.

The values on lines 3035-3039 are the offsets of the entries within the global descriptor table.  For example, since the entry for the kernel code is the 7th entry (see line 4267) and the size of each entry is 8 bytes, its offset is 6*8 (remember that the first entry has a 0 offset).  The MCS_SELECTOR is pushed onto the stack (if the K_RET flag is set for the kernel) before jumping to the kernel (look at lines 3918-3920).  Also before the jump is made to the kernel, the ds and es registers are loaded with DS_SELECTOR and ES_SELECTOR, respectively.
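
The relation between a table slot and its selector value is simple enough to write down. The function names below are hypothetical. The sketch assumes, as the constants on lines 3035-3039 do, that the low three bits of every selector are zero.

```c
#include <assert.h>

enum { GDT_ENTRY_SIZE = 8 };  /* each descriptor occupies 8 bytes */

/* Byte offset of the descriptor in slot `index`, counting from 0. */
unsigned selector_for_index(unsigned index)
{
    return index * GDT_ENTRY_SIZE;
}

/* Inverse mapping: which slot a selector value refers to. */
unsigned index_for_selector(unsigned selector)
{
    assert(selector % GDT_ENTRY_SIZE == 0);  /* low three bits assumed zero */
    return selector / GDT_ENTRY_SIZE;
}
```

For example, selector_for_index(6) gives 48, the value of CS_SELECTOR, and index_for_selector(7*8) gives 7, the slot of the monitor code entry.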

3040

3041           ESC         =     0x1B  ! Escape character
0x1B is the ascii representation of ESC.

3042

3043   ! Imported variables and functions:
Memory for a variable can be allocated in only one file (that is, the variable is "defined" there), and the variable must be declared as extern in every other file that accesses it.  To accomplish this, boot.c #defines the macro EXTERN as the empty string before it #includes boot.h.  Because EXTERN is already #defined, boot.h does not #define it as extern, so each EXTERN declaration in boot.h becomes a definition in boot.c.  boot.h is also #included in bootimage.c.  Since bootimage.c does not #define EXTERN, boot.h #defines it as extern there, and the same lines become plain extern declarations.  This mechanism ensures that memory for a variable is allocated only once.

A similar trick is used in the kernel.  Read the 5th paragraph of section 2.6.3 of Operating Systems for details.
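
A single-file sketch of the EXTERN mechanism follows. It imitates what the preprocessor sees when boot.c includes boot.h. The two variables and their types are stand-ins chosen for illustration, not the actual contents of boot.h.

```c
#include <assert.h>  /* used only when checking the example */

/* What boot.c does before it includes boot.h. */
#define EXTERN                 /* empty: the lines below become definitions */

/* Stand-in for the relevant part of boot.h. */
#ifndef EXTERN
#define EXTERN extern          /* taken only by files that did not define EXTERN */
#endif

EXTERN int device;             /* expands to "int device;" in this file */
EXTERN unsigned k_flags;       /* expands to "unsigned k_flags;" in this file */
```

In a file that does not define EXTERN first, the same two lines would expand to extern int device; and extern unsigned k_flags;, so no storage would be allocated there.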

Variables that are shared between assembler and C code are prefixed with an underscore ( _ ) in the assembler code but are not prefixed with an underscore in the C code.

3044   .extern _caddr, _daddr, _runsize, _edata, _end  ! Runtime environment
_caddr is the absolute address of the first byte of the text. _daddr is the absolute address of the first byte of the data. _runsize is the size of the entire executable (text+data+bss+heap+stack). I believe that _edata and _end are variables that are generated by the compiler. _edata is the offset address of the end of the data and _end is the offset address of the end of the bss.  These two variables are used on lines 3091-3098.  See the comment for line 3145 for further discussion of _edata and _end.
3045   .extern _device                                 ! BIOS device number
3046   .extern _rem_part                               ! To pass partition info
3047   .extern _k_flags                                ! Special kernel flags
_k_flags contains the K_I386 , K_RET, K_INT86, and K_MEML flags (lines 3030-3033).  _k_flags is set in bootimage.c.
3048   .extern _mem                                    ! Free memory list
3049
3050   .text
3051   begtext:
3052   .extern _boot, _printk                          ! Boot Minix, kernel printf
These functions are defined in boot.c. boot is called on line 3180.
3053
3054   ! Set segment registers and stack pointer using the programs own header!
3055   ! The header is either 32 bytes (short form) or 48 bytes (long form).  The
3056   ! bootblock will jump to address 0x10030 in both cases, calling one of the
3057   ! two jmpf instructions below.
3058
3059           jmpf    boot, LOADSEG+3 ! Set cs right (skipping long a.out header)
3060           .space  11              ! jmpf + 11 = 16 bytes
3061           jmpf    boot, LOADSEG+2 ! Set cs right (skipping short a.out header)
3062   boot:
Whether this code has a short header or a long header, the second instruction executed (after the first jump) is at address boot.

Before boot is called on line 3180, a few things are done.  (Don't confuse the two boot's; one's an address (line 3062) and the other's a function defined in boot.c (line 3180).)

Lines 3063-3085: The ds, ss, and sp registers are loaded.  The values loaded depend on whether this executable has a separate text and total data (A_SEP in a_flags is set) or this executable has a shared text and total data (A_SEP in a_flags is not set).

Lines 3091-3098:  Clear the bss.  The bss contains uninitialized global variables and needs to be zeroed.

Lines 3100-3135:  Initialize various global variables so that when boot (line 3180) is called, the C code can access their values.

Lines 3137-3177:  Initialize the array mem[].

3063           mov     ax, #LOADSEG
3064           mov     ds, ax          ! ds = header
What's the pound (#) sign all about?  The pound sign indicates that the value of LOADSEG is moved into the register rather than the contents of the memory location LOADSEG.

Why can't the instruction mov ds, #LOADSEG be used instead of using ax as an intermediate register?  The 80x86 processors forbid immediate data to segment register transfers.  (Immediate data is data that is within the instruction itself, as opposed to data that is at a memory location or data that is in a register.)  A segment register can be loaded only from a general register, from a memory location, or by a pop from the stack.  The cs register is even more restrictive: it is not loaded with mov.  It changes only through far control transfers such as jmpf (far jump), a far call, retf (far return), an interrupt, or iret.

3065
3066           movb    al, a_flags
3067           testb   al, #A_SEP      ! Separate I&D?
3068           jnz     sepID
testb sets the zero flag if A_SEP is not set in a_flags.  If the zero flag is not set (A_SEP is set), then jnz jumps to sepID.
3069   comID:  xor     ax, ax
This instruction zeroes the ax register (any number xor'ed with itself is zero).  This is a pretty common practice.  The instruction

mov ax, #0

is slower and is 3 bytes compared with xor's 2 bytes.

3070           xchg    ax, a_text      ! No text
3071           add     a_data, ax      ! Treat all text as data
3072   sepID:
3073           mov     ax, a_total     ! Total nontext memory usage

3074           and     ax, #0xFFFE     ! Round down to even
I'm not sure why we do this.  However, since the size of the stack is arbitrary and there should be plenty of room to spare, rounding down to an even value shouldn't be a problem.  The efficiency of transferring 2 bytes from an even memory address may be greater than transferring 2 bytes from an odd memory address.

3075           mov     a_total, ax     ! total - text = data + bss + heap + stack

3076           cli                     ! Ignore interrupts while stack in limbo
Whenever a value is moved into the stack segment register (ss) or the stack pointer (sp), interrupts must first be disabled.  When an interrupt occurs, the processor pushes the flags and the return address onto the stack at ss:sp.  If an interrupt arrives after only one of the two registers has been changed, those values are written to an unpredictable location and the interrupted code may not be resumed correctly.

Interrupts are disabled with the cli (clear interrupts) instruction and reenabled with the sti (set interrupts) instruction.

3077           mov     sp, ax          ! Set sp at the top of all that
3078
3079           mov     ax, a_text      ! Determine offset of ds above cs
3080           movb    cl, #4
3081           shr     ax, cl
3082           mov     cx, cs
3083           add     ax, cx
3084           mov     ds, ax          ! ds = cs + text / 16
Each segment register (cs, ds, es, ss, etc.) is in effect shifted left by 4 bits (a hexadecimal 0 is appended) before being added to a non-segment register (like ip or ax) to form an address.  For example, if the cs register holds the value 0x1000 and the ip register holds the value 0x1000, then together these registers point to address 0x11000.  So if we wish to add an offset (in bytes) to a segment register, we must first shift the offset 4 bits to the right (line 3081).
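
The arithmetic on lines 3079-3084 can be sketched in C as follows. Both function names are hypothetical, and the assertion records an assumption of the sketch: the text size is a multiple of 16, so the shift loses nothing.

```c
#include <assert.h>
#include <stdint.h>

/* Address formed by the hardware from a segment and an offset. */
uint32_t real_mode_address(uint16_t segment, uint16_t offset)
{
    return ((uint32_t)segment << 4) + offset;
}

/* Lines 3079-3084: ds = cs + text / 16. */
uint16_t data_segment(uint16_t cs, uint16_t text_bytes)
{
    assert((text_bytes & 0x000F) == 0);  /* assumed: text is a multiple of 16 */
    return (uint16_t)(cs + (text_bytes >> 4));
}
```

With cs = 0x1003 and 0x2000 bytes of text, data_segment() returns 0x1203. Offset 0 in that segment is the same address as offset 0x2000 in the code segment, namely 0x12030.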
3085           mov     ss, ax
3086           sti                     ! Stack ok now
3087           push    es              ! Save es, we need it for the partition table
This value is popped into the upper 2 bytes of _rem_part on line 3105.
3088           mov     es, ax
3089           cld                     ! C compiler wants UP
3090
3091   ! Clear bss
3092           xor     ax, ax          ! Zero
3093           mov     di, #_edata     ! Start of bss is at end of data
3094           mov     cx, #_end       ! End of bss (begin of heap)
_edata and _end are variables that are set by the compiler.  _edata is the offset address of the end of the data and _end is the offset address of the end of the bss.

3095           sub     cx, di          ! Number of bss bytes
3096           shr     cx, #1          ! Number of words

3097           rep
3098           stos                    ! Clear bss
The instruction prefix rep repeats the instruction (in this case stos) cx times.  stos stores ax at the memory address es:di.  Since stos stores words and not bytes, cx must be shifted to the right by 1 (in other words, divided by 2).
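
A rough C equivalent of lines 3092-3098 is shown below. It is a sketch under assumptions: mem stands for the data segment as a byte array, and edata and end are offsets into it, playing the roles of _edata and _end.

```c
#include <assert.h>
#include <stdint.h>

/* Zero the words from offset edata up to offset end. */
void clear_bss(uint8_t *mem, uint16_t edata, uint16_t end)
{
    uint16_t di, cx;

    assert(end >= edata);
    di = edata;                          /* line 3093 */
    cx = (uint16_t)(end - edata) >> 1;   /* lines 3094-3096: bytes to words */
    while (cx-- != 0) {                  /* rep: repeat cx times */
        mem[di] = 0;                     /* stos: store ax (zero) at es:di */
        mem[di + 1] = 0;
        di += 2;                         /* di advances because of cld */
    }
}
```

If the byte count were odd, the shift would drop the last byte, just as shr does on line 3096.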
3099
3100   ! Copy primary boot parameters to variables.  (Can do this now that bss is
3101   ! cleared and may be written into).
Since _device and _rem_part are uninitialized global variables, they are stored in the bss.

3102           xorb    dh, dh
3103           mov     _device, dx     ! Boot device (probably 0x00 or 0x80)
3104           mov     _rem_part+0, si ! Remote partition table offset
3105           pop     _rem_part+2     ! and segment (saved es)
3106
3107   ! Remember the current video mode for restoration on exit.
3108           movb    ah, #0x0F       ! Get current video mode

3109           int     0x10
int 0x10, ah=0x0F returns the current video mode in al.  Some examples of return values are al=0x13 (VGA, 320x200 resolution, 256 colors), al=0x12 (VGA, 640x480, 16), and al=0x0E (EGA, 640x200, 16).

The "blanking" in the comment on line 3110 refers to clearing the screen.  When a video mode is set with int 0x10, ah=0x00, setting bit 7 of the mode number in al asks the BIOS not to clear (blank) the display, and int 0x10, ah=0x0F then reports the mode with bit 7 still set.  Masking off bit 7 leaves just the mode number.

3110           andb    al, #0x7F       ! Mask off bit 7 (no blanking)
3111           movb    old_vid_mode, al
3112           movb    cur_vid_mode, al
3113
3114   ! Give C code access to the code segment, data segment and the size of this
3115   ! process.
3116           xor     ax, ax
3117           mov     dx, cs
3118           call    seg2abs
seg2abs (line 3222) converts a segment:offset address in dx:ax to an absolute address in dx-ax.  Note that the notation dx-ax does not mean dx minus ax.  It means that the lower 2 bytes are in ax and the upper 2 bytes are in dx.  This notation is used in other places in the code (for example, lines 3227-3243).
3119           mov     _caddr+0, ax
3120           mov     _caddr+2, dx
3121           xor     ax, ax
3122           mov     dx, ds
3123           call    seg2abs
3124           mov     _daddr+0, ax
3125           mov     _daddr+2, dx
3126           push    ds
3127           mov     ax, #LOADSEG
3128           mov     ds, ax          ! Back to the header once more
3129           mov     ax, a_total+0
3130           mov     dx, a_total+2   ! dx:ax = data + bss + heap + stack
3131           add     ax, a_text+0
3132           adc     dx, a_text+2    ! dx:ax = text + data + bss + heap + stack
If this executable has a separate text and total data segment, a_text must be added to a_total to get the total size of the executable.  If it has a shared text and total data segment, a_total is the size of the text and the total data.  However, a_text was set to zero on line 3070 and can be added anyway and it won't matter.
3133           pop     ds
3134           mov     _runsize+0, ax
3135           mov     _runsize+2, dx  ! 32 bit size of this process
3136
3137   ! Determine available memory as a list of (base,size) pairs as follows:
3138   ! mem[0] = low memory, mem[1] = memory between 1M and 16M, mem[2] = memory
3139   ! above 16M.  Last two coalesced into mem[1] if adjacent.
The memory (base, size) pairs will look something like this:
mem[0]=(0x00000000, size of low memory)
mem[1]=(0x00100000, size of memory between 1M and 16M)
mem[2]=(0x01000000, size of memory greater than 16M)

If the mem[1] and mem[2] memory areas are contiguous, then mem[1] and mem[2] are combined.

Since mem[] is an uninitialized variable, it is found in the bss space, which was zeroed on lines 3091-3098.  The following instructions are not needed since these 4 bytes are already zero.

mov 0(di), #0
mov 2(di), #0

Also, since the lower 2 bytes of the base of both mem[1] and mem[2] are also zero, the following instructions are also not needed:

mov 8(di), #0
mov 16(di), #0

The lower 2 bytes of the low memory size are stored in 4(di) and the upper 2 bytes are stored in 6(di) (lines 3143-3144).  Likewise, 12(di) and 14(di) hold the size of the memory between 1M and 16M.  20(di) and 22(di) hold the size of the memory above 16M.  Since int 0x15, ax=0xE801 returns the number of 64K (not 1K) blocks of memory in bx (see line 3152), storing bx in the upper 2 bytes at 22(di) multiplies it by 64K, and the lower 2 bytes at 20(di) remain 0.
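
The way lines 3140-3177 fill in mem[] can be sketched in C. The struct and function names are hypothetical. The have_ext parameter is a simplification that stands for "the processor is at least a 286 and one of the int 0x15 calls succeeded", and the three size arguments stand for the values the BIOS calls return.

```c
#include <assert.h>
#include <stdint.h>

struct memory { uint32_t base, size; };  /* one (base, size) pair */

/*
 * low_k:   K of low memory (ax from int 0x12)
 * ext_k:   K of memory starting at 1M (ax from int 0x15)
 * ext_64k: 64K blocks starting at 16M (bx from int 0x15, ax=0xE801), else 0
 */
void build_mem_list(struct memory mem[3], uint16_t low_k, int have_ext,
                    uint16_t ext_k, uint16_t ext_64k)
{
    assert(mem != 0);
    mem[0].base = 0;
    mem[0].size = (uint32_t)low_k * 1024;         /* lines 3141-3144 */
    mem[1].base = mem[1].size = 0;                /* bss is already zero */
    mem[2].base = mem[2].size = 0;
    if (!have_ext)
        return;                                   /* no_ext */

    mem[1].base = 0x00100000UL;                   /* line 3164: 1M */
    mem[1].size = (uint32_t)ext_k * 1024;         /* lines 3165-3167 */
    if (ext_64k == 0)
        return;                                   /* lines 3168-3169 */

    if (ext_k == 15 * 1024) {                     /* lines 3170-3171 */
        mem[1].size += (uint32_t)ext_64k << 16;   /* line 3176: adj_ext */
    } else {
        mem[2].base = 0x01000000UL;               /* line 3172: 16M */
        mem[2].size = (uint32_t)ext_64k << 16;    /* line 3173 */
    }
}
```

Shifting ext_64k left by 16 bits is the C counterpart of storing bx in the upper 2 bytes of the size field.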

3140           mov     di, #_mem       ! di = memory list
3141           int     0x12            ! Returns low memory size (in K) in ax
3142           mul     c1024
c1024 is a memory address (see line 4207).  "c" stands for constant.  mul multiplies ax by the operand (in this case the value at the address specified by the operand) and puts the lower 2 bytes of the result in ax and the upper 2 bytes in dx.
3143           mov     4(di), ax       ! mem[0].size = low memory size in bytes
3144           mov     6(di), dx
3145           call    _getprocessor
It's pretty obvious what _getprocessor does, but I can't find where it's defined.  It returns 86 in ax for an 8086, 286 for an 80286, 386 for an 80386 and so on.  It's possible that _getprocessor is a function that's supplied by the compiler (like I believe that _edata and _end are variables supplied by the compiler) but I'm not sure.  What leads me to believe that it's a function supplied by the compiler is that this code calls two other functions that are not defined in this file (boot is defined in boot.c and printk is declared in minix/minlib.h, which is #included in boot.c, and is part of the standard library) and both of these are declared as .extern on line 3052.  _getprocessor is neither defined nor declared in this file (_edata and _end, by contrast, are declared as .extern on line 3044), suggesting that it is special in some way.  If you have any answers to this, please send an e-mail to feedback@swartzbaugh.net.
3146           cmp     ax, #286        ! Only 286s and above have extended memory
3147           jb      no_ext
3148           cmp     ax, #486        ! Assume 486s were the first to have >64M
3149           jb      small_ext       ! (It helps to be paranoid when using the BIOS)
3150   big_ext:
3151           mov     ax, #0xE801     ! Code for get memory size for >64M
3152           int     0x15            ! ax = mem at 1M per 1K, bx = mem at 16M per 64K
int 0x15 sets the carry flag if the value in ax (or ah) is not a supported input.
3153           jnc     got_ext
3154   small_ext:
3155           movb    ah, #0x88       ! Code for get extended memory size
3156           clc                     ! Carry will stay clear if call exists
3157           int     0x15            ! Returns size (in K) in ax for AT's
3158           jc      no_ext
3159           test    ax, ax          ! An AT with no extended memory?
3160           jz      no_ext
3161           xor     bx, bx          ! bx = mem above 16M per 64K = 0
3162   got_ext:
3163           mov     cx, ax          ! cx = copy of ext mem at 1M
3164           mov     10(di), #0x0010 ! mem[1].base = 0x00100000 (1M)
3165           mul     c1024
3166           mov     12(di), ax      ! mem[1].size = "ext mem at 1M" * 1024
3167           mov     14(di), dx
3168           test    bx, bx
If bx has any value other than 0, it was put there by int 0x15 on line 3152.

3169           jz      no_ext          ! No more ext mem above 16M?

3170           cmp     cx, #15*1024    ! Chunks adjacent? (precisely 15M at 1M?)
3171           je      adj_ext
If there are 15M between 1M and 16M, then the memory between 1M and 16M and the memory above 16M are contiguous.  If the memory is contiguous, the two sizes are combined into mem[1] by jumping to adj_ext.
3172           mov     18(di), #0x0100 ! mem[2].base = 0x01000000 (16M)
3173           mov     22(di), bx      ! mem[2].size = "ext mem at 16M" * 64K
3174           jmp     no_ext
3175   adj_ext:
3176           add     14(di), bx      ! Add ext mem above 16M to mem below 16M
3177   no_ext:
3178
3179   ! Time to switch to a higher level language (not much higher)
Now that we've taken care of a little housekeeping, we can call boot.
3180           call    _boot
3181
Until bootstrap (line 3786), the code is a little tedious.  Functions are defined that are called by the secondary boot's C code.  The one thing that makes it interesting is that it gives a little insight into when you can use C and when you have to use assembler.  Most of the assembler functions make a lot of calls to the bios.
3182   ! void ..exit(int status)
3183   !       Exit the monitor by rebooting the system.
3184   .define _exit, __exit, ___exit          ! Make various compilers happy
3185   _exit:
3186   __exit:
3187   ___exit:
_exit, __exit, and ___exit are all the same address.  Given the underscore convention described in the comment for line 3043, these labels correspond to the C names exit, _exit, and __exit.  I am not entirely sure why we need to match up with specific compilers, but I'm willing to make a guess.  My guess is that every compiler has a built-in exit function and in order to override the compiler default, you have to supply your own exit function and give the function the appropriate compiler-specific name.

If the _exit function is called, then for one reason or another, the boot code has decided to exit rather than jump to the kernel.  If no error occurred (status=0), reboot.  Otherwise, wait for a key to be pressed and then reboot.

3188           mov     bx, sp
3189           cmp     2(bx), #0               ! Good exit status?
This is something you'll see a lot so make sure you understand it.  When C code calls a function, it pushes its arguments onto the stack.  The C code pushes its last argument first and the first argument last (there's only one argument for the exit function - status).  After finishing with the arguments, the return address is pushed (since the function is exit and the system is rebooted, this value is never used but it's pushed onto the stack anyway).  At this moment the return address is at the top of the stack, at 0(bx), and status is just above it, at 2(bx), which is why line 3189 compares 2(bx) with 0.
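
The stack layout at the entry of exit can be imitated with a small array in C. This is only a model: the struct, the function names, and the 8-word stack size are all hypothetical.

```c
#include <assert.h>
#include <stdint.h>

enum { STACK_WORDS = 8 };

struct stack16 {
    uint16_t word[STACK_WORDS];
    unsigned sp;               /* index of the top word; the stack grows down */
};

void push16(struct stack16 *s, uint16_t value)
{
    assert(s->sp > 0);         /* no room left otherwise */
    s->word[--s->sp] = value;
}

/* Word at byte displacement disp from sp, like disp(bx) after mov bx, sp. */
uint16_t at_sp(const struct stack16 *s, unsigned disp)
{
    return s->word[s->sp + disp / 2];
}

/* What a C caller of exit(status) leaves on the stack. */
void call_exit(struct stack16 *s, uint16_t status, uint16_t return_address)
{
    push16(s, status);          /* the only argument */
    push16(s, return_address);  /* pushed by the call instruction */
}
```

After call_exit(), at_sp(&s, 0) is the return address and at_sp(&s, 2) is status. With more arguments, the second argument would be at displacement 4, the third at 6, and so on.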

3190           jz      reboot
3191   quit:   mov     ax, #any_key
3192           push    ax
3193           call    _printk
We can now see an example of an assembler function calling a C function that requires parameters.  The printk() function (which has the same syntax as printf()) is very flexible - one way that it can be called is by passing it a pointer to a string as an argument.  In order to pass the C function a pointer as an argument, the address of the string (in this case, any_key) is pushed onto the stack (line 3192) before making the call (line 3193).  Only a single argument is passed to the C function, but if more than one argument were passed, the last argument would be pushed first and the first argument would be pushed last.
3194           xorb    ah, ah                  ! Read character from keyboard
3195           int     0x16
int 0x16, ah=0 waits for a key to be pressed and then puts the ascii code of the pressed key in al.  Since we don't care what key is pressed, the contents of al are never examined.
3196   reboot: call    restore_video
I'm not sure why we need to restore the old video settings before rebooting, but I guess it can't hurt.
3197           int     0x19                    ! Reboot the system
3198   .data
Variables can be interspersed anywhere in the code with the .data declaration.  Since the address any_key is used on line 3191, it's convenient to place the .data declaration here.

3199   any_key:
3200           .ascii  "\nHit any key to reboot\n\0"
3201   .text
3202

3203   ! u32_t mon2abs(void *ptr)
3204   !       Address in monitor data to absolute address.
mon2abs converts a 2-byte offset address (using ds as the segment address) to a 4-byte absolute address.  ptr is this 2-byte offset.  vec2abs converts a 4-byte segment:offset address to a 4-byte absolute address.  vec points to this 4-byte address.  (Note that a segment:offset address is called a vector.)

Let's look in boot.c where both of these functions are called.  lowsec holds a 2-byte integer while rem_part holds a 2-byte offset address in its lower 2 bytes and a 2-byte segment address in its upper 2 bytes (see lines 3104-3105).  mon2abs converts the 2-byte offset address of lowsec (using ds as its 2-byte segment address) to a 4-byte absolute address and vec2abs converts the 4-byte segment:offset address found at rem_part to a 4-byte absolute address.  On entry to vec2abs, the word at 2(bx) on the stack is the address of rem_part; the word at that address is the offset and the word 2 bytes above it is the segment (lines 3217-3219).

The comment for vec2abs is a little misleading.  As the above example shows, vec2abs can be used for conversions of segment:offset vectors that are not interrupt vectors.  This is the only place in the boot sequence where vec2abs is called.

3205   .define _mon2abs
3206   _mon2abs:
3207           mov     bx, sp
3208           mov     ax, 2(bx)       ! ptr
3209           mov     dx, ds          ! Monitor data segment
3210           jmp     seg2abs
3211
3212   ! u32_t vec2abs(vector *vec)
3213   !       8086 interrupt vector to absolute address.
As discussed in the comments for 3203-3204, vec2abs converts a 4-byte segment:offset address to a 4-byte absolute address.  vec points to this 4-byte address.

3214   .define _vec2abs
3215   _vec2abs:
3216           mov     bx, sp
3217           mov     bx, 2(bx)
3218           mov     ax, (bx)
3219           mov     dx, 2(bx)       ! dx:ax vector
3220           !jmp    seg2abs         ! Translate
3221

3222   seg2abs:                        ! Translate dx:ax to the 32 bit address dx-ax
The segment address must be shifted to the left 4 bits before being added to an offset address.  ch is used to store intermediate values.  Note that dx-ax does not mean dx minus ax.  It represents a 4-byte value with the upper 2 bytes in dx and the lower 2 bytes in ax.  To make sure that you understand the steps below, convert a segment:offset address (for example 0x0100:0x0116 = dx:ax) to its absolute address (0x00001116 = dx-ax) using the instructions below.
3223           push    cx
3224           movb    ch, dh
3225           movb    cl, #4
3226           shl     dx, cl
3227           shrb    ch, cl          ! ch-dx = dx << 4
3228           add     ax, dx
3229           adcb    ch, #0          ! ch-ax = ch-dx + ax
3230           movb    dl, ch
3231           xorb    dh, dh          ! dx-ax = ch-ax
3232           pop     cx
3233           ret
3234
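The steps of seg2abs can be followed in C as well.  The sketch below is only a model, not part of boothead.s: the name seg2abs_model and the local variables are made up for illustration, and each statement is labelled with the instruction it is meant to mirror.  The assert cross-checks the result against the direct formula (segment << 4) + offset.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical C model of seg2abs (lines 3222-3233).
 * Input:  segment (dx) and offset (ax).
 * Output: the absolute address dx-ax as one 32-bit value. */
static uint32_t seg2abs_model(uint16_t seg, uint16_t off)
{
    uint8_t  ch  = (uint8_t)(seg >> 12);   /* movb ch,dh ; shrb ch,cl : top 4 bits of seg */
    uint16_t dx  = (uint16_t)(seg << 4);   /* shl dx,cl : low 16 bits of seg << 4         */
    uint32_t sum = (uint32_t)dx + off;     /* add ax,dx : may carry out of 16 bits        */
    uint16_t ax  = (uint16_t)sum;
    uint32_t result;

    ch = (uint8_t)(ch + (sum >> 16));      /* adcb ch,#0 : fold the carry into ch         */
    result = ((uint32_t)ch << 16) | ax;    /* movb dl,ch ; xorb dh,dh : dx-ax = ch-ax     */

    /* Cross-check against the textbook formula. */
    assert(result == ((uint32_t)seg << 4) + off);
    return result;
}
```

With the example from the text, seg2abs_model(0x0100, 0x0116) gives 0x00001116.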
3235   abs2seg:                        ! Translate the 32 bit address dx-ax to dx:ax
This is the reverse of seg2abs above.  An absolute value in dx-ax is converted to a segment:offset address in dx:ax.  This operation would convert dx-ax = 0x00001116 to dx:ax = 0x0111:0x0006.  Note that the 2 segment:offset addresses 0x0100:0x0116 and 0x0111:0x0006 are the same absolute address.
3236           push    cx
3237           movb    ch, dl
3238           mov     dx, ax          ! ch-dx = dx-ax
3239           and     ax, #0x000F     ! Offset in ax
3240           movb    cl, #4
3241           shr     dx, cl
3242           shlb    ch, cl
3243           orb     dh, ch          ! dx = ch-dx >> 4
3244           pop     cx
3245           ret
3246
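A matching C model of abs2seg is sketched below.  Again, this is an illustration only: struct segoff and abs2seg_model are hypothetical names.  Because only dl (not dh) is used from the upper half, the model assumes the address is below 1MB, which is what raw_copy checks before calling abs2seg.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper type: a segment:offset pair (dx:ax). */
struct segoff {
    uint16_t seg;
    uint16_t off;
};

/* Hypothetical C model of abs2seg (lines 3235-3245). */
static struct segoff abs2seg_model(uint32_t abs)
{
    struct segoff r;
    uint8_t  ch;
    uint16_t dx;

    assert(abs <= 0xFFFFFuL);                       /* assumption: address below 1MB */

    ch    = (uint8_t)(abs >> 16);                   /* movb ch,dl                    */
    dx    = (uint16_t)abs;                          /* mov dx,ax                     */
    r.off = (uint16_t)(abs & 0x000F);               /* and ax,#0x000F                */
    dx    = (uint16_t)(dx >> 4);                    /* shr dx,cl                     */
    ch    = (uint8_t)(ch << 4);                     /* shlb ch,cl                    */
    r.seg = (uint16_t)(dx | ((uint16_t)ch << 8));   /* orb dh,ch                     */

    /* The pair must name the same byte as the absolute address. */
    assert(((uint32_t)r.seg << 4) + r.off == abs);
    return r;
}
```

The offset that comes back is always between 0 and 15, so the result is the segment:offset form with the largest possible segment.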
3247   ! void raw_copy(u32_t dstaddr, u32_t srcaddr, u32_t count)
3248   !       Copy count bytes from srcaddr to dstaddr.  Don't do overlaps.
3249   !       Also handles copying words to or from extended memory.
The most difficult part of this function is dealing with extended memory.

If the source or destination address is at or above 1MB, extended memory must be accessed using the int 0x15, ah=0x87 bios call.

Keep in mind that the system is still in real mode.  When (and if) the system is switched to protected mode, this problem goes away.  In protected mode, 4GB of memory can be accessed.

If the absolute address range 0x1000-0x2000 were copied to location 0x1800-0x2800, the source range and the destination range would overlap.  Overlaps are not allowed.

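After line 3253, the stack will look like this (offsets are relative to bp, as the code below uses them):

14(bp)  count, upper 2 bytes
12(bp)  count, lower 2 bytes
10(bp)  srcaddr, upper 2 bytes
 8(bp)  srcaddr, lower 2 bytes
 6(bp)  dstaddr, upper 2 bytes
 4(bp)  dstaddr, lower 2 bytes
 2(bp)  return address
 0(bp)  saved bp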

3250   .define _raw_copy
3251   _raw_copy:
3252           push    bp
3253           mov     bp, sp
3254           push    si
3255           push    di              ! Save C variable registers
3256   copy:
3257           cmp     14(bp), #0
3258           jnz     bigcopy
3259           mov     cx, 12(bp)
3260           jcxz    copydone        ! Count is zero, end copy
3261           cmp     cx, #0xFFF0
3262           jb      smallcopy
3263   bigcopy:mov     cx, #0xFFF0     ! Don't copy more than about 64K at once
3264   smallcopy:
3265           push    cx              ! Save copying count
3266           mov     ax, 4(bp)
3267           mov     dx, 6(bp)

3268           cmp     dx, #0x0010     ! Copy to extended memory?
3269           jae     ext_copy
3270           cmp     10(bp), #0x0010 ! Copy from extended memory?
3271           jae     ext_copy
If either the source or destination address is at or above 0x00100000 (1MB), then the int 0x15, ah=0x87 bios call must be used.
3272           call    abs2seg
In order to use the rep movs instruction, the source and destination addresses must be converted from absolute addresses to segment:offset addresses. abs2seg converts the absolute address dx-ax to the segment:offset address dx:ax.  (Note that dx-ax does not mean dx minus ax.  It represents a 4-byte value whose upper 2 bytes are in dx and whose lower 2 bytes are in ax.)
3273           mov     di, ax
3274           mov     es, dx          ! es:di = dstaddr
3275           mov     ax, 8(bp)
3276           mov     dx, 10(bp)
3277           call    abs2seg
3278           mov     si, ax
3279           mov     ds, dx          ! ds:si = srcaddr
3280           shr     cx, #1          ! Words to move
3281           rep
rep movs copies cx words from ds:si to es:di.  Since cx holds the number of bytes and words are being copied, cx must be shifted to the right 1 bit (this divides cx by 2).
3282           movs                    ! Do the word copy
3283           adc     cx, cx          ! One more byte?
3284           rep
3285           movsb                   ! Do the byte copy
The shr instruction on line 3280 shifted the right-most bit into the carry flag.  If cx was previously odd, the carry flag will be set; if cx was previously even, the carry flag will be clear.  Neither rep nor movs changes the flags, so the carry survives the word copy.  On line 3283, cx (which was decremented to 0 by the rep movs instruction) is added to itself (for a total of 0) and then the carry flag is added.  So cx will be 1 if it had been odd before the shr instruction and 0 if it had been even.

Either 1 byte or 0 bytes is then copied from ds:si to es:di.
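The word-copy-then-byte-copy trick of lines 3280-3285 can be modelled in C.  This is a sketch under assumptions: copy_words_then_byte is a hypothetical name, plain pointers stand in for ds:si and es:di, and memcpy stands in for rep movs and rep movsb.  It returns the number of bytes moved by the final byte copy (0 or 1).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model of lines 3280-3285: copy count bytes as
 * count/2 words, then one extra byte if count was odd. */
static unsigned copy_words_then_byte(uint8_t *dst, const uint8_t *src,
                                     uint16_t count)
{
    unsigned carry = count & 1u;                 /* shr cx,#1 : low bit goes to carry */
    size_t   words = (size_t)(count >> 1);       /*             cx = count / 2        */
    unsigned cx;

    memcpy(dst, src, words * 2);                 /* rep movs : leaves cx == 0         */
    cx = 0u + 0u + carry;                        /* adc cx,cx : 0 + 0 + carry         */
    memcpy(dst + words * 2, src + words * 2, cx);/* rep movsb : 0 or 1 byte           */

    assert(words * 2 + cx == count);             /* every byte was accounted for      */
    return cx;
}
```

For a count of 5 the model copies 2 words and then 1 byte; for a count of 4 it copies 2 words and no extra byte.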

3286           mov     ax, ss          ! Restore ds and es from the remaining ss
3287           mov     ds, ax
3288           mov     es, ax
3289           jmp     copyadjust
3290   ext_copy:
Look at line 4214.  The UNSET fields of both x_src_desc and x_dst_desc must be filled in with the base addresses of the source and destination before making the int 0x15, ah=0x87 bios call.  None of the other UNSET fields in the table matters here.
3291           mov     x_dst_desc+2, ax
3292           movb    x_dst_desc+4, dl ! Set base of destination segment
3293           mov     ax, 8(bp)
3294           mov     dx, 10(bp)
3295           mov     x_src_desc+2, ax
3296           movb    x_src_desc+4, dl ! Set base of source segment
3297           mov     si, #x_gdt      ! es:si = global descriptor table
3298           shr     cx, #1          ! Words to move
3299           movb    ah, #0x87       ! Code for extended memory move
3300           int     0x15
For the int 0x15, ah=0x87 bios call, es:si points to the extended move table (line 4214) and cx holds the number of words to copy.
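Lines 3291-3292 and 3295-3296 store the base address into a descriptor in two pieces: a 16-bit word at offset 2 and a single byte at offset 4.  The C sketch below shows the same split.  It is an illustration only: set_desc_base is a hypothetical name, the descriptor is treated as a plain 8-byte array, and the assert states the assumption that the base fits in 24 bits, since dh is never written to the table.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of "mov x_..._desc+2, ax" and
 * "movb x_..._desc+4, dl" for one 8-byte descriptor. */
static void set_desc_base(uint8_t desc[8], uint32_t base)
{
    assert(base <= 0xFFFFFFuL);                 /* assumption: only 24 base bits are used */

    desc[2] = (uint8_t)(base & 0xFF);           /* low byte of ax  : base bits 0-7        */
    desc[3] = (uint8_t)((base >> 8) & 0xFF);    /* high byte of ax : base bits 8-15       */
    desc[4] = (uint8_t)((base >> 16) & 0xFF);   /* dl              : base bits 16-23      */
}
```

The other bytes of the descriptor are left exactly as the table on line 4214 defines them.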
3301   copyadjust:
The stack contents are modified in order to advance the current source and destination addresses and keep track of how many more bytes must be copied (see line 3263).
3302           pop     cx              ! Restore count
3303           add     4(bp), cx
3304           adc     6(bp), #0       ! dstaddr += copycount
3305           add     8(bp), cx
3306           adc     10(bp), #0      ! srcaddr += copycount
3307           sub     12(bp), cx
3308           sbb     14(bp), #0      ! count -= copycount
3309           jmp     copy            ! and repeat
3310   copydone:
At this point, the copying is complete.  bp, si, and di were pushed onto the stack (lines 3252, 3254, and 3255) and must be popped in the reverse order before returning.
3311           pop     di
3312           pop     si              ! Restore C variable registers
3313           pop     bp
3314           ret
3315
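The overall loop of raw_copy (copy, bigcopy/smallcopy, copyadjust) can be summarised in C.  The sketch below copies nothing; it only counts how many passes the loop makes and how many of them would take the ext_copy path.  The names raw_copy_passes, CHUNK_MAX and ext_passes are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdint.h>

#define CHUNK_MAX 0xFFF0uL   /* the limit loaded into cx on line 3263 */

/* Hypothetical model of the raw_copy loop: returns the number of
 * passes and stores in *ext_passes how many used extended memory. */
static unsigned raw_copy_passes(uint32_t dst, uint32_t src, uint32_t count,
                                unsigned *ext_passes)
{
    unsigned passes = 0;

    *ext_passes = 0;
    while (count != 0) {                          /* jcxz copydone                   */
        /* lines 3257-3263: chunk = min(count, 0xFFF0) */
        uint32_t chunk = count < CHUNK_MAX ? count : CHUNK_MAX;

        assert(chunk != 0 && chunk <= CHUNK_MAX);

        /* lines 3268-3271: upper word >= 0x0010 means at or above 1MB */
        if ((dst >> 16) >= 0x0010 || (src >> 16) >= 0x0010)
            ++*ext_passes;

        dst   += chunk;                           /* copyadjust, lines 3303-3308     */
        src   += chunk;
        count -= chunk;
        ++passes;
    }
    return passes;
}
```

For example, copying 0x20000 bytes takes three passes: two of 0xFFF0 bytes and a last one of 0x20 bytes.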
3316   ! u16_t get_word(u32_t addr);
3317   ! void put_word(u32_t addr, u16_t word);
3318   !       Read or write a 16 bits word at an arbitrary location.
u16_t get_word(u32_t addr) returns the 2 byte (16 bit) value at absolute memory address addr.  void put_word(u32_t addr, u16_t word) stores the 2 byte value word at absolute memory address addr.
3319   .define _get_word, _put_word
3320   _get_word:
3321           mov     bx, sp
3322           call    gp_getaddr
3323           mov     ax, (bx)        ! Word to get from addr
3324           jmp     gp_ret
3325   _put_word:
3326           mov     bx, sp
3327           push    6(bx)           ! Word to store at addr
3328           call    gp_getaddr
3329           pop     (bx)            ! Store the word
The value pushed on line 3327 is the same value popped on line 3329.  This value is word.  It is popped into (stored at) the location specified by ds:bx, which was set on lines 3335-3336.
3330           jmp     gp_ret
3331   gp_getaddr:
"gp" stands for get/put.  gp_getaddr converts addr (found on the stack) to a segment:offset address in ds:bx.
3332           mov     ax, 2(bx)
3333           mov     dx, 4(bx)
3334           call    abs2seg
3335           mov     bx, ax
3336           mov     ds, dx          ! ds:bx = addr
3337           ret
3338   gp_ret:
3339           push    es
3340           pop     ds              ! Restore ds
The value of ds was changed on line 3336.  Before that line, es and ds held the same value, so copying es back into ds restores the original ds.
3341           ret
3342
3343   ! void relocate(void);
3344   !       After the program has copied itself to a safer place, it needs to change
3345   !       the segment registers.  Caddr has already been set to the new location.
It's slightly more complicated than this, but in the initialize() function in boot.c the secondary boot (this program) is copied to the end of the available low memory and a jump is made to it.

The function begins by popping its return address (which is only an offset) into bx.  Just before the final instruction, retf, this offset is pushed back onto the stack, on top of the new code segment (pushed on line 3364).  retf pops both, so execution returns to the same offset but in the new segment.

3346   .define _relocate
3347   _relocate:
3348           pop     bx              ! Return address
3349           mov     ax, _caddr+0
3350           mov     dx, _caddr+2
3351           call    abs2seg
3352           mov     cx, dx          ! cx = new code segment
3353           mov     ax, cs          ! Old code segment
3354           sub     ax, cx          ! ax = -(new - old) = -Moving offset
3355           mov     dx, ds
3356           sub     dx, ax
The difference between the new code segment (cs) and the old code segment will be the same as the difference between the new data segment (ds) and the old data segment.
3357           mov     ds, dx          ! ds += (new - old)
3358           mov     es, dx
3359           mov     ss, dx
ds, es, and ss are set to the new value using the mov instruction.  The code segment (cs) and the instruction pointer (ip) can be changed by a jump instruction or a retf instruction but not by a mov instruction.
3360           xor     ax, ax
3361           call    seg2abs
3362           mov     _daddr+0, ax
3363           mov     _daddr+2, dx    ! New data address
3364           push    cx              ! New text segment
3365           push    bx              ! Return offset of this function
3366           retf                    ! Relocate
3367
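The segment arithmetic of relocate can be checked with a small C model.  This is a sketch, not the real routine: struct segs and relocate_model are hypothetical names, caddr is assumed to be below 1MB, and the 16-bit wrap-around of sub is reproduced with casts to uint16_t.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical pair of segment registers. */
struct segs {
    uint16_t cs;
    uint16_t ds;   /* es and ss receive the same value as ds */
};

/* Hypothetical model of lines 3349-3363: compute the new cs and ds
 * from the new code address, and the new data address (daddr). */
static struct segs relocate_model(struct segs old, uint32_t caddr,
                                  uint32_t *daddr)
{
    struct segs n;
    uint16_t ax;

    assert(caddr <= 0xFFFFFuL);            /* assumption: new location below 1MB     */

    n.cs   = (uint16_t)(caddr >> 4);       /* abs2seg, then mov cx,dx                */
    ax     = (uint16_t)(old.cs - n.cs);    /* sub ax,cx : -(new - old)               */
    n.ds   = (uint16_t)(old.ds - ax);      /* sub dx,ax : ds += (new - old)          */
    *daddr = (uint32_t)n.ds << 4;          /* xor ax,ax ; seg2abs : new data address */
    return n;
}
```

With an old cs of 0x1000, an old ds of 0x1200 and caddr = 0x80000, the new cs is 0x8000 and the new ds is 0x8200: both segments move up by 0x7000.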
3368   ! void *brk(void *addr)
3369   ! void *sbrk(size_t incr)
3370   !       Cannot fail implementations of brk(2) and sbrk(3), so we can use
3371   !       malloc(3).  They reboot on stack collision instead of returning -1.
Neither brk nor sbrk is found in boot.c, bootimage.c, or rawfs.c.  On the other hand, in boot.c and bootimage.c we call malloc().  I believe that malloc() calls one of these functions, brk or sbrk, to determine if there's enough room to allocate the requested space on the heap.  Remember that the stack and the heap can collide with each other, as discussed in section 4.7.3 of Operating Systems.  Here's a simple figure of the memory layout.

If you can shed any light on the brk and sbrk calls, please submit a comment to the site which will be displayed below.

brk specifies the address (addr) that malloc() wants as the new top of the heap.  sbrk specifies the number of bytes (incr) by which malloc() wants to raise the top of the heap.

3372   .data
3373           .align  2
3374   break:  .data2  _end            ! A fake heap pointer
This is the top of the heap.  As discussed in the comments on 3044, _end is the initial top of the heap and is supplied by the compiler.  This value is modified on line 3387.
3375   .text
3376   .define _brk, __brk, _sbrk, __sbrk
3377   _brk:
3378   __brk:                          ! __brk is for the standard C compiler
3379           xor     ax, ax
3380           jmp     sbrk            ! break= 0; return sbrk(addr);
3381   _sbrk:
3382   __sbrk:
3383           mov     ax, break       ! ax= current break
3384   sbrk:   push    ax              ! save it as future return value
3385           mov     bx, sp          ! Stack is now: (retval, retaddr, incr, ...)
3386           add     ax, 4(bx)       ! ax= break + increment
3387           mov     break, ax       ! Set new break
If brk was called, the stack will be: (retval, retaddr, addr, ...)  Line 3386 adds either the value at address break and incr (for sbrk) or 0 and the value of addr (for brk).  Either way, the new top of the heap is stored at address break on line 3387.
3388           lea     dx, -1024(bx)   ! sp minus a bit of breathing space
The lea instruction loads a 16-bit register (in this case, dx) with the offset address of the data specified by the second operand (in this case, sp - 1024; sp=bx, see line 3385).  The lea instruction and the mov instruction are similar but not the same.  The mov instruction would have loaded dx with the data specified by the second operand, not the offset address of the data specified.
3389           cmp     dx, ax          ! Compare with the new break
If fewer than 1K (1024) bytes separate the stack from the new top of the heap, there are major problems and the system is rebooted.  If fewer than 4K bytes (but at least 1K) separate the two, a warning is issued.
3390           jb      heaperr         ! Suffocating noises
3391           lea     dx, -4096(bx)   ! A warning when heap+stack goes < 4K
3392           cmp     dx, ax
3393           jae     plenty          ! No reason to complain
3394           mov     ax, #memwarn
3395           push    ax
3396           call    _printk         ! Warn about memory running low
3397           pop     ax
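3398           movb    memwarn, #0     ! No more warnings
The user is warned once but not twice.  Writing a 0 into the first byte of memwarn turns it into an empty string, so any later call to printk on line 3396 prints nothing.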
3399   plenty: pop     ax              ! Return old break (0 for brk)
3400           ret
3401   heaperr:mov     ax, #chmem
3402           push    ax
3403           mov     ax, #nomem
3404           push    ax
3405           call    _printk
3406           jmp     quit
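The decisions made by brk and sbrk can be summarised in a C model.  This is a sketch under assumptions: the enum heap_status and the functions sbrk_model and brk_model are hypothetical names, the break and the stack pointer are passed in as plain 16-bit values, and the model only reports which of the three outcomes (plenty, warning, heaperr) would be reached instead of printing or rebooting.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical outcome codes for the three paths in the listing. */
enum heap_status { HEAP_OK, HEAP_LOW, HEAP_FATAL };

/* Hypothetical model of lines 3383-3400.  *brk plays the role of the
 * variable break, sp the role of bx (sp after the push on line 3384). */
static enum heap_status sbrk_model(uint16_t *brk, uint16_t sp, uint16_t incr,
                                   uint16_t *old_break)
{
    assert(brk != 0 && old_break != 0);

    *old_break = *brk;                       /* push ax : future return value       */
    *brk = (uint16_t)(*brk + incr);          /* add ax,4(bx) ; mov break,ax         */

    if ((uint16_t)(sp - 1024) < *brk)        /* lea dx,-1024(bx) ; cmp ; jb heaperr */
        return HEAP_FATAL;
    if ((uint16_t)(sp - 4096) < *brk)        /* lea dx,-4096(bx) ; cmp ; jae plenty */
        return HEAP_LOW;
    return HEAP_OK;
}

/* brk is sbrk with the break forced to 0 first (lines 3379-3380),
 * so the old break reported to the caller is 0. */
static enum heap_status brk_model(uint16_t *brk, uint16_t sp, uint16_t addr,
                                  uint16_t *old_break)
{
    *brk = 0;
    return sbrk_model(brk, sp, addr, old_break);
}
```

The comparisons are unsigned, matching jb and jae in the listing.  The model does not reproduce the warn-once behaviour of line 3398.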
3407 .data