Note: This is moved from my old blog.
When writing your own real-mode bootstrap code, you have to deal with addresses correctly. My friend Zhuoqun and I have worked through some of the problems we encountered, and I will note them here. Note that this is my understanding, which seems correct, but if you think something is wrong, please leave a comment. I appreciate your help in making the blog more useful.
What Happened During Boot
Let's see a simplified version.
- BIOS loads your boot sector (the very first sector of your hard disk, 512 bytes, ending with the boot signature 0x55 0xAA) into memory at 0x7c00.
- CS, DS, ES, and SS are set to zero, and the CPU starts executing from CS:IP = 0000:7c00.
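In real mode, a segment:offset pair such as CS:IP names the physical address segment * 16 + offset. A minimal Python sketch of that arithmetic (the helper name physical_address is only illustrative, not part of any tool):

```python
def physical_address(segment: int, offset: int) -> int:
    """Real-mode translation: shift the segment left by 4 bits and add the offset."""
    return (segment << 4) + offset

# CS:IP = 0000:7c00, as in the simplified boot sequence above
print(hex(physical_address(0x0000, 0x7C00)))  # 0x7c00

# A different segment:offset pair can name the same physical byte
print(hex(physical_address(0x07C0, 0x0000)))  # 0x7c00
```

This is why the values of the segment registers matter as much as the offsets the linker computes.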
What Kind of Addresses Do We Have
In the programming world, there are many kinds of addresses. We have to keep them consistent in order for the program to work.
When we write code, we have something called the program counter (or location counter). That is the address within the text section of a program image, starting from zero. For example, we generate a helloworld.exe or helloworld.out; that is a program image which contains a header, a text section, and many other sections. Say we have a string constant located at program counter 0x0a, where 0x0a is the offset from the beginning of the text section. If I do an object dump of the image, I will get that string at exactly position 0x0a of the text section. However, if I do a binary dump, I will get the string at 0x0a plus the file offset of the text section. (Normally, the text section begins after the file header.) In what follows, I will assume that the programs are flat binary images, which have no headers, unless explicitly stated.
However, when that image gets loaded somewhere, say at 0x7c00, I will never find my string at 0x0a. Instead, I will find it at 0x7c00 + 0x0a = 0x7c0a. This is quite straightforward, because the program counter is the offset from the beginning of the image. My image is loaded at 0x7c00, so the string should be at 0x7c0a. This address, 0x7c00, is called the Load Memory Address, LMA.
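To make the arithmetic concrete, here is a small Python sketch that "loads" a flat image into a model of memory and looks for the string. The image contents (filler bytes followed by the string) are made up for illustration.

```python
LOAD_ADDRESS = 0x7C00   # where the image is loaded (the LMA)
STRING_OFFSET = 0x0A    # program counter of the string inside the image

# Hypothetical flat image: 0x0a filler bytes followed by the string.
image = b"\x90" * STRING_OFFSET + b"Hello World!"

# Model the first 64 KiB of memory and copy the image to the load address.
memory = bytearray(0x10000)
memory[LOAD_ADDRESS:LOAD_ADDRESS + len(image)] = image

found_at = memory.find(b"Hello World!")
print(hex(found_at))  # 0x7c0a
```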
There is also the VMA, Virtual Memory Address, which is the address the program actually runs at. I rarely see it differ, because most of the time the VMA equals the LMA. But when you are writing ROM code, the two can differ: a section's LMA is in the ROM where it is stored, while its VMA points to somewhere in RAM, because startup code copies the section from ROM into RAM before it is used.
What Does LD Do
Suppose I have multiple assembly source code files, and I want a single executable file. I will first assemble them into object files, and link them into a single executable file.
In the object files, every defined symbol is addressed by a program counter within that object file, and undefined symbols are left unresolved. When the object files are linked into a single executable file, LD translates their individual program counters into new program counters relative to the output file. Undefined symbols in one object file may find their definitions in other object files. LD does the search and translates them into the correct program counters.
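The idea can be sketched with a toy "linker" in Python. This is only a model of the bookkeeping, not how LD is implemented: the object layout (size, defined, refs) and the function name link are invented for the example, and real linkers also handle alignment, multiple sections, and relocation records.

```python
def link(objects):
    """Toy linker: place text sections back to back and resolve symbols.

    Each object is a dict with:
      "size":    length of its text section in bytes
      "defined": {symbol: program counter inside this object}
      "refs":    symbols it uses (possibly defined elsewhere)
    Returns {symbol: program counter in the output file}.
    """
    symbols = {}
    start = 0
    for obj in objects:
        for name, pc in obj["defined"].items():
            symbols[name] = start + pc   # rebase onto the output file
        start += obj["size"]
    # Every referenced symbol must now have a definition somewhere.
    for obj in objects:
        for name in obj["refs"]:
            if name not in symbols:
                raise KeyError(f"undefined symbol: {name}")
    return symbols

# Two hypothetical object files: one uses msg, the other defines it.
main_o = {"size": 0x10, "defined": {"_start": 0x0}, "refs": ["msg"]}
data_o = {"size": 0x0C, "defined": {"msg": 0x0}, "refs": []}

table = link([main_o, data_o])
print({name: hex(pc) for name, pc in table.items()})
# {'_start': '0x0', 'msg': '0x10'}
```

In its own object file msg sits at program counter 0, but in the output it lands after the first object's 0x10 bytes.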
Also, LD will add some headers to the output executable file. The headers contain a lot of information that tells the program loader how to load the image into memory, where the entry point is, and much more. The program loader is part of the operating system. When you double-click an icon, Windows uses a loader to load the program into memory and run it. There is one very important thing for LD to set up: the LMA.
After the LMA is specified, LD will offset constant address values in the image accordingly.
This does not change the logical layout of the program image. For some output formats, like "binary" and "pe", it does not affect the physical layout of the image file either. For others, like "elf32-i386", it also changes the physical layout of the file. For those not changed, if I find my string at byte 0x0a of the image with LMA = 0x1000, I can still find it at byte 0x0a when LMA = 0x7c00. For those changed, I will find my string at byte 0x100a of the image with LMA = 0x1000. In a word, LD links objects into one executable image, resolves addresses against the specified LMA, and changes the physical layout according to the output format. However, when you try to access the string in assembly code, you should pay close attention: LD will translate the address referenced in an operand from 0x0a to 0x7c0a with the help of the LMA.
Let's see some code. (On my machine, the assembler and linker generate PE format files. The linker won't change the physical layout when I change the LMA.)
    .code16
    .global _start
    _start:
        movw $msg, %ax
        leaw msg, %ax
    msg:
        .ascii "Hello World!"
And this is how you assemble and link it. The option -Ttext 0x0000 says that the LMA should be 0x0000. objcopy copies only the text section into a new flat binary file; it takes the input file first and the output file second.
    as -o helloworld.o helloworld.s
    ld -e _start -Ttext 0x0000 -o helloworld.out helloworld.o
    objcopy -O binary -j .text helloworld.out helloworld.img
After the first step, we get an object file, helloworld.o. Run objdump -D helloworld.o to see the disassembled code. You will get something like
    00000000 <_start>:
        movw $0007, %ax
        leaw ($0007), %ax
    00000007 <msg>:
        ...
This means the _start label is located at program counter 0x00000000, which is the start of the text section. Our msg is located at 0x00000007, and we can see that the assembler has already translated the operands into their actual program counters.
Run objdump -x helloworld.o and you will see
    ....
    Sections:
    Idx Name          Size      VMA       LMA       File off  Algn
      0 .text         00000014  00000000  00000000  0000008c  2**2
                      CONTENTS, ALLOC, LOAD, RELOC, READONLY, CODE
    ....
This means our text section has size 0x14, it will be loaded at LMA 0x00000000, and the section actually starts at 0x8c in our program file image. That is to say, the header may be of size 0x8c. We know that our msg is at 0x07, so in the real physical file it should be at 0x07 + 0x8c = 0x93.
Dump helloworld.o, and you will find "Hello World!" at 0x93:
    00000090  1e07 0048 656c 6c6f 2057 6f72 6c64 2190  ...Hello World!.
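The same check can be done mechanically. This Python sketch takes the sixteen bytes of the dump row above and compares the position of the string with the "File off" plus program counter calculation; nothing here is assumed beyond the numbers already shown.

```python
ROW_OFFSET = 0x90  # file offset of the first byte in the dump row
row = bytes.fromhex("1e07 0048 656c 6c6f 2057 6f72 6c64 2190")

SECTION_FILE_OFFSET = 0x8C  # "File off" of .text from objdump -x
MSG_PROGRAM_COUNTER = 0x07  # msg label from objdump -D

expected = SECTION_FILE_OFFSET + MSG_PROGRAM_COUNTER
found = ROW_OFFSET + row.find(b"Hello World!")
print(hex(expected), hex(found))  # 0x93 0x93
```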
Now, let's do the same thing for the linked file, helloworld.out. (The output below comes from a link with -Ttext 0x7c00, as the LMA column shows.)
    objdump -x helloworld.out
    Sections:
    Idx Name          Size      VMA       LMA       File off  Algn
      0 .text         00000024  00007c00  00007c00  00000200  2**2
                      CONTENTS, ALLOC, LOAD, READONLY, CODE

    objdump -D helloworld.out
    00007c00 <_start>:
        movw $7c07, %ax
        leaw ($7c07), %ax
    00007c07 <msg>:
        ...

    dump helloworld.out
    00000200  b807 7c8d 1e07 7c48 656c 6c6f 2057 6f72  8.|...|Hello Wor
    00000210  6c64 2190 ffff ffff 0000 0000 ffff ffff  ld!.............
You can see that the operands of movw and leaw have changed according to the LMA, and so has the program counter listed in front of _start. The section description tells us that the text section starts at offset 0x200 in the physical program image, which suggests that the header size has increased. We can still find the message string at 0x07 + 0x200 = 0x207.
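As a cross-check, the operands can be read straight out of the dumped bytes. The Python sketch below decodes the first seven bytes of the text section shown in the dump of helloworld.out, assuming the usual 16-bit encodings: b8 followed by a little-endian imm16 for the move, and 8d plus a ModRM byte followed by a little-endian disp16 for the load effective address.

```python
import struct

# First seven bytes of .text in helloworld.out, copied from the dump above.
code = bytes.fromhex("b8077c 8d1e077c")

# Byte 0 is the mov opcode; bytes 1-2 hold its 16-bit immediate.
(mov_imm,) = struct.unpack_from("<H", code, 1)
# Bytes 3-4 are the lea opcode and ModRM byte; bytes 5-6 hold the displacement.
(lea_disp,) = struct.unpack_from("<H", code, 5)

LINK_BASE = 0x7C00  # the LMA shown by objdump -x
print(hex(mov_imm), hex(lea_disp))  # 0x7c07 0x7c07
print(hex(mov_imm - LINK_BASE))     # 0x7
```

Subtracting the link base gives back 0x07, the program counter of msg in the unlinked object file.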
LEA, MOV, CALL
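When using a move instruction like the one above, the immediate address value (translated by the assembler and linker) is moved into the register. It is effectively the program counter of msg. The load effective address instruction computes the offset part of its memory operand and stores it in the register; it does not access memory and does not add a segment base. With a plain absolute operand such as msg, LEA therefore yields the same value as the MOV above. DS comes into play only when the address is actually dereferenced: the CPU then reads from DS:OFFSET, that is, physical address DS * 16 + OFFSET. If DS is zero, the offset equals the physical address, but DS and the other segment registers are not guaranteed to be zero, so set them explicitly before accessing data.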
The call instruction is another matter. I found that a near call is encoded with a relative displacement, like call -0x05, so it has nothing to do with the LMA. No matter where the program gets loaded, it will always call the correct address using the simple relative offset. A far call (lcall), on the other hand, encodes an absolute segment:offset pair, so it does depend on where the code is loaded.
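A short Python sketch shows why the relative form is independent of the load address. It assumes the 3-byte 16-bit encoding of a direct near call (opcode e8 plus rel16), where the displacement is added to the address of the following instruction; the positions 0x20 and 0x1e are invented for the example.

```python
def near_call_target(call_address: int, rel16: int) -> int:
    """Target of a 16-bit direct near call located at call_address.

    The displacement is relative to the next instruction (the call
    itself is 3 bytes long), and the result wraps at 16 bits.
    """
    next_ip = call_address + 3
    return (next_ip + rel16) & 0xFFFF

# Hypothetical layout: the call sits 0x20 bytes into the image and the
# callee 0x1e bytes in, so the encoded displacement is -0x05.
for base in (0x0000, 0x7C00):
    target = near_call_target(base + 0x20, -0x05)
    print(hex(base), hex(target - base))
# 0x0 0x1e
# 0x7c00 0x1e
```

The callee is reached at the same image offset for both load addresses, with the same encoded displacement.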
So, when writing a bootstrap, one should pay close attention to addresses. Another point is that there should be no headers, because the CPU will jump to 0x7c00 and start executing directly. There is no longer any program loader to read header information and load the entry code into memory.
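For completeness, here is a hedged Python sketch of what a header-free boot sector image looks like as a file: the raw code at offset zero, zero padding, and the conventional 0x55 0xAA signature in the last two of the 512 bytes. The helper name make_boot_sector and the two-byte example program are illustrative only; the helloworld.img produced by objcopy would still need this padding and signature from some other step.

```python
SECTOR_SIZE = 512
BOOT_SIGNATURE = b"\x55\xaa"

def make_boot_sector(code: bytes) -> bytes:
    """Pad raw machine code to one sector and append the boot signature."""
    room = SECTOR_SIZE - len(BOOT_SIGNATURE)
    if len(code) > room:
        raise ValueError("code does not fit in one boot sector")
    return code.ljust(room, b"\x00") + BOOT_SIGNATURE

# Example payload: eb fe is a short jump to itself (an endless loop).
sector = make_boot_sector(b"\xeb\xfe")
print(len(sector), sector[-2:].hex())  # 512 55aa
```

Because the code starts at byte 0 of the file, byte 0 is also the first instruction executed at 0x7c00.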