Writing Boot Sector Code
source link: https://susam.in/blog/writing-boot-sector-code/
Introduction
In this article, we discuss how to write our own "hello, world" program into the boot sector. At the time of this writing, most such code examples available on the web were meant for the Netwide Assembler (NASM). Very little material was available that could be tried with the readily available GNU tools like the GNU assembler (as) and the GNU linker (ld). This article is an effort to fill this gap.
Boot Sector
When the computer starts, the processor begins executing instructions at the reset vector, which lies 16 bytes below the top of the addressable memory (physical address 0xffff0 on the original 8086). This is usually a location in the BIOS ROM, so the processor executes the BIOS code. The BIOS runs a number of checks, including the power-on self-test (POST), and then finds the boot device. It loads the code from the boot sector of that device into memory and executes it. From here, the code in the boot sector takes control. In IBM-compatible PCs, the boot sector is the first sector of a data storage device and is 512 bytes in length. The following table shows what the boot sector contains.
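| Address (hex) | Address (dec) | Description | Size in bytes |
|---|---|---|---|
| 000 | 0 | Code | 440 |
| 1b8 | 440 | Optional disk signature | 4 |
| 1bc | 444 | 0x0000 | 2 |
| 1be | 446 | Four 16-byte entries for primary partitions | 64 |
| 1fe | 510 | 0xaa55 | 2 |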
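As a quick cross-check of the table, the following shell sketch recomputes each field's offset from the field sizes and confirms that they add up to 512 bytes. The function name mbr_layout is made up for this illustration and is not part of any tool.

```shell
# Recompute the boot sector field offsets from the field sizes in the table.
# mbr_layout is a hypothetical helper name used only in this sketch.
mbr_layout() {
    offset=0
    # Sizes in table order: code, disk signature, 0x0000, partition
    # entries, 0xaa55 signature.
    for size in 440 4 2 64 2; do
        # Columns: offset in hex, offset in decimal, size in bytes.
        printf '%03x %3d %d\n' "$offset" "$offset" "$size"
        offset=$((offset + size))
    done
    printf 'total %d\n' "$offset"
}

mbr_layout
```

The last line of the output should read "total 512", matching the sector size.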
This type of boot sector found in IBM-compatible PCs is also known as the master boot record (MBR). The next two sections explain how to write executable code into the boot sector. Two programs are discussed in these two sections: one that merely prints a character and another that prints a string.
The reader is expected to have a working knowledge of x86 assembly language programming using the GNU assembler. The details of assembly language are not discussed here. This article covers only how to write code for the boot sector.
The code examples were verified by using the following tools while writing this article:
- GNU assembler (GNU Binutils for Debian) 2.18
- GNU ld (GNU Binutils for Debian) 2.18
- dd (coreutils) 5.97
- DOSBox 0.72
Print Character
The following code prints a single character in yellow color on a blue background:
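.code16                     # assemble for 16-bit mode
.section .text
.globl _start
_start:
    mov $0xb800, %ax        # segment of the color text mode video memory
    mov %ax, %ds            # DS now points at the video memory
    movb $'A', 0            # character code at offset 0
    movb $0x1e, 1           # attribute at offset 1: yellow on blue
idle:
    jmp idle                # loop forever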
We save the above code in a file, say char.s, then assemble and link this code with the following commands:
as -o char.o char.s
ld --oformat binary -o char.com char.o
The .code16 directive tells the assembler that this code is meant for 16-bit mode. The _start label is meant to tell the linker that this is the entry point in the program.
The video memory of the VGA is mapped to various segments between 0xa000 and 0xc000 in the main memory. The color text mode is mapped to the segment 0xb800. The first two instructions move 0xb800 into the data segment register, so that any data offset specified is an offset in this segment. Then the code for the character 'A' (0x41 or 65 in ASCII) is moved into the first location in this segment and the attribute (0x1e) of this character into the second location. The higher nibble (0x1) is the attribute for the background color and the lower nibble (0xe) is that of the foreground color. The highest bit of each nibble is the intensifier bit. The other three bits represent red, green, and blue. This is shown in tabular form below.
| Attribute | Background | Foreground |
|---|---|---|
| Bits (I R G B) | 0 0 0 1 | 1 1 1 0 |
| Hex | 0x1 | 0xe |
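The way the two nibbles combine into one attribute byte can be sketched with shell arithmetic. The helper name vga_attr is an assumption made for this illustration. It takes the background and foreground values (each 0 to 15, bits in I R G B order) and prints the packed byte.

```shell
# Pack a background and a foreground color (each 0-15, bit order IRGB)
# into a single attribute byte: background in the high nibble,
# foreground in the low nibble.
# vga_attr is a hypothetical helper name used only in this sketch.
vga_attr() {
    back=$(( $1 ))
    fore=$(( $2 ))
    printf '0x%02x\n' $(( (back << 4) | fore ))
}

vga_attr 1 14   # blue background (0001), bright yellow foreground (1110)
```

With the values from the table, this prints 0x1e, the attribute used in the code above.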
We can see from the table that the background color is dark blue and the foreground color is bright yellow. We assemble and link the code with the as and ld commands mentioned earlier and generate an executable binary consisting of machine code.
Before writing the executable binary into the boot sector, we might want to verify whether the code works correctly with an emulator. DOSBox is a pretty good emulator for this purpose. It is available as the dosbox package in Debian. The ld command above already names the binary char.com, so we can run it with DOSBox with the following command:
dosbox -c cls char.com
The letter A printed in yellow on a blue background should appear in the first column of the first row of the screen.
In the ld command used earlier to generate the executable binary, we used the extension name com for the binary file to make DOSBox believe that it is a DOS COM file, i.e., merely machine code and data with no headers. In fact, the --oformat binary option in the ld command was meant for generating a binary with merely machine code and data without any headers. This is why we are able to run the binary with DOSBox for verification. If we do not use DOSBox, any extension name or no extension name for the binary would suffice.
Once we are satisfied with the output of char.com running in DOSBox, we write the binary and the MBR signature into the boot sector with these commands:
dd if=char.com of=/dev/sdb
printf '\x55\xaa' | dd seek=510 bs=1 of=/dev/sdb
Caution: One needs to be absolutely sure of the device path of the
device being written to. The device path /dev/sdb
is only
an example here. If the dd
command is used to write to the
wrong device, access to the data on it would be lost.
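One way to rehearse these steps without touching a real device is to build the boot sector in a regular file first. The sketch below is only an illustration under stated assumptions: the helper name make_boot_image and the file names are made up, and a two-byte stand-in payload (eb fe, a jump to itself) is used so that the sketch runs without an assembler. The conv=notrunc option is needed here because, unlike a block device, a regular file would otherwise be truncated by dd.

```shell
# Build a 512-byte boot sector image in a regular file.
# make_boot_image is a hypothetical helper: $1 is the flat binary to place
# at offset 0, $2 is the image file to create.
make_boot_image() {
    payload=$1
    image=$2
    # Start from 512 zero bytes.
    dd if=/dev/zero of="$image" bs=512 count=1 2>/dev/null || return 1
    # Copy the code to offset 0 without truncating the image.
    dd if="$payload" of="$image" conv=notrunc 2>/dev/null || return 1
    # Write the bytes 0x55 0xaa at offset 510. Octal escapes are used
    # because not every printf implementation understands \x escapes.
    printf '\125\252' | dd of="$image" bs=1 seek=510 conv=notrunc 2>/dev/null
}

# Stand-in payload (assumption): the bytes eb fe, i.e. a jump to itself.
# In practice, char.com would be passed instead.
workdir=$(mktemp -d)
printf '\353\376' > "$workdir/payload.bin"
make_boot_image "$workdir/payload.bin" "$workdir/boot.img"

# Show the last two bytes of the image; they should be 55 aa.
od -An -tx1 -j510 -N2 "$workdir/boot.img"
```

Once the image looks right, the same file could be written to the real device with a single dd command, subject to the caution above.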
Now booting the computer with this device should display the letter A in yellow on a blue background.
Print String
The following code prints a string in yellow color on a blue background:
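.code16
.section .data
message:
    .asciz "hello, world"   # null-terminated string
.section .text
.globl _start
_start:
    nop
    xor %di, %di            # DI = offset into the video memory
    mov $0xb800, %ax
    mov %ax, %ds            # DS = color text mode video segment
    mov $message, %si       # SI = offset of the string
move:
    xor %dx, %dx
    mov %cs:(%si), %dl      # read the next character via the code segment
    cmp $0, %dl             # end of string?
idle:
    jz idle                 # if so, loop here forever
    mov %dl, (%di)          # write the character
    inc %di
    movb $0x1e, (%di)       # write the attribute: yellow on blue
    inc %di
    inc %si
    jmp move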
There are two sections in this code. The data section has the null-terminated string to be displayed. The text section has the code. The code moves the first byte of the string to the location 0xb800:0x0000, its attribute to 0xb800:0x0001, the second byte of the string to 0xb800:0x0002, its attribute to 0xb800:0x0003, and so on until the string terminates, which is detected by the null byte at the end. The statement mov %cs:(%si), %dl moves one character from the string indexed by the SI register in the code segment into the DL register. The reason why we are reading the characters from the code segment will become clear after understanding the linker commands discussed below.
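The addressing pattern of the copy loop can be written down as a small calculation: the character at position i of the string lands at offset 2i in the video segment and its attribute at offset 2i + 1. The helper name cell_offsets below is an assumption made for this sketch.

```shell
# For the character at position i in the string (counting from 0), print
# the video memory offsets of its character byte and its attribute byte.
# cell_offsets is a hypothetical helper name used only in this sketch.
cell_offsets() {
    i=$(( $1 ))
    printf '0x%04x 0x%04x\n' $(( 2 * i )) $(( 2 * i + 1 ))
}

cell_offsets 0   # first character:  0x0000 0x0001
cell_offsets 1   # second character: 0x0002 0x0003
```

This mirrors the two inc %di instructions per character in the loop.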
While booting, the BIOS reads the code from the first sector of the boot
device into the memory at physical address 0x7c00 and jumps to that
address. However, while testing with DOSBox, things are a little
different. In DOS, the text section is loaded at an offset 0x100 in the
code segment. This should be specified to the linker while linking so
that it can correctly resolve the value of the label named
message
. Therefore the object file has to be linked twice:
once for testing it with DOSBox and once again before writing it into
the boot sector.
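The segment and offset values mentioned so far map to physical addresses by the usual real-mode rule: physical address = segment × 16 + offset. The sketch below applies that rule. The helper name phys_addr is made up, and the pair 0x0000:0x7c00 assumes the BIOS jumps to the boot code with the code segment register set to 0.

```shell
# Real-mode address translation: physical = segment * 16 + offset.
# phys_addr is a hypothetical helper name used only in this sketch.
phys_addr() {
    printf '0x%05x\n' $(( ($1 << 4) + $2 ))
}

phys_addr 0xb800 0x0000   # start of the color text mode video memory
phys_addr 0x0000 0x7c00   # boot sector load address, assuming CS = 0
```

The first call prints 0xb8000 and the second prints 0x07c00.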
To understand the offset at which the data section can be put, it is worth looking at what the binary looks like after a trial link with the following commands:
as -o string.o string.s
ld --oformat binary -Ttext 0 -Tdata 100 -o string.com string.o
objdump -bbinary -mi8086 -D string.com
xxd -g1 string.com
The -Ttext 0
option tells the linker to assume that the
text section should be loaded at offset 0x0 in the code segment.
Similarly, the -Tdata 100
tells the linker to assume that
the data section is at offset 0x100.
The objdump
command is used to disassemble the file. This
shows where the text section and data section are placed. Let us take a
close look at this portion of the output:
1b: 47 inc %di
1c: 46 inc %si
1d: eb ec jmp 0xb
...
ff: 00 68 65 add %ch,0x65(%bx,%si)
102: 6c insb (%dx),%es:(%di)
103: 6c insb (%dx),%es:(%di)
This portion shows the end of the text section and beginning of the data section.
The output of the xxd command mentioned above looks like this (a repeated sequence of zeros has been replaced with ... for the sake of brevity):
00000000: 90 31 ff b8 00 b8 8e d8 be 00 01 31 d2 2e 8a 14 .1.........1....
00000010: 80 fa 00 74 fe 88 15 47 c6 05 1e 47 46 eb ec 00 ...t...G...GF...
00000020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
...
000000e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
000000f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
00000100: 68 65 6c 6c 6f 2c 20 77 6f 72 6c 64 00 hello, world.
Both outputs above show that the text section occupies the first 0x1f bytes (31 bytes), i.e., offsets 0x0 to 0x1e. The data section is 0xd bytes (13 bytes) in length. We have 0x1b8 bytes (440 bytes) in the boot sector where we can put our binary. To fit the entire binary into the first 440 bytes, let us create a binary where the region from offset 0x0 to offset 0x1e contains the text section and the region from offset 0x20 to offset 0x2c contains the data section. The byte at offset 0x1f is going to remain unused. The total length of the binary would then be 0x2d bytes (45 bytes). We will create a new binary as per this plan.
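The plan can be double-checked with a little arithmetic. The sketch below takes the text size (31 bytes, covering offsets 0x0 to 0x1e) and the data size (13 bytes), places the data at the next 16-byte boundary, and checks that the result fits in the 440-byte code area. The helper name plan_layout and the choice of a 16-byte boundary are assumptions of this sketch. The text above only says that the data starts at 0x20.

```shell
# Plan the layout: text at offset 0, data at the next 16-byte boundary.
# plan_layout is a hypothetical helper; the 16-byte alignment is an
# assumption that happens to reproduce the offset 0x20 chosen in the text.
plan_layout() {
    text_size=$(( $1 ))
    data_size=$(( $2 ))
    data_off=$(( (text_size + 15) / 16 * 16 ))
    total=$(( data_off + data_size ))
    printf 'data_off=0x%x total=0x%x (%d bytes)\n' "$data_off" "$total" "$total"
    # Succeed only if the binary fits in the 440-byte code area.
    [ "$total" -le 440 ]
}

plan_layout 0x1f 0xd
```

For these sizes the sketch prints data_off=0x20 total=0x2d (45 bytes), which agrees with the plan.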
However while creating the new binary, we should remember that DOS would
load the binary at offset 0x100, so we need to tell the linker to assume
0x100 as the offset of the text section and 0x120 as the offset of the
data section, so that it resolves the value of the label named
message
accordingly. We create a new binary in this manner
and test it with DOSBox with these commands:
ld --oformat binary -Ttext 100 -Tdata 120 -o string.com string.o
dosbox -c cls string.com
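The section addresses passed to the linker are simply the load address plus the offsets from the plan. The sketch below derives the -Ttext and -Tdata values for a given load address. The helper name link_flags is made up for this illustration, and the values are printed in hexadecimal without a prefix, the form used in the ld commands in this article.

```shell
# Derive the -Ttext and -Tdata values from a load address and the offset
# of the data section within the binary.
# link_flags is a hypothetical helper name used only in this sketch.
link_flags() {
    base=$(( $1 ))
    data_off=$(( $2 ))
    printf '%s %x %s %x\n' -Ttext "$base" -Tdata $(( base + data_off ))
}

link_flags 0x100 0x20    # DOS COM file loaded at offset 0x100
link_flags 0x7c00 0x20   # boot sector loaded at 0x7c00
```

The first call prints -Ttext 100 -Tdata 120, matching the ld command above. The second prints -Ttext 7c00 -Tdata 7c20, the values needed when the BIOS loads the binary at 0x7c00.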
If everything looks fine, we link it once again for boot sector and write it to the boot sector of our device.
ld --oformat binary -Ttext 7c00 -Tdata 7c20 -o string string.o
dd if=string of=/dev/sdb
printf '\x55\xaa' | dd seek=510 bs=1 of=/dev/sdb
Caution: Again, one needs to be very careful with the dd
commands here. The device path /dev/sdb
is only an example.
This path must be changed to the path of the actual device one wants to
write the boot sector binary to.
Once written to the device successfully, the computer may be booted with
this device to display the "hello, world"
string on the
screen.
© 2006–2020 Susam Pal