This post was originally published here
My journey on learning to build a simple OS
How many times have you read an OS book but not been able to code one? Operating system (OS) books are tedious, and theory alone makes it hard to understand how an OS actually works. Here is my attempt to write a simple OS and document some of the concepts I learned along the way.
Before You Start
On a Mac, install Homebrew and then run brew install qemu nasm
On some systems qemu is split into multiple binaries, so you may need to call qemu-system-x86_64 binfile instead.
To test these low-level programs without continuously rebooting a machine or risking scrubbing important data off a disk, we will use a CPU emulator, QEMU.
I'm working on a Mac (with an M1 chip). QEMU has some issues with the M1 chip, so you can run these experiments inside a Docker container:
docker run -it ubuntu bash
Run QEMU with the -nographic and -curses arguments inside the Docker container to display the VGA output when in text mode.
NASM is an assembler and disassembler for the Intel x86 architecture. It can be used to write 16-bit, 32-bit (IA-32) and 64-bit (x86-64) programs.
When we start our computer, initially, it has no notion of an operating system. Somehow, it must load the operating system --- whatever variant that may be --- from some permanent storage device that is currently attached to the computer (e.g. a floppy disk, a hard disk, a USB dongle, etc.).
The Boot Process
Booting an operating system consists of transferring control along a chain of small programs, each one more “powerful” than the previous one, where the operating system is the last “program”.
When the PC is turned on, the computer will start a small program that adheres to the Basic Input Output System (BIOS) standard. This program is usually stored on a read-only memory chip on the motherboard of the PC. BIOS is a collection of software routines that are initially loaded from a chip into memory and initialised when the computer is switched on. BIOS provides auto-detection and basic control of your computer’s essential devices, such as the screen, keyboard, and hard disks.
Note: Modern operating systems do not use the BIOS functions; instead they use drivers that interact directly with the hardware, bypassing the BIOS. Today, BIOS mainly runs some early diagnostics (the power-on self-test) and then transfers control to the bootloader.
BIOS cannot simply load a file that represents your operating system from a disk, since BIOS has no notion of a file system. BIOS must read specific sectors of data (usually 512 bytes in size) from specific physical locations on the disk devices, such as Cylinder 2, Head 3, Sector 5.
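To make the idea of a "physical location" concrete, here is a rough sketch that turns a Cylinder/Head/Sector triple into a linear sector index and a byte offset. The geometry (16 heads, 63 sectors per track) and the function name chs_to_lba are assumptions chosen for illustration, not something a real disk is guaranteed to use. It also assumes the usual convention that sector numbers start at 1 while cylinders and heads start at 0.

```shell
#!/bin/sh
# Assumed disk geometry (illustrative values only).
HEADS=16
SECTORS_PER_TRACK=63
SECTOR_SIZE=512

# chs_to_lba CYLINDER HEAD SECTOR
# Prints the linear (0-based) sector index for a CHS address.
chs_to_lba() {
    echo $(( ($1 * $HEADS + $2) * $SECTORS_PER_TRACK + ($3 - 1) ))
}

lba=$(chs_to_lba 2 3 5)
offset=$(( $lba * $SECTOR_SIZE ))
echo "CHS 2/3/5 -> LBA $lba, byte offset $offset"
```

With this assumed geometry the script prints CHS 2/3/5 -> LBA 2209, byte offset 1131008, and chs_to_lba 0 0 1 gives 0, the very first sector of the disk.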
So, the easiest place for BIOS to find our OS is in the first sector of one of the disks (i.e. Cylinder 0, Head 0, Sector 1, since sector numbers start at 1), known as the boot sector. To make sure that the disk is bootable, the BIOS checks that bytes 511 and 512 of the alleged boot sector are 0x55 and 0xAA (the 16-bit value 0xAA55 stored in little-endian order). If so, the BIOS loads the first sector to the address 7C00h, sets the program counter to that address, and lets the CPU execute code from there. This is the simplest boot sector ever:
e9 fd ff 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 29 more lines with sixteen zero-bytes each ]
00 00 00 00 00 00 00 00 00 00 00 00 00 00 55 aa
Note that, in the above boot sector, the three important features are:
1) The initial three bytes, written in hexadecimal as 0xe9, 0xfd and 0xff, are actually machine code instructions, as defined by the CPU manufacturer, to perform an endless jump.
2) The last two bytes, 0x55 and 0xaa, make up the magic number 0xaa55 (stored in little-endian format), which tells BIOS that this is indeed a boot block and not just data that happens to be on a drive’s boot sector.
3) The file is padded with zeros (most of them omitted above for brevity), basically to position the magic BIOS number at the end of the 512-byte disk sector.
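The three features listed above can be reproduced by hand, without an assembler. The following is a minimal sketch using only standard shell tools; the file name handmade_boot.bin is made up for this example, and the bytes are written with octal escapes because those are the portable form for printf.

```shell
#!/bin/sh
# Build a 512-byte boot sector by hand (the file name is illustrative).
img=handmade_boot.bin
{
    printf '\351\375\377'                        # e9 fd ff: the endless jump
    dd if=/dev/zero bs=1 count=507 2>/dev/null   # zero padding up to byte 510
    printf '\125\252'                            # 55 aa: the magic number
} > "$img"

# Check the total size and the last two bytes.
size=$(wc -c < "$img" | tr -d ' ')
last_two=$(tail -c 2 "$img" | od -An -tx1 | tr -d ' \n')
echo "size=$size signature=$last_two"
```

This prints size=512 signature=55aa: 3 bytes of code, 507 bytes of padding and 2 bytes of magic number add up to exactly one sector.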
The first sector is called the Master Boot Record, or MBR. The program in the first sector is called the MBR bootloader.
So, BIOS loops through each storage device (e.g. floppy drive, hard disk, CD drive, etc.), reads the boot sector into memory, and instructs the CPU to begin executing the first boot sector it finds that ends with the magic number. This is where we seize control of the computer.
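The scan described above can be imitated on ordinary files. This is only a sketch of the idea, not how any real BIOS is implemented: the image files stand in for storage devices, and the helper names has_boot_signature and first_bootable are invented for this example.

```shell
#!/bin/sh
# Succeeds if bytes 511 and 512 of file $1 are 55 aa.
has_boot_signature() {
    sig=$(dd if="$1" bs=1 skip=510 count=2 2>/dev/null | od -An -tx1 | tr -d ' \n')
    [ "$sig" = "55aa" ]
}

# Prints the first argument that carries the signature, like the BIOS scan.
first_bootable() {
    for dev in "$@"; do
        if has_boot_signature "$dev"; then
            echo "$dev"
            return 0
        fi
    done
    return 1
}

# Two stand-in "devices": a blank sector and one ending in the magic number.
dd if=/dev/zero of=blank.img bs=512 count=1 2>/dev/null
{ dd if=/dev/zero bs=1 count=510 2>/dev/null; printf '\125\252'; } > bootable.img

first_bootable blank.img bootable.img
```

The blank image is skipped because its last two bytes are zeros, so the script prints bootable.img.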
The BIOS program will transfer control of the PC to a program called a bootloader. A bootloader loads an OS, or an application that runs and communicates directly with the hardware. To run an OS, the first thing to write is a bootloader. Here is a simple bootloader.
;
; A simple boot sector program that loops forever.
;

; Define a label, "loop", that will allow
; us to jump back to it, forever.
loop:

; Use a simple CPU instruction that jumps
; to a new memory address to continue execution.
; In our case, jump to the address of the current
; instruction.
    jmp loop

; When compiled, our program must fit into 512 bytes,
; with the last two bytes being the magic number,
; so here, tell our assembly compiler to pad out our
; program with enough zero bytes (db 0) to bring us to the
; 510th byte.
times 510-($-$$) db 0

; Last two bytes (one word) form the magic number,
; so BIOS knows we are a boot sector.
dw 0xaa55
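To see why a jump "to the current instruction" can be encoded as e9 fd ff, it helps to do the arithmetic: the CPU adds a signed displacement to the address of the next instruction. The sketch below works this out in shell arithmetic. The function name jump_target is made up for this example, and the second call covers the two-byte short form eb fe, which an assembler may pick instead depending on its version and optimisation settings.

```shell
#!/bin/sh
# jump_target LENGTH DISPLACEMENT BITS
# Where does a relative jump placed at offset 0 land?
# LENGTH is the instruction size in bytes, DISPLACEMENT the raw unsigned
# value read from the instruction, BITS the displacement width (8 or 16).
jump_target() {
    disp=$2
    # Interpret the displacement as a signed two's-complement number.
    if [ "$disp" -ge $(( 1 << ($3 - 1) )) ]; then
        disp=$(( $disp - (1 << $3) ))
    fi
    # The displacement is relative to the address of the next instruction.
    echo $(( $1 + $disp ))
}

jump_target 3 $(( 0xfffd )) 16   # e9 fd ff: displacement bytes fd ff, little-endian
jump_target 2 $(( 0xfe )) 8      # eb fe: the short form of the same loop

# Padding emitted by "times 510-($-$$) db 0" after a 3-byte jump.
code_len=3
padding=$(( 510 - $code_len ))
echo "$padding"
```

Both calls print 0, meaning the jump lands back on itself at the start of the sector, and the last line prints 507, the number of zero bytes needed before the two-byte magic number.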
We assemble the code with nasm and write it to a bin file:
nasm -f bin boot_sect_simple.asm -o boot_sect_simple.bin
Let's try it out by booting the binary in QEMU:
qemu boot_sect_simple.bin
On some systems, you may have to run
qemu-system-x86_64 boot_sect_simple.bin
You will see a window open which says "Booting from Hard Disk..." and nothing else. There you go, a simple bootloader is ready!
Continue reading more here
Sorry, copy pasting it from ghost was tough!
Top comments (3)
Nice article! However, please ensure that you properly attribute the original author of some portions of your guide:
On the article you linked, there's some code from the same guide that I believe you copied from github.com/cfenollosa/os-tutorial/... too.
Correctly adding attribution helps highlight and credit the original authors whose work may have saved you time when writing this article. It also helps others to find references to the original article, as well. Thanks!
The link which you've posted on GitHub is a summary by cfenollosa of a book and not his code. The code I've written is a simple boot sector which is built on interrupts, something which you can find in various books, and is something basic.
Yes, I will reference the textbooks I am referring to in the upcoming post for further research.
I see! Didn't know that, my bad. I was just checking; attributions are quite important tbh. Good to hear that you'll add references to posts in the future 👍