GitHub repository: https://github.com/DonaldKellett/marvelos
It's been just over a month since I started writing an operating system kernel from scratch for the RISC-V architecture, specifically the QEMU RISC-V `virt` board. Well, kind of. I've been following The Adventures of OS closely, porting the Rust code to C along the way, making incremental improvements to the code and re-organizing the project structure as I see fit. At the time of writing (2022-10-19), I've implemented a round-robin scheduler that juggles 3 copies of the same user process, hardcoded within the kernel, ad infinitum. That is something, but it is still some way from loading and executing an actual user program from disk, and definitely a long way from a usable system that can, for example, spawn an interactive command-line shell.
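For readers who have not met the term, the core of round-robin selection can be sketched in a few lines of C. This is a minimal illustration under my own assumptions, not the code from the repository: the `struct proc` layout, the `PROC_RUNNABLE`/`PROC_DEAD` states and the `rr_next` helper are all hypothetical names, and a real kernel would also save and restore registers on every switch.

```c
#include <stddef.h>

#define NPROC 3 /* assumption: three process slots, matching the three copies above */

/* Hypothetical process states for this sketch. */
enum proc_state { PROC_RUNNABLE, PROC_DEAD };

/* Hypothetical, heavily simplified process table entry. */
struct proc {
    int pid;
    enum proc_state state;
};

/*
 * Return the index of the next runnable process after `current`,
 * scanning the table in order and wrapping around at the end.
 * `current` must be in the range [0, n). Returns -1 if no entry
 * is runnable.
 */
static int rr_next(const struct proc *table, size_t n, int current)
{
    for (size_t step = 1; step <= n; ++step) {
        size_t idx = ((size_t)current + step) % n;
        if (table[idx].state == PROC_RUNNABLE)
            return (int)idx;
    }
    return -1;
}
```

A timer interrupt handler would typically call something like `rr_next` on every tick and switch to the returned slot, which is what makes the processes take turns indefinitely.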
In this article, I'm going to present my initial motivations and goals for embarking on this project, the background knowledge I found indispensable, what I learned while working on the project, and pointers to resources along the way. If you're also interested in RISC-V and operating systems from a practical perspective, and are undecided on whether to give OSDev a try, then this article is for you!
The main reasons I decided to embark on this journey are:
- To learn more about the RISC-V architecture from a programmer's perspective
- To learn how an OS kernel works, inside out, from a practical standpoint
- To be able to brag to my colleagues that I managed to write an OS kernel from scratch
I found the following areas of knowledge indispensable for getting started with OSDev on RISC-V:
- A solid theoretical understanding of computer architecture and operating systems, through courses I took in my undergraduate Computer Science curriculum. An alternative is to search for related undergraduate textbooks often available online at no cost and self-study them at your own pace:
  - Computer architecture: Introduction to MIPS Assembly Language Programming
  - Operating systems: Operating Systems: Three Easy Pieces
- Familiarity with the Linux command line: best if you have at least 2 years of professional experience working with the Linux command line, or alternatively, study for and pass a performance-based Linux exam such as LFCS like I did (RHCSA will also do)
- A solid grasp of at least one systems programming language such as C, C++ or Rust. In particular, if going for Rust, reading The Rust Programming Language and completing most, if not all, of its exercises is strongly recommended
- Familiarity with compiling software from source and build systems. If not already familiar with an existing build system such as Make or Ninja, the best way to familiarize yourself with them is to simply build and install a bunch of software from source, such as giving Linux From Scratch (LFS) a go - by the time you manage to complete LFS, you'll certainly be able to recite `configure; make; make install` blindfolded ;-)
Apart from the specific topics presented within "The Adventures of OS", which I've "completed" up to and including chapter 8 at the time of writing, here are some of the main things I've learned and/or got the chance to practice along the way:
- The roles of stack, frame and global pointers, and how to initialize them in assembly so we can jump into a higher-level language like C or Rust early on. RISC-V from scratch 2: Hardware layouts, linker scripts, and C runtimes contains a detailed explanation of this, among a few other things
- How programs, whether a userspace program or an OS kernel, are typically laid out in memory, and how to construct a linker script accordingly. Again, "RISC-V from scratch 2" proved tremendously useful for my initial understanding, though it does this by modifying an auto-generated linker script from `ld --verbose`. This is fine as a first step to get your Hello World kernel up and running, but could devolve into an obstacle as you attempt to develop your OS kernel further. In any case, it's strongly recommended to rewrite the entire linker script from scratch as soon as possible so as to have complete control over it and a complete understanding of how it works. Some resources I consulted to realize this transition were:
- How to automate building and running the project using `make`; in particular, leveraging variables in the `Makefile` to elegantly apply the same command line options for compiling each file in the codebase - because trust me, you'll need a ton of command-line options ;-) For this, I based my initial Makefile on that found in the source code for "The Adventures of OS", e.g. this
- The device tree specification, how to generate a DTS file with QEMU and how to read it. Once again, "RISC-V from scratch 2" provides a nice introduction, and the rest can be learned by reading the official specification, which really isn't that long - you can go through the whole thing in an afternoon if you stay focused. Understanding DTS is crucial for figuring out why, for example, to power off the QEMU RISC-V `virt` board, you need to write the value `0x5555` as a 32-bit integer to the memory address
- Why and how to organize a project at this scale into multiple files grouped into subdirectories based on their individual functions and categories; for example, I placed the code related to paging and virtual memory under `src/mm/`, that related to processes and scheduling under
- I got the chance to properly explore macros in C; in particular, function-like macros and variadic macros, though I admit I might have abused them in some places like handling the platform-level interrupt controller (PLIC) (-:
- How to format code with automated tools like GNU indent so the coding style stays consistent throughout the codebase, without the weird indentation that comes from accidentally mixing tabs and spaces etc. (ewwwww!!!) I also got to practice Linux commands like `find` to search for all source code files and apply formatting to them automatically and `sed` to work around an issue with binary literals, and I integrated those commands into my Makefile as a `format` target so I don't have to memorize and type them manually
- Perhaps most importantly, how useful a debugger like GDB could be and how it could be a life-saver, if you know how to use one properly. I've never been a fan of debuggers and have been a proponent of `printf` debugging for as long as I have been programming. As Brian Kernighan once concluded in "Unix for Beginners", "The most effective debugging tool is still careful thought, coupled with judiciously placed print statements." And that may be true of most programming scenarios, where it is trivial to insert `printf` statements (or equivalent) around the offending code and see what gets printed out almost immediately by running the code once again. But when your kernel unexpectedly hangs in the middle of some operation for one of countless possible reasons, you may not have the luxury to inspect the system state by printing stuff to the console, since you might not even know where the program execution has jumped to! In that case, the only plausible manner to determine what exactly happened is to hook your system up to the debugger and step through the code line by line, instruction by instruction, inspecting values that you might not otherwise have been able to print to the console along the way (such as the values of specific registers controlling whether you can print any stuff to the console at all!). "RISC-V from scratch 2" covers the very basics of using GDB, and a GDB quick reference could be useful for more advanced GDB usage (or just Google stuff such as "how to print the corresponding function for an instruction address in GDB" along the way)
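To make the device tree and macro items above a bit more concrete, here is a minimal sketch in C of the power-off write, expressed through a function-like macro. It is an illustration under stated assumptions rather than the code from the repository: `MMIO_WRITE32`, `POWEROFF_MAGIC` and `poweroff` are hypothetical names, and the register address is deliberately left as a parameter (`test_dev`) because it should be read from the DTS that QEMU generates rather than taken from this sketch.

```c
#include <stdint.h>

/*
 * Function-like macro: store a 32-bit value to a memory-mapped register.
 * The volatile-qualified access tells the compiler the store has a side
 * effect and must not be optimized away.
 */
#define MMIO_WRITE32(addr, val) \
    (*(volatile uint32_t *)(addr) = (uint32_t)(val))

/* The power-off value named in the text above. */
#define POWEROFF_MAGIC 0x5555u

/*
 * Request power-off by writing the magic value as a 32-bit integer.
 * `test_dev` is a hypothetical parameter: the address of the relevant
 * device register, as found in the device tree.
 */
static void poweroff(volatile uint32_t *test_dev)
{
    MMIO_WRITE32(test_dev, POWEROFF_MAGIC);
}
```

On a hosted system you can point `test_dev` at an ordinary `uint32_t` variable to check that the store happens; in the kernel, the same store would go to the device register, and the function would not be expected to return.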
Writing an OS kernel from scratch is no easy task - a solid foundation in computer science is indispensable, you need to be comfortable with the command line and proficient in at least one systems programming language, care must be taken to organize the project in a sustainable manner, etc.
Still, with the required background and the aid of well-written tutorials often available online at no cost, plus the dedication to get to the bottom of every little detail involved and the persistence to keep working on the project bit by bit, feature by feature, it is feasible to get up to a usable system (for some definition of "usable") within a reasonable timeframe, such as a few months or years.
If you happen to share an interest in RISC-V and operating systems from a practical perspective and are undecided on whether to take the plunge, this article will hopefully serve as a reference for making an informed decision on whether OSDev is for you. Finally, recall that the largest OSDev community with the most comprehensive variety of resources is over at the OSDev wiki, so do give that a look if you decide to embark on your journey.
Stay tuned for more articles by @donaldsebleung ;-)