Beka Modebadze

Posted on Aug 13, 2021 • Edited on Jun 4, 2022

Getting Started with Systems Programming with Rust (Part 1)

#rust #linux #systems #programming

You can find the original post on my personal blog.

A modern computer is a very complex creation that evolved into the current state through decades of research and development. Sometimes it appears to be like black magic. There’s no magic in it, just science. However, some of the minds like Alan Turing, Charles Babbage, Ada Lovelace, John von Neumann, and many others are magical, as they made this possible.

Ok, that’s enough of introductions and let us dive into the fundamentals of systems programming. In this part we’ll learn:

- What is a process?

- How are processes created and executed?

- How do code examples in Rust compare to C?

Before diving into code we’ll start to build up from the lowest level of the main components of the operating systems. As shown in Figure 1-a the lowest level of any computer is Hardware, next comes the Kernel mode which runs on bare metal. This is where the operating system, like Linux, is located.

Figure 1-a.

On top of the Kernel mode, we have a User mode. For a user to be able to interact with the kernel and use other higher-level software, like a web browser or an e-mail reader, a user interface program is required. This can be a window-based Graphical User Interface, or it can be a shell, which is a command interpreter that reads commands from a terminal and executes them.

Processes: Parent and Child

The main concept in all operating systems is a process. A process is basically a running program. You can think of it as a drawer that contains all the information about that particular program. Some processes start when the computer starts, some run in the background, and some are started by the user and interacted with directly, through the shell, for example.

Every process has an ID. The very first process is started when the system boots. It has an ID of 1 and is called init. After that, init starts other processes, which start others in turn. When we type a command in a shell, say one that compiles a program, the system creates a new process to run the compiler. When that process has finished compiling, it makes a system call to terminate itself.

In UNIX systems every new process is a child process of some parent process. Process creation is done by cloning a parent process, which is referred to as forking (Figure 1-b). Each process has one parent but can have multiple child processes. The structure of the processes resembles a tree, where init is the root, meaning it’s at the top of the hierarchy.
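To make the parent/child relationship concrete, here is a minimal Rust sketch that prints the ID of the current process and of its parent. It assumes a Unix-like system, because parent_id lives in the Unix-only part of the standard library; the helper name pid_and_parent is made up for this illustration.

```rust
use std::os::unix::process::parent_id;
use std::process;

/// Returns (own PID, parent PID) for the calling process.
/// `pid_and_parent` is a hypothetical helper name used for this sketch.
fn pid_and_parent() -> (u32, u32) {
    (process::id(), parent_id())
}

fn main() {
    let (pid, ppid) = pid_and_parent();
    println!("my PID is {pid}, my parent's PID is {ppid}");
}
```

When run from a shell, the parent ID should match the shell's own process ID, which you can check with echo $$.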

Right after creation, the parent and the child are nearly identical copies. They differ in what fork returns: the parent receives the child’s process ID, while the child receives 0 (the child’s actual process ID is a new, unique number). Next, the system replaces the program running in the child with a new one. When the process has fulfilled its purpose, it terminates normally by exiting (voluntary). A process can also exit because of an error or be killed by another process (involuntary).
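The difference between voluntary and involuntary termination is visible from the parent's side. The sketch below, which assumes a Unix-like system with sh and sleep available, runs one child that exits on its own and one that the parent kills, then inspects each ExitStatus. The describe function is a made-up helper for this example.

```rust
use std::os::unix::process::ExitStatusExt;
use std::process::{Command, ExitStatus};

/// Describes how a child ended: by exiting on its own or by being killed by a signal.
fn describe(status: ExitStatus) -> String {
    match (status.code(), status.signal()) {
        (Some(code), _) => format!("exited voluntarily with code {code}"),
        (None, Some(sig)) => format!("killed by signal {sig}"),
        (None, None) => "ended for an unknown reason".to_string(),
    }
}

fn main() -> std::io::Result<()> {
    // Voluntary: the shell calls exit itself.
    let voluntary = Command::new("sh").args(["-c", "exit 3"]).status()?;
    println!("{}", describe(voluntary));

    // Involuntary: the parent kills a long-running child (Child::kill sends SIGKILL on Unix).
    let mut child = Command::new("sleep").arg("30").spawn()?;
    child.kill()?;
    let involuntary = child.wait()?;
    println!("{}", describe(involuntary));
    Ok(())
}
```

On Unix, code() returns None when the process was ended by a signal, which is why the match checks the exit code first and falls back to signal().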

Figure 1-b.

The system also keeps track of all the processes, maintaining their data in what’s called a process table. It holds information like the process ID, the process owner, the process priority, the environment variables of each process, and the parent process. In addition to that, it records what state each process is in. Each process can be in one of the following four states:

RUNNABLE: The process is either running on the CPU or ready to run as soon as the scheduler picks it.

SLEEPING: The process is waiting for an event, such as I/O completing or another process finishing, and does not use the CPU until then.

STOPPED: The process has been suspended and will not run until a signal restarts it.

ZOMBIE: The process has terminated, either by calling exit itself or by being killed, but its entry has not yet been removed from the process table because the parent has not collected its exit status.

Processes often have to interact with each other, and in doing so they change state: a running process goes to sleep while it waits, then becomes runnable again (Figure 1-c). A process can also be suspended from the outside. Pressing Ctrl + Z in a terminal sends SIGTSTP to the foreground process, which stops it (SIGSTOP has the same effect but cannot be caught or ignored), and a SIGCONT signal restarts it. We’ll review signals in depth in upcoming parts. The exception is the zombie state: a zombie has already terminated, so it can’t be restarted or continued.
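The stop and continue transitions can be tried out from Rust as well. This sketch assumes Linux (it reads /proc) and sends signals through the shell's built-in kill, since the standard library has no direct API for arbitrary signals. The helpers state_letter, send_signal and wait_for_state are invented for this example, and the one-second polling limit is arbitrary.

```rust
use std::fs;
use std::process::Command;
use std::thread::sleep;
use std::time::{Duration, Instant};

/// First letter of the "State:" field in /proc/<pid>/status (Linux only).
fn state_letter(pid: u32) -> Option<char> {
    let status = fs::read_to_string(format!("/proc/{pid}/status")).ok()?;
    let line = status.lines().find_map(|l| l.strip_prefix("State:"))?;
    line.trim().chars().next()
}

/// Sends a signal by name using the shell's built-in `kill`.
fn send_signal(name: &str, pid: u32) -> std::io::Result<bool> {
    let status = Command::new("sh")
        .arg("-c")
        .arg(format!("kill -s {name} {pid}"))
        .status()?;
    Ok(status.success())
}

/// Polls until the process reaches state `want` or one second has passed.
fn wait_for_state(pid: u32, want: char) -> bool {
    let deadline = Instant::now() + Duration::from_secs(1);
    while Instant::now() < deadline {
        if state_letter(pid) == Some(want) {
            return true;
        }
        sleep(Duration::from_millis(10));
    }
    false
}

fn main() -> std::io::Result<()> {
    let mut child = Command::new("sleep").arg("30").spawn()?;
    let pid = child.id();

    send_signal("STOP", pid)?;
    println!("reached stopped state:  {}", wait_for_state(pid, 'T'));

    send_signal("CONT", pid)?;
    println!("reached sleeping state: {}", wait_for_state(pid, 'S'));

    // Clean up: kill the child and reap it so no zombie is left behind.
    child.kill()?;
    child.wait()?;
    Ok(())
}
```

The child here is sleep, so once it is continued it goes back to sleeping rather than running, because it spends its whole life waiting on a timer.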

Figure 1-c.

C vs Rust

In C, which is an official Linux kernel programming language, process creation is done by first forking a new process and then explicitly asking the system to execute a new program in the child process. If we don’t do that, both parent and child processes will keep executing the same code. Here is an example of executing the ls command, which lists the files of a given directory:

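#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void)
{
    pid_t pid;
    switch (pid = fork()) {
        case -1:
            perror("fork failed");
            exit(EXIT_FAILURE);
        case 0: {
            /* fork() returned 0, so this is the child */
            printf("I'm child process and I will execute ls command\n");
            fflush(stdout); /* flush before the process image is replaced */
            char *argv_list[] = {"ls", NULL};
            /* execvp searches PATH for "ls"; it only returns on failure */
            if (execvp("ls", argv_list) == -1) {
                perror("Error in execvp");
                exit(EXIT_FAILURE);
            }
            break;
        }
        default:
            /* fork() returned the child's PID, so this is the parent */
            printf("I'm parent process and I'll just print this\n");
            wait(NULL); /* reap the child so it does not linger as a zombie */
    }

    return 0;
}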

As you can see we have to manage the processes manually and monitor if the execution was successful. Also, we have to handle errors. If we want a command to be executed only by a child we have to manually check if the current process is a child, which is done here by case 0. In Rust, the same can be achieved with a standard library’s process module:

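use std::process::Command;

fn main() {
    // output() returns an Output (exit status, stdout, stderr), not a Child
    let output = Command::new("ls")
                .env("PATH", "/bin")
                .output()
                .expect("failed to execute process");

    // if no error, program will continue..
    println!("{}", String::from_utf8_lossy(&output.stdout));
}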

Here Command::new() is a process builder which is responsible for spawning and handling a child process. Just like in the C code, we supply the command we want to execute and, optionally, its arguments (with arg() or args()) and environment variables, and then call the output method on it. output() executes the command as a child process, waits for it to finish, and returns the collected output.
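As a small illustration of the builder with both an argument and an environment variable, the sketch below runs a one-line shell script as a child and captures what it prints. It assumes sh is available; the function name greet and the variable GREETING are invented for this example.

```rust
use std::process::Command;

/// Runs a tiny shell script as a child process and returns what it printed.
/// `GREETING` is a made-up environment variable used only for this sketch.
fn greet(name: &str) -> std::io::Result<String> {
    let output = Command::new("sh")
        .arg("-c")
        .arg(r#"echo "$GREETING, $1""#) // the script; $1 is the first argument after "sh"
        .arg("sh")                      // becomes $0 inside the script
        .arg(name)                      // becomes $1 inside the script
        .env("GREETING", "hello")       // added to the environment the child inherits
        .output()?;                     // waits for the child and captures stdout/stderr

    Ok(String::from_utf8_lossy(&output.stdout).into_owned())
}

fn main() -> std::io::Result<()> {
    print!("{}", greet("world")?); // prints "hello, world"
    Ok(())
}
```

The captured stdout is a Vec<u8>, so it has to be converted before it can be used as text; from_utf8_lossy is a forgiving choice when the child's output is not guaranteed to be valid UTF-8.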

Instead of output() we also have options to use either status() or spawn(). Each of these methods is responsible for forking a child process with subtle differences:

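- output() returns Output: runs the child, waits for it to finish, and collects its exit status, stdout and stderr.

- status() returns ExitStatus: runs the child, waits for it to finish, and returns only its exit status.

- spawn() returns Child: starts the child and returns immediately with a handle that offers wait and kill.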

Here, env() is optional: by default the child inherits the parent’s environment, including PATH, which is searched to find the ls executable. Finally, the error handling is done by expect(). It unwraps the result if it is Ok, meaning the command was started and ran to completion, and panics (panic!) with the given message if it is Err. If you don’t want your program to terminate when an Err is encountered, you can do something like this:


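use std::io;
use std::process::Command;

// custom function: reads one line from stdin and strips the trailing newline
fn get_user_input() -> String {
    let mut line = String::new();
    io::stdin().read_line(&mut line).expect("failed to read input");
    line.trim().to_string()
}

fn main() {
    let user_input = get_user_input();
    // status() returns Err when the command could not be started at all
    if let Err(_) = Command::new(&user_input)
                            .env("PATH", "/bin")
                            .status() {
        println!("{}: command not found!", user_input);
    }
    // the rest of the program...
}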

Here status() is handier. It returns Ok whenever the command could be started, whatever its exit code, and Err when it could not, for example because the user typed a command that does not exist. We are only interested in the second case, so we only check whether Err was returned. If so, we print “command not found” to the terminal and continue executing the current program instead of terminating.
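Because Ok only means the command was started, it can be useful to tell three outcomes apart: the command succeeded, it ran but exited with a failure code, or it could not be started at all. Here is one possible sketch of that distinction, assuming the true and false commands are installed; the helper run and the program name no-such-command-hopefully are made up for the example.

```rust
use std::io::ErrorKind;
use std::process::Command;

/// Runs a program and reports which of the three outcomes occurred.
fn run(program: &str, args: &[&str]) -> String {
    match Command::new(program).args(args).status() {
        // The child started and exited with code 0.
        Ok(status) if status.success() => "ran and succeeded".to_string(),
        // The child started but reported a failure (or was killed by a signal).
        Ok(status) => format!("ran but failed: {status}"),
        // The child never started because no such executable was found.
        Err(e) if e.kind() == ErrorKind::NotFound => {
            format!("{program}: command not found!")
        }
        // The child never started for another reason, such as missing permissions.
        Err(e) => format!("{program}: could not start: {e}"),
    }
}

fn main() {
    println!("{}", run("true", &[]));
    println!("{}", run("false", &[]));
    println!("{}", run("no-such-command-hopefully", &[]));
}
```

Matching on ErrorKind::NotFound avoids reporting "command not found" for unrelated failures such as a file that exists but is not executable.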

Finally, spawn() is used to manage the order of execution between a parent and several child processes. The Child it returns contains stdin, stdout and stderr fields and has wait(), kill() and id() methods, familiar to C programmers. We’ll look at this side of processes in the next part, and we’ll also see how Rust takes care of race conditions, where two or more threads access shared data and try to change it at the same time.
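As a preview, here is a minimal sketch of spawn() with piped stdin and stdout. It assumes the tr utility is installed, and the function name shout is invented for this example. The parent writes to the child's stdin, closes it, and then collects the child's output.

```rust
use std::io::Write;
use std::process::{Command, Stdio};

/// Pipes `input` through `tr a-z A-Z` running as a child process.
fn shout(input: &str) -> std::io::Result<String> {
    let mut child = Command::new("tr")
        .args(["a-z", "A-Z"])
        .stdin(Stdio::piped())  // the parent will write to the child's stdin
        .stdout(Stdio::piped()) // the parent will read the child's stdout
        .spawn()?;              // returns immediately; the child runs concurrently

    println!("child PID: {}", child.id());

    // take() moves the handle out of the Child; dropping it closes the pipe,
    // which lets `tr` see end-of-file and finish.
    let mut stdin = child.stdin.take().expect("stdin was piped");
    stdin.write_all(input.as_bytes())?;
    drop(stdin);

    // Waits for the child to exit and collects everything it wrote to stdout.
    let output = child.wait_with_output()?;
    Ok(String::from_utf8_lossy(&output.stdout).into_owned())
}

fn main() -> std::io::Result<()> {
    println!("{}", shout("hello from the parent")?);
    Ok(())
}
```

Writing everything before reading is fine for short inputs like this one. With large inputs the pipes can fill up and both sides can block, so in that case the writing is usually moved to a separate thread.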

Summary

In this introductory part, we reviewed what processes are and how they are created, and compared Rust’s implementation of process creation and command execution to C’s. We saw that the Rust code is not only less prone to human error but also more concise. In the next parts, we’ll take a look at managing process execution time and states, and handling system signals.
