What is a process
A process is a running instance of a program. A program is like an image containing a set of machine-language instructions and some data, and it is stored on disk. You can list the processes on your machine with the "ps" command. Here is an example of "ps" running on Ubuntu 20.04 under WSL.
$ ps -ef
UID        PID  PPID  C STIME TTY          TIME CMD
root         1     0  0 17:23 ?        00:00:00 /init
root         5     1  0 17:23 tty1     00:00:00 /init
watson       6     5  0 17:23 tty1     00:00:00 -bash
watson     111     6  0 19:23 tty1     00:00:00 ps -ef
As you can see, the "ps" command itself is also a process.
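Any running program is a process with its own PID and parent PID, the same values shown in the PID and PPID columns above. Here is a minimal sketch in Python, assuming Python 3.7+ on a UNIX-like system. The helper name child_parent_pid is my own and only for illustration. It starts a child interpreter and asks it for its parent's PID, which should be the PID of the script that launched it.

```python
import os
import subprocess
import sys


def child_parent_pid() -> int:
    """Start a child Python process and return the PPID that it reports."""
    result = subprocess.run(
        [sys.executable, "-c", "import os; print(os.getppid())"],
        capture_output=True,
        text=True,
        check=True,
    )
    return int(result.stdout.strip())


if __name__ == "__main__":
    print("my PID:      ", os.getpid())
    print("my PPID:     ", os.getppid())        # usually the shell that started us
    print("child's PPID:", child_parent_pid())  # expected to equal our own PID
```

The actual numbers differ on every run, because the kernel assigns PIDs as processes are created.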
Every Linux system has an "init" process whose process ID is 1. Init is the first process to run in user space, although, to be exact, process 0 has already run inside the kernel before it. Every other user process is started, directly or indirectly, from init, so it is said to be the parent of all processes.
Processes have not only a PID but also a PPID, the parent process ID. Because every process records its parent, processes can be represented as a tree. In this case, the tree is a single chain: /init (1) → /init (5) → -bash (6) → ps -ef (111).
There are two init processes because I am using WSL. WSL uses its own init process, which is different from the one on a regular Linux system. WSL needs to run a 9P server so that Windows can access the Linux file system. This server lives in the same init binary but runs as a separate process. That is why there are several init processes when using WSL.
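The PID and PPID columns are enough to rebuild the process tree. The following is a small sketch, not a replacement for tools like pstree. The rows are copied by hand from the "ps -ef" output above, and the function names build_children and render_tree are hypothetical.

```python
from collections import defaultdict

# (PID, PPID, CMD) copied from the "ps -ef" output shown earlier.
PS_ROWS = [
    (1, 0, "/init"),
    (5, 1, "/init"),
    (6, 5, "-bash"),
    (111, 6, "ps -ef"),
]


def build_children(rows):
    """Map each PPID to the list of (PID, CMD) pairs it is the parent of."""
    children = defaultdict(list)
    for pid, ppid, cmd in rows:
        children[ppid].append((pid, cmd))
    return children


def render_tree(rows, root_ppid=0):
    """Return the tree as indented lines, starting with the children of root_ppid."""
    children = build_children(rows)
    lines = []

    def walk(ppid, depth):
        for pid, cmd in sorted(children.get(ppid, [])):
            lines.append("  " * depth + f"{cmd} ({pid})")
            walk(pid, depth + 1)

    walk(root_ppid, 0)
    return lines


if __name__ == "__main__":
    print("\n".join(render_tree(PS_ROWS)))
```

Each level of indentation is one parent-to-child step, so the two /init entries appear as parent and child rather than as siblings.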
Memory Architecture
This section is an overview of a process's virtual memory and of the executable image. Different UNIX systems lay out processes differently, but most modern systems use the Executable and Linkable Format (ELF) for their executables.
Kernel space
Process structure
The kernel maintains a process structure for every process. This structure contains the information that the kernel needs to manage the process. The exact contents differ between kernels and versions, but most of them include the information below.
・Process id
・Parent process id (or pointer to parent's process structure)
・Pointer to list of children of the process
・Process priority for scheduling, statistics about CPU usage and last priority
・Process state
・Signal information (signals pending, signal mask, etc.)
・Machine state
・Timers
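To make the list above more concrete, here is a toy model of such a structure in Python. This is only an illustration under my own assumptions: the class name ProcessStructure, the field names, and the State values are hypothetical, and machine state and timers are left out. In Linux the real structure is the C struct task_struct, which is far larger.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class State(Enum):
    """A few illustrative process states (not the kernel's full set)."""
    RUNNING = "running"
    SLEEPING = "sleeping"
    STOPPED = "stopped"
    ZOMBIE = "zombie"


@dataclass(eq=False)  # compare by identity; parent/child links form cycles
class ProcessStructure:
    pid: int
    parent: Optional["ProcessStructure"] = None      # pointer to the parent's structure
    children: List["ProcessStructure"] = field(default_factory=list)
    priority: int = 0                                # scheduling priority
    cpu_ticks: int = 0                               # CPU usage statistics
    state: State = State.RUNNING
    pending_signals: set = field(default_factory=set)
    signal_mask: set = field(default_factory=set)

    @property
    def ppid(self) -> int:
        """Parent process ID, or 0 when there is no parent."""
        return self.parent.pid if self.parent else 0

    def spawn(self, pid: int) -> "ProcessStructure":
        """Create a child that inherits this process's priority."""
        child = ProcessStructure(pid=pid, parent=self, priority=self.priority)
        self.children.append(child)
        return child


if __name__ == "__main__":
    init = ProcessStructure(pid=1)
    shell = init.spawn(6)
    print(shell.pid, shell.ppid, shell.state.value)
```

Keeping both a parent pointer and a list of children lets the tree be walked in either direction.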
User structure
The user structure maintains far less information than the process structure. In Linux, it contains the memory maps and the process control block. The memory maps generally include the starting and ending addresses of the text, data, and stack segments, the various base and limit registers for the rest of the address space, and so on. The process control block contains the CPU state and virtual memory state.
Kernel stack
Each process has its own kernel stack. Functions in kernel space are carefully designed to be non-recursive, because recursive functions can use a lot of stack space. Without recursion, the maximum possible stack size can be determined by tracing the chains of function calls, so the kernel stack is allocated with a fixed size.
User space
Text segment
The text segment is the program's executable code. This segment is read-only and can be shared among processes running the same program.
Stack segment
The stack segment stores function return addresses, function arguments, local variables, and so on. If the stack grows until it meets the top of the heap, an exception is raised.
Data segment
The data segment holds initialized data and uninitialized data. Initialized data has a starting value, and its name comes from the symbol table. Uninitialized data has no starting value, so only its offset within the data segment is recorded. The data segment grows or shrinks through explicit memory requests such as the brk() system call. In C, malloc() is a library function that may use brk() underneath.
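On Linux you can look at these user-space regions for a live process in /proc/<pid>/maps, where each line has the form "start-end perms offset dev inode pathname". Below is a minimal sketch that parses such lines. The sample lines and their addresses are made up for illustration, and the function names parse_maps_line and interesting_regions are hypothetical.

```python
import os

# Illustrative lines in the /proc/<pid>/maps format (addresses are invented).
SAMPLE = [
    "00400000-00452000 r-xp 00000000 08:02 173521 /usr/bin/example",
    "00e03000-00e24000 rw-p 00000000 00:00 0 [heap]",
    "7ffd1c000000-7ffd1c021000 rw-p 00000000 00:00 0 [stack]",
]


def parse_maps_line(line: str) -> dict:
    """Split one maps line into start/end addresses, permissions, and name."""
    parts = line.split(maxsplit=5)
    start, end = (int(x, 16) for x in parts[0].split("-"))
    return {
        "start": start,
        "end": end,
        "perms": parts[1],
        "name": parts[5].strip() if len(parts) > 5 else "",
    }


def interesting_regions(lines):
    """Keep executable mappings (text) plus the heap and the stack."""
    regions = [parse_maps_line(line) for line in lines if line.strip()]
    return [
        r for r in regions
        if "x" in r["perms"] or r["name"] in ("[heap]", "[stack]")
    ]


if __name__ == "__main__":
    path = "/proc/self/maps"
    if os.path.exists(path):  # only present on Linux-like systems
        with open(path) as f:
            lines = f.readlines()
    else:
        lines = SAMPLE
    for r in interesting_regions(lines):
        size_kib = (r["end"] - r["start"]) // 1024
        print(f'{r["perms"]} {size_kib:>8} KiB  {r["name"]}')
```

In the output, the text mapping is readable and executable but not writable ("r-x"), while the heap and stack are writable, which matches the description of the segments above.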
Type of ELF
There are three main types of ELF file. Each one is produced at a different stage of compiling and linking.
1 - Object file (*.o)
This is a binary holding code and data that is linked with other object files to create an executable. It is only a part of an executable, so it cannot be executed by itself.
2 - Executable
This is a binary that can be executed. All the object files needed to run the program are linked together to generate the executable. A *.a file is a static library, which is an archive of object files. When a program needs a static library, the linker copies the required object code into the executable at link time. If you don't specify the name of the output executable, it is named a.out, which stands for "assembler output".
3 - Shared object file (*.so)
This is also known as a dynamically linked library. When a program needs it, it is linked at run time, when the program is loaded or while it executes.
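The type of an ELF file is recorded in its header, in the 2-byte e_type field at offset 16, right after the 16-byte e_ident block. The sketch below reads that field, assuming the file starts with a well-formed ELF header. The function name elf_type and the description strings are my own.

```python
import struct
import sys

# e_type values defined by the ELF format.
ELF_TYPES = {
    1: "relocatable object file (ET_REL)",
    2: "executable (ET_EXEC)",
    3: "shared object (ET_DYN)",
    4: "core dump (ET_CORE)",
}


def elf_type(header: bytes) -> str:
    """Return a description of the ELF type found in the first 18 header bytes."""
    if len(header) < 18 or header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    # e_ident[EI_DATA] (byte 5): 1 = little-endian, 2 = big-endian.
    byte_order = "<" if header[5] == 1 else ">"
    (e_type,) = struct.unpack_from(byte_order + "H", header, 16)
    return ELF_TYPES.get(e_type, f"unknown ({e_type})")


if __name__ == "__main__":
    # Inspect the running Python interpreter as an example input.
    try:
        with open(sys.executable, "rb") as f:
            print(sys.executable, "->", elf_type(f.read(18)))
    except (OSError, ValueError) as exc:
        print("could not inspect", sys.executable, "-", exc)
```

One thing to be aware of: position-independent executables (PIE), which many current toolchains produce by default, are also reported as ET_DYN, so this field alone does not always separate an executable from a *.so file.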