In the previous post, the concept of Linux capabilities was introduced. In this post, I will be exploring the capability sets and capability bits in a bit more detail. This is a prelude to future posts that will examine the practical use cases of capabilities in systemd, dockerd and fork/execve.
Capabilities are properties of threads, so they have thread-level granularity rather than process-level granularity. Program files can also carry capabilities, and this will be explored separately in more depth when we investigate the execve use cases. A thread has the following capability sets:
The effective set is the set of capabilities that the kernel checks when a thread attempts a privileged operation.
The inheritable set contains capabilities that are preserved across an execve system call. They are added to the permitted set of the new program only when the program file has the same capabilities in its own inheritable set. This will be discussed in more detail in a future post dealing with the fork/clone and execve system calls.
The permitted set serves as a limiting superset for the effective set. A capability that is not set in the permitted set cannot be enabled in the effective set, and a thread can only gain it through an execve in which:
- The program file's permitted capability set contains the capability.
- The program that the thread is executing is set-user-ID-root.
The permitted set also limits the capabilities that a thread can add to its inheritable set if the CAP_SETPCAP capability is not present in the thread's effective capability set.
The ambient set is useful when a non-privileged thread needs its capabilities preserved across an execve system call. Capabilities in the ambient set are carried over an execve of a program that is not privileged.
The bounding set is a limiting superset for the capabilities that a thread can add to its inheritable set. It is also a limiting factor for the permitted set, because during execve it is ANDed with the program file's permitted set before the result goes into the thread's new permitted set.
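The way the sets limit each other can be made concrete with a small model. The sketch below follows the execve transformation rules documented in the capabilities(7) manual page, with each set held as an integer bitmask. The class and function names and the sample values are assumptions made for illustration, and the special handling for root and for set-user-ID or set-group-ID programs is deliberately left out.

```python
from dataclasses import dataclass


@dataclass
class ThreadCaps:
    inheritable: int = 0
    permitted: int = 0
    effective: int = 0
    bounding: int = 0
    ambient: int = 0


@dataclass
class FileCaps:
    inheritable: int = 0
    permitted: int = 0
    effective: bool = False  # the file effective "set" is a single flag


def caps_after_execve(p: ThreadCaps, f: FileCaps) -> ThreadCaps:
    """Model the capability sets of a non-root thread after execve."""
    # Simplification: a file counts as privileged here only if it carries
    # capabilities (set-user-ID and set-group-ID bits are ignored).
    file_is_privileged = bool(f.inheritable or f.permitted or f.effective)
    ambient = 0 if file_is_privileged else p.ambient
    permitted = (p.inheritable & f.inheritable) | (f.permitted & p.bounding) | ambient
    effective = permitted if f.effective else ambient
    # The inheritable and bounding sets pass through execve unchanged.
    return ThreadCaps(p.inheritable, permitted, effective, p.bounding, ambient)


CAP_NET_BIND_SERVICE = 1 << 10
CAP_NET_RAW = 1 << 13

# Hypothetical thread whose bounding set only allows CAP_NET_BIND_SERVICE.
thread = ThreadCaps(bounding=CAP_NET_BIND_SERVICE)
# Hypothetical program file that asks for two capabilities.
program = FileCaps(permitted=CAP_NET_BIND_SERVICE | CAP_NET_RAW, effective=True)

after = caps_after_execve(thread, program)
print(hex(after.permitted))  # 0x400: CAP_NET_RAW was masked off by the bounding set
```

In this model the bounding set removes CAP_NET_RAW from the file's permitted set, which is the ANDing behaviour described above.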
The capability sets attached to a thread or a process can be read from the /proc/pid/status file, where pid is the process or task ID. For example, to see the capabilities of the current shell process, we can read /proc/$$/status.
The $$ is a special bash parameter that expands to the process ID of the current shell, so the command echo $$ prints the current process ID.
The file /proc/pid/status contains a lot more information about the process under observation. The screen-dump below is a grep of just the capabilities section of the output for my current shell process.
    boye@hp7940m1:~/Documents/dev/capabilities_show$ echo $$
    13575
    boye@hp7940m1:~/Documents/dev/capabilities_show$ grep Cap /proc/13575/status
    CapInh: 0000000000000000
    CapPrm: 0000000000000000
    CapEff: 0000000000000000
    CapBnd: 0000003fffffffff
    CapAmb: 0000000000000000
    boye@hp7940m1:~/Documents/dev/capabilities_show$
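The same fields can be picked out programmatically. The following is a minimal Python sketch that parses the Cap lines of a status file into integers. The function name parse_cap_sets is an assumption, and the sample text simply mirrors the values from the screen-dump so that the block runs on any machine.

```python
# Sample text copied from the screen-dump; a real status file has many more lines.
SAMPLE_STATUS = """\
Pid:\t13575
CapInh:\t0000000000000000
CapPrm:\t0000000000000000
CapEff:\t0000000000000000
CapBnd:\t0000003fffffffff
CapAmb:\t0000000000000000
"""


def parse_cap_sets(status_text: str) -> dict:
    """Return a mapping of capability set name to its integer bitmask."""
    caps = {}
    for line in status_text.splitlines():
        key, _, value = line.partition(":")
        if key.startswith("Cap"):
            caps[key] = int(value.strip(), 16)  # the fields are hexadecimal
    return caps


for name, mask in parse_cap_sets(SAMPLE_STATUS).items():
    print(f"{name}: {mask:#x}")
```

On a Linux system, passing the contents of /proc/self/status to the same function should give the sets of the running Python process instead of the sample values.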
The capability sets introduced earlier can be seen in the output above. They are in hexadecimal form, with each character representing a nibble (four bits). The individual capabilities are bit positions in the 64-bit value shown for each capability set. Setting the bit at a position (1) enables the respective capability, while clearing it (0) disables the capability for that capability set.
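Testing, setting and clearing a single capability is plain bit arithmetic. The sketch below shows this in Python. The three bit numbers are the values defined in linux/capability.h, while the helper names has_cap, with_cap and without_cap are assumptions chosen for this example.

```python
# Bit positions as defined in linux/capability.h.
CAP_CHOWN = 0
CAP_NET_BIND_SERVICE = 10
CAP_SYS_ADMIN = 21


def has_cap(mask: int, cap: int) -> bool:
    """True if the bit for capability number `cap` is set in `mask`."""
    return bool(mask & (1 << cap))


def with_cap(mask: int, cap: int) -> int:
    """Return `mask` with the capability bit set."""
    return mask | (1 << cap)


def without_cap(mask: int, cap: int) -> int:
    """Return `mask` with the capability bit cleared."""
    return mask & ~(1 << cap)


bounding = 0x0000003fffffffff  # CapBnd from the output above
effective = 0x0000000000000000  # CapEff from the output above

print(has_cap(bounding, CAP_SYS_ADMIN))   # True
print(has_cap(effective, CAP_SYS_ADMIN))  # False
print(hex(with_cap(effective, CAP_NET_BIND_SERVICE)))  # 0x400
print(hex(without_cap(bounding, CAP_CHOWN)))           # 0x3ffffffffe
```

These helpers only manipulate numbers. Changing the capabilities of a live thread requires system calls such as capset or prctl, which are outside the scope of this sketch.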
The arrangement of the capabilities in the 64-bit data structure is defined in the header file /usr/include/linux/capability.h. The content of this file is determined by the kernel version, so you will find that different kernel versions can have varying levels of support for capabilities. The bit positions are numbered from 0 up to the latest one supported by the kernel. To check the latest capability supported on a system, read the file /proc/sys/kernel/cap_last_cap.
The output from my system is shown below. It does not have the CAP_BPF and CAP_PERFMON capabilities, which were introduced in kernel version 5.8.
    boye@hp7940m1:~/Documents/dev/capabilities_show$ uname -a
    Linux hp7940m1 5.4.0-80-generic #90-Ubuntu SMP Fri Jul 9 22:49:44 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
    boye@hp7940m1:~/Documents/dev/capabilities_show$ cat /proc/sys/kernel/cap_last_cap
    37
    boye@hp7940m1:~/Documents/dev/capabilities_show$
The cap_last_cap value of 37 means that the kernel supports bit positions 0 to 37, so 38 capabilities are supported. This can be seen in the capability bounding set (CapBnd) of the current shell process.
The value shown there, 0000003fffffffff, contains nine f characters (1111 in binary) and one 3 character (11 in binary). That gives 36 (9 x 4) + 2 = 38 bit positions set, which is all the capabilities supported on the system.
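The arithmetic can be checked with a few lines of Python. This sketch builds the mask with every bit from 0 to cap_last_cap set and counts the bits back again. The function names full_mask and count_caps are assumptions, and 37 is the value read from my system.

```python
def full_mask(last_cap: int) -> int:
    """Bitmask with bits 0..last_cap set, i.e. every supported capability."""
    return (1 << (last_cap + 1)) - 1


def count_caps(mask: int) -> int:
    """Number of capability bits set in the mask."""
    return bin(mask).count("1")


mask = full_mask(37)
print(f"{mask:016x}")    # 0000003fffffffff, the same as CapBnd
print(count_caps(mask))  # 38
```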
To see a human readable translation of the hexadecimal representation, you can use the capsh utility.
    boye@hp7940m1:~/Documents/dev/capabilities_show$ capsh --decode=0x0000003fffffffff
    0x0000003fffffffff=cap_chown,cap_dac_override,cap_dac_read_search,cap_fowner,cap_fsetid,cap_kill,cap_setgid,cap_setuid,cap_setpcap,cap_linux_immutable,cap_net_bind_service,cap_net_broadcast,cap_net_admin,cap_net_raw,cap_ipc_lock,cap_ipc_owner,cap_sys_module,cap_sys_rawio,cap_sys_chroot,cap_sys_ptrace,cap_sys_pacct,cap_sys_admin,cap_sys_boot,cap_sys_nice,cap_sys_resource,cap_sys_time,cap_sys_tty_config,cap_mknod,cap_lease,cap_audit_write,cap_audit_control,cap_setfcap,cap_mac_override,cap_mac_admin,cap_syslog,cap_wake_alarm,cap_block_suspend,cap_audit_read
    boye@hp7940m1:~/Documents/dev/capabilities_show$
The output above shows the capabilities enabled in the capability bounding set for the current shell process.
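To make the mapping from bits to names explicit, here is a rough Python equivalent of the decoding step. It is only a sketch and not how capsh is implemented. The name table is copied from the capsh output above in bit order (0 to 37), and printing the bare bit number for an unknown position is my own choice for the example.

```python
# Capability names in bit order, taken from the capsh output above.
CAP_NAMES = [
    "cap_chown", "cap_dac_override", "cap_dac_read_search", "cap_fowner",
    "cap_fsetid", "cap_kill", "cap_setgid", "cap_setuid", "cap_setpcap",
    "cap_linux_immutable", "cap_net_bind_service", "cap_net_broadcast",
    "cap_net_admin", "cap_net_raw", "cap_ipc_lock", "cap_ipc_owner",
    "cap_sys_module", "cap_sys_rawio", "cap_sys_chroot", "cap_sys_ptrace",
    "cap_sys_pacct", "cap_sys_admin", "cap_sys_boot", "cap_sys_nice",
    "cap_sys_resource", "cap_sys_time", "cap_sys_tty_config", "cap_mknod",
    "cap_lease", "cap_audit_write", "cap_audit_control", "cap_setfcap",
    "cap_mac_override", "cap_mac_admin", "cap_syslog", "cap_wake_alarm",
    "cap_block_suspend", "cap_audit_read",
]


def decode_caps(mask: int) -> list:
    """Translate a capability bitmask into a list of capability names."""
    names = []
    for bit in range(mask.bit_length()):
        if mask & (1 << bit):
            # Bits beyond the table are reported by number (an assumption).
            names.append(CAP_NAMES[bit] if bit < len(CAP_NAMES) else str(bit))
    return names


print(decode_caps(0x400))  # ['cap_net_bind_service']
print(",".join(decode_caps(0x0000003fffffffff)))
```

The second print should produce the same comma-separated list that capsh reported for the bounding set.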
With the foregoing, we have enough background to see the practical applications of capabilities. We will start that examination with systemd in the next post.