If you use Linux, you probably run various commands every day. You might use the term "command name" to refer to them, but its meaning varies with context. This article explains what the Linux kernel considers a command name.
First, I will present a brief conclusion, followed by a detailed explanation, and finally, I will describe the motivation for this investigation and the subsequent research process.
From the Linux kernel's perspective, the command name is the first 15 bytes of the basename of the executable file name (the file name without the directory part).
It is stored as a NULL-terminated string in a 16-byte field called comm within a structure called task_struct, which exists in the kernel's memory for each process (more precisely, for each kernel-level thread).
This lets the kernel identify processes cheaply and more readably than with a pid alone.
This command name is used in kernel logs and by commands such as pgrep from the procps package. Longer command names are truncated due to the 15-byte limit mentioned above.
Software versions used for the investigation
- Linux kernel: v5.15
- Procps: 3.3.17
The motivation for investigating what was mentioned in the "TL;DR" section was that the pgrep command I used in my custom program did not work correctly. The pgrep command interprets its argument as a regular expression and lists the pids of running processes whose names match it. For example, the following runs a script called "foo.sh" that sleeps forever and then uses pgrep to display its pid.
    $ cat foo.sh
    #!/bin/bash
    sleep infinity
    $ ./foo.sh &
    1086408
    $ pgrep "foo\.sh"
    1086408
However, when I tried the same thing with a script called "foo-bar-baz-hoge-huga.sh" that does exactly the same thing as "foo.sh", pgrep did not display anything.
    $ cat foo-bar-baz-hoge-huga.sh
    #!/bin/bash
    sleep infinity
    $ ./foo-bar-baz-hoge-huga.sh &
    1086868
    $ pgrep "foo-bar-baz-hoge-huga\.sh"
    $
I thought this was odd, so I looked at man pgrep and found the following description.
The process name used for matching is limited to the 15 characters present in the output of /proc/pid/stat.
In fact, when I looked at the /proc/pid/stat file for "foo-bar-baz-hoge-huga.sh", I got the following output.
    $ cat /proc/601235/stat
    601235 (foo-bar-baz-hog) S 593786 601235 593786 34817 601419 4194304 224 0 0 0 0 0 0 0 20 0 1 0 5735606 8617984 900 18446744073709551615 94266299658240 94266300571405 140732967030208 0 0 0 65536 4 65538 1 0 0 17 1 0 0 0 0 0 94266300816048 94266300864080 94266304847872 140732967036675 140732967036712 140732967036712 140732967038941 0
The string displayed inside the parentheses in the second field, which shows the command name, did indeed match only the first 15 characters of the script name, not the entire name.
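As a rough illustration of reading that second field from a program, here is a minimal Python sketch that extracts the text between the parentheses. The function name parse_stat_comm is hypothetical, and the sample string is only the beginning of the output shown above. The sketch assumes that the name may itself contain spaces or parentheses, so it splits on the first "(" and the last ")" rather than on whitespace.

```python
def parse_stat_comm(stat_line: str) -> str:
    """Return the command name from the second field of a /proc/<pid>/stat line.

    Hypothetical helper for illustration only.
    """
    # The name may contain spaces or ")" characters, so locate the first "("
    # and the last ")" instead of splitting the line on whitespace.
    start = stat_line.index("(")
    end = stat_line.rindex(")")
    return stat_line[start + 1:end]


# Beginning of the sample output shown above (remaining fields omitted).
sample = "601235 (foo-bar-baz-hog) S 593786 601235 593786 34817"
print(parse_stat_comm(sample))       # foo-bar-baz-hog
print(len(parse_stat_comm(sample)))  # 15
```

On a Linux machine, the same function could be applied to the contents of a real /proc/pid/stat file read with open().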
Although I understood the specification itself and realized that my usage of pgrep was incorrect, I decided to verify where this 15-character limit came from.
The files under the /proc/ directory are provided by a file system called procfs. Unlike file systems such as XFS that manage data on disk, procfs exists for users to obtain kernel information and modify the kernel state through files. We will not go into the details of procfs here.
First, let's check the specifications of the /proc/pid/stat file. The specifications of files under procfs are described in man procfs. The following is an excerpt of the relevant part:
Status information about the process. This is used by ps(1). It is defined in the kernel source file fs/proc/array.c.
(2) comm %s
The filename of the executable, in parentheses. Strings longer than TASK_COMM_LEN (16) characters (including the terminating null byte) are silently truncated. This is visible whether or not the executable is swapped out.
We can see that the second field of the /proc/pid/stat file contains the name of the executable file in parentheses, and that anything beyond 16 bytes, including the terminating NULL byte, is truncated. Subtracting 1 byte for the NULL byte from 16 bytes gives 15 bytes, which matches the limit described in the pgrep man page.
Next, I looked at the kernel source to see where this string is actually output and where the data is stored. The procfs manual states that the /proc/pid/stat file is defined in the fs/proc/array.c file in the kernel source, so I looked at this file first.
The relevant code seems to be in the following part of the
When the seq_puts() function is called, it outputs the specified string to a file. In the code above, lines 562 and 564 output "(" and ")", so we can infer that the proc_task_name() function on line 563 outputs the command name to the file.
Before looking at the contents of proc_task_name(), I decided to first check whether the do_task_stat() function is actually called when the /proc/pid/stat file is read. I traced the callers of the do_task_stat() function and found that it is called from two functions, proc_tid_stat() and proc_tgid_stat().
In the kernel, tid refers to the thread ID, and tgid refers to the thread group ID, which is what user space calls the process ID (pid), so we can guess that the proc_tgid_stat() function is probably the caller. procfs also exposes the state of individual threads under the /proc/pid/task directory, so the proc_tid_stat() function is probably the handler for the per-thread stat files there.
Tracing further back from these functions, I found that the fs/proc/base.c file registers the handlers that are called when users read and write files in procfs. There, the proc_tgid_stat() function is registered to be called when accessing the /proc/tgid/stat file, or in other words, the /proc/pid/stat file.
In summary, I found the following:
- The user reads the /proc/pid/stat file
- The proc_tgid_stat() function is called
- The do_task_stat() function is called
- The proc_task_name() function is called to output the command name to the file
The implementation of the proc_task_name() function looks like this:
I will omit the details, but when the process indicated by the pid is a regular program, the evaluation result of the if statement on line 103 is false. This evaluation result is true only in the case of special processes created within the kernel.
Furthermore, since the escape argument of the proc_task_name() function is true when called via the proc_tgid_stat() function, the evaluation result of the if statement on line 108 is true. Therefore, we can see that, on line 109 of the proc_task_name() function, the data obtained by the __get_task_comm() function (probably a NULL-terminated string) is used as the output for the /proc/pid/stat file. The seq_escape_str() function on line 109 escapes special characters and spaces, but I will not explain the details here, as they are not important for this article.
Now, let's look at the contents of the __get_task_comm() function.
We can see that the value of tsk->comm, or more precisely, the value of the comm field of a structure named task_struct, is the source of the command name information. The task_struct structure exists for each thread. Let's take a look at the definition of the task_struct structure.
We can see that the comm field is an array of char with a length of 16. The procfs manual also mentioned that the length of TASK_COMM_LEN is 16 bytes.
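To make the byte layout concrete, the following Python sketch models a 16-byte char array with ctypes and copies a name into it while always leaving room for the terminating NULL byte. This is only a user-space illustration under stated assumptions: store_comm is a hypothetical name, and the copy logic is a simplification, not the kernel's actual copy routine.

```python
import ctypes

TASK_COMM_LEN = 16  # value quoted from the procfs manual


def store_comm(name: bytes) -> ctypes.Array:
    """Copy name into a fixed 16-byte buffer, always keeping a NULL terminator.

    Hypothetical helper; it mimics the size limit only, not kernel code.
    """
    buf = ctypes.create_string_buffer(TASK_COMM_LEN)  # 16 zero-filled bytes
    # At most 15 bytes of payload fit, because the last byte stays NULL.
    buf.value = name[:TASK_COMM_LEN - 1]
    return buf


buf = store_comm(b"foo-bar-baz-hoge-huga.sh")
print(buf.value)     # b'foo-bar-baz-hog'
print(len(buf.raw))  # 16
```

The raw buffer is always 16 bytes long, while the visible string is at most 15 bytes, which is where the 15-character limit comes from.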
Confirming where the value of task_struct->comm is set
The __set_task_comm() function sets the value of task_struct->comm.
The caller of the __set_task_comm() function runs when the execve() system call is invoked. execve() does not create a new process; it replaces the program running in the calling process with a new one.
The bprm->filename contains the name of the executable file corresponding to the process as a NULL-terminated string. Here, we can see that the name of the executable file is processed using the kbasename() function and then saved in task_struct->comm. The kbasename() function, similar to the basename() function in the C library, returns a string with the directory part of the file name removed. Therefore, if the executable file name is "./foo.sh", "foo.sh" will be stored in task_struct->comm, and if it's "./foo-bar-baz-hoge-huga.sh", "foo-bar-baz-hog" will be stored. Finally, I understood the definition of the "command name" in the /proc/pid/stat file, or, in other words, as referred to by the Linux kernel.
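The two steps described here (take the basename, then keep at most 15 bytes) can be sketched in user space as follows. This is a hedged Python approximation, not kernel code: kernel_style_comm is a hypothetical name, and posixpath.basename stands in for kbasename() on the assumption that both simply drop everything up to the last "/".

```python
import posixpath

TASK_COMM_LEN = 16  # 15 bytes of name plus the terminating NULL byte


def kernel_style_comm(filename: str) -> str:
    """Approximate the command name the kernel would keep for filename.

    Hypothetical helper for illustration; not the kernel implementation.
    """
    # Step 1: drop the directory part, as kbasename() is described to do.
    base = posixpath.basename(filename).encode()
    # Step 2: keep at most 15 bytes (the limit is in bytes, not characters).
    return base[:TASK_COMM_LEN - 1].decode(errors="replace")


print(kernel_style_comm("./foo.sh"))                    # foo.sh
print(kernel_style_comm("./foo-bar-baz-hoge-huga.sh"))  # foo-bar-baz-hog
```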
Lastly, by reading the procps source code, I found that the string pgrep matches against is, as described in the man page, at most the first 15 characters of the second field of the /proc/pid/stat file, excluding the "(" and ")". I will omit the details, since there is nothing particularly interesting going on.
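The effect on matching can be reproduced with a small simulation. The Python sketch below is not how procps is implemented; matches_like_pgrep is a hypothetical helper that truncates the name to 15 bytes and then searches it with Python's re module, which is assumed to behave like pgrep's regular expressions for simple patterns such as these.

```python
import re


def matches_like_pgrep(pattern: str, executable_name: str) -> bool:
    """Simulate matching a regex against the truncated command name.

    Hypothetical helper; it only models the 15-byte limit on the name.
    """
    visible = executable_name.encode()[:15].decode(errors="replace")
    return re.search(pattern, visible) is not None


long_name = "foo-bar-baz-hoge-huga.sh"
print(matches_like_pgrep(r"foo\.sh", "foo.sh"))                    # True
print(matches_like_pgrep(r"foo-bar-baz-hoge-huga\.sh", long_name))  # False
print(matches_like_pgrep(r"foo-bar-baz-hog", long_name))            # True
```

The second call fails for the same reason the pgrep invocation in this article printed nothing: the pattern is longer than the 15 bytes that are available to match.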
We now understand that the command name, as referred to by the Linux kernel, is the first 15 bytes of the basename of the executable file. However, why is it processed with the basename, and why is it truncated to a maximum of 15 bytes? The reasons are probably as follows:
To identify processes through kernel logs and other means, it is convenient to have easily accessible information in the form of a string, separate from the process ID (pid). The name of the executable file can be used for this purpose. However, storing the full executable file name in the task_struct structure may consume a large amount of kernel memory and could potentially create a security vulnerability if a malicious user executed a program with an excessively long file name. Therefore, storing the entire file name is not feasible.
One might think that it would be sufficient to look at the value of the executable file name stored in the process memory. However, this is not necessarily true. When accessing the process memory from the kernel, if the relevant memory might be swapped out, it is necessary to swap it back in before reading, which can be cumbersome. Moreover, this approach cannot be used in situations where the system is running out of memory, for instance, when the kernel needs to log the lack of memory. It is not possible to increase memory usage when there is already a shortage.
The reason for using the basename, such as "foo.sh" instead of the file name or full path specified at runtime like "./foo.sh", is likely due to the decision that the basename still provides sufficiently high visibility. In most cases, the basename is enough to recognize and identify the process without using the full path.
In this article, I described why the command name specification in the Linux kernel is the way it is. I also wrote about the process of answering a small question that came up while using a computer by reading source code, so that readers can relive the experience of source code reading. Neither of these provides immediately useful knowledge, but I hope they can serve as tidbits of information.