Another Monday, another blog post!
I’ve been diving into the curl codebase over the past couple of blog posts, but something else has spiked my interest today, so I figured I might as well dig into it while the curiosity is still hot.
To be honest, I’m reluctant to dive into this because the last time I wrote a blog post about a Unix-related topic the — let’s call them the group of individuals with too much time on their hands and a lot of petty on their hearts — took me to task on some of the substance in the blog post in a way that wasn’t too nice. And by “wasn’t too nice” I mean hella racist and sexist. In any case, I figure there will always be haters (and people with unhealthy attachments to operating systems and harassing strangers on the Internet), so I might as well carry on.
OK. Enough blabber. I’ve been working through a backlog of issues on the Zarf app. As such, I’ve been spending a lot of time on the command line. The backlog involved deleting a lot of code (insert satisfied sigh here) and sometimes this involved deleting entire files of source code (insert doubly satisfied sigh here). This got me wondering: what’s going on when you run rm
on the command line. There’s a couple of variants of the rm
command that I commonly run.
$ rm settings.json
$ rm -rf config/
Anyways, I wanted to dive into what is going on under the hood with rm
, so I decided to start by determining the syscalls invoked by the rm
command.
captainsafia@eniac ~/zarf> sudo dtruss /tmp/rm History.md
dtrace: system integrity protection is on, some features will not be available
SYSCALL(args) = return
open("/dev/dtracehelper\0", 0x2, 0xFFFFFFFFE9A3EB10) = 3 0
ioctl(0x3, 0x80086804, 0x7FFEE9A3EA70) = 0 0
close(0x3) = 0 0
access("/AppleInternal/XBS/.isChrooted\0", 0x0, 0x0) = -1 Err#2
thread_selfid(0x0, 0x0, 0x0) = 3765033 0
bsdthread_register(0x7FFF56790BEC, 0x7FFF56790BDC, 0x2000) = 1073742047 0
issetugid(0x0, 0x0, 0x0) = 0 0
mprotect(0x1061CA000, 0x1000, 0x0) = 0 0
mprotect(0x1061CF000, 0x1000, 0x0) = 0 0
mprotect(0x1061D0000, 0x1000, 0x0) = 0 0
mprotect(0x1061D5000, 0x1000, 0x0) = 0 0
mprotect(0x1061C8000, 0x88, 0x1) = 0 0
mprotect(0x1061D6000, 0x1000, 0x1) = 0 0
mprotect(0x1061C8000, 0x88, 0x3) = 0 0
mprotect(0x1061C8000, 0x88, 0x1) = 0 0
getpid(0x0, 0x0, 0x0) = 77384 0
stat64("/AppleInternal/XBS/.isChrooted\0", 0x7FFEE9A3E148, 0x0) = -1 Err#2
stat64("/AppleInternal\0", 0x7FFEE9A3E1E0, 0x0) = -1 Err#2
csops(0x12E48, 0x7, 0x7FFEE9A3DC80) = 0 0
dtrace: error on enabled probe ID 2190 (ID 557: syscall::sysctl:return): invalid kernel access in action #10 at DIF offset 28
csops(0x12E48, 0x7, 0x7FFEE9A3D570) = 0 0
geteuid(0x0, 0x0, 0x0) = 0 0
ioctl(0x0, 0x4004667A, 0x7FFEE9A3F954) = 0 0
lstat64("History.md\0", 0x7FFEE9A3F8F8, 0x0) = 0 0
access("History.md\0", 0x2, 0x0) = 0 0
unlink("History.md\0", 0x0, 0x0) = 0 0
Sidebar: Usually, you determine the syscalls utilized by a command by using strace
. I’m on a Mac so strace
isn’t available. Instead, I used a tool called dtrace
. To allow it to process the rm
command, I had to make a copy of the executable into a temporary directory and execute that. All this to say, this is why I’m executing dtrace /tmp/rm
above instead of strace rm
.
So, anyway, let’s look into what’s going on above. The first couple of lines in the trace seem to be pretty clearly related to setting up the sudo
part of the command. I was intrigued by the calls to the mprotect
command. I figured that it might be something related to memory addresses because the first parameter passed to the mprotect
function looks like a memory address. I decided to head over to Google to see if I could find the documentation for this function and confirm this. I find the documentation here and my suspicions were confirmed. The function is responsible for setting the access rights on memory for the calling process. The function declaration looks like this int mprotect(void *addr, size_t len, int prot)
where addr
is the start of the memory range, and len
is the length of range of memory addresses that will be changed, and prot
is an integer that represents how the memory should be protected. I looked into what the different values for prot
that were passed into the mprotect
function calls above and figured out the following.
-
0x0
is the code forPROT_NONE
meaning that the memory cannot be accessed for writes or reads. -
0x1
is the code forPROT_READ
meaning that the memory can be read. -
0x3
is a bitwise OR of the values forPROT_READ
andPROT_WRITE
which means that it allows that memory to be both read or written.
I wasn’t sure what the memory addresses that were referenced in the mprotect
call actually corresponded to or what the best way to figure it would be.
The getpid
command has a pretty self-explanatory name, but I wondered what the parameters that were passed to the function were. As it turns out, the getpid
function supposedly takes no parameters, so the inclusion of the parameters in the call above perplexed me.
I was also unsure of what the csops
and ioctl
calls did. As it turns out, this is actually pretty warranted. Some investigation revealed that csops
is a system call that is unique to the Apple operating system and can be used to check the signature that is written into a memory page by the operating system. It seems to be some way of checking the validity of a particular chunk of code. I’m not too sure about it, and there isn’t a ton of information about Apple-specific syscalls so I can’t dig into the details of this as well as I’d like.
The ioctl
syscall is a pretty versatile one and is responsible for all input/output related interactions. According to the manpage, the first argument represents a file descriptor, the second is a reference to a “device-dependent request code”, and the third is a pointer to memory. I dug around to figure out what the request code “0x4004667A” and realized that it was pretty commonly associated with invocations of the dtrace
command so I figured that this invocation was not related to the task of removing the file referenced.
I was interested in the last syscall invoked in this strace, unlink
. I headed back to Google to learn some more about it and came across this manpage. The unlink
command is the one that is actually responsible for removing the file.
All in all, the most rm
-related parts of the trace are in the last few lines. Everything else seems to be setup associated with setting up memory permissions, ensuring the status of the related files, and setting up the dtrace
command.
So there’s that! I’ll admit that I’m not sure how I feel about this trace approach to antropology. It’s a little too direct, I almost like wallowing in the code and getting last in the complexity of it. There is something therapeutic about it. I’ll see if I get more comfortable with this technique in future code reads. Until then, see you next time!
Top comments (0)