Today I wanted to check what the installed
wc program was actually doing (a peer mentioned some bizarre behavior in the number of lines in a file, as reported by
wc -l). The manual specifies that this is a count of the newline characters in the file.
Sometimes, when people say "newlines" or "lines" they mean slightly different things, so I wanted to verify this was in fact only counting the occurrences of the byte 0x0a ('\n').
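To make the distinction concrete, here is a small shell sketch (the scratch file names are made up and exist only for this demo). It writes the same two lines of text twice, once with a terminating newline and once without, and asks wc -l about each. If wc -l counts 0x0a bytes rather than lines of text, the unterminated file should report one fewer.

```shell
#!/bin/sh
# Scratch directory and file names are invented for this demo.
tmpdir=$(mktemp -d)

printf 'one\ntwo\n' > "$tmpdir/terminated.txt"    # two 0x0a bytes
printf 'one\ntwo'   > "$tmpdir/unterminated.txt"  # one 0x0a byte

# Read from stdin so wc prints only the number; tr strips the padding
# that some wc implementations add.
terminated=$(wc -l < "$tmpdir/terminated.txt" | tr -d ' ')
unterminated=$(wc -l < "$tmpdir/unterminated.txt" | tr -d ' ')

echo "terminated:   $terminated"
echo "unterminated: $unterminated"

rm -rf "$tmpdir"
```

The first file reports 2 and the second reports 1, even though a person reading either file would say it has two lines.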
So I downloaded the source. I use Debian, and remembered you could call
apt-get source $package or
apt source $package to get both the upstream (original) version and any vendor-provided patches. Helpfully, I already knew that
wc is from a GNU package named
coreutils, so I ran what seemed like a reasonable command (it worked the first time):
sudo apt source coreutils
This printed a fair amount of information. The final warning confused me, but the earlier parts were expected. apt fetched the package description (dsc: a PGP-signed message from Michael Stone, a Debian maintainer, that includes the package list, dependencies, and checksums for the files), the original upstream package from GNU (tar), the PGP signature for the original upstream (asc), and the patches applied by the Debian maintainers that are not present in the upstream package (diff):
Reading package lists... Done
Need to get 5,584 kB of source archives.
Get:1 http://deb.debian.org/debian bullseye/main coreutils 8.32-4 (dsc) [2,096 B]
Get:2 http://deb.debian.org/debian bullseye/main coreutils 8.32-4 (tar) [5,548 kB]
Get:3 http://deb.debian.org/debian bullseye/main coreutils 8.32-4 (asc) [833 B]
Get:4 http://deb.debian.org/debian bullseye/main coreutils 8.32-4 (diff) [33.0 kB]
Fetched 5,584 kB in 2s (2,910 kB/s)
dpkg-source: info: extracting coreutils in coreutils-8.32
dpkg-source: info: unpacking coreutils_8.32.orig.tar.xz
dpkg-source: info: unpacking coreutils_8.32-4.debian.tar.xz
dpkg-source: info: using patch list from debian/patches/series
dpkg-source: info: applying 61_whoips.patch
dpkg-source: info: applying 63_dd-appenderrors.patch
dpkg-source: info: applying 72_id_checkngroups.patch
dpkg-source: info: applying 85_timer_settime.patch
dpkg-source: info: applying 99_kfbsd_fstat_patch.patch
dpkg-source: info: applying restore-ls-behavior-8.31.patch
W: Download is performed unsandboxed as root as file 'coreutils_8.32-4.dsc' couldn't be accessed by user '_apt'. - pkgAcquire::Run (13: Permission denied)
I didn't understand why the sandbox warning showed up, and I didn't know where the files went. I assumed /usr/src/ was a good place for system sources, but saw nothing there (this is also where the kernel sources are expected, and on a BSD system most of the sources for installed packages might have gone there).
I didn't find much help in the man page for apt source (which is moderately hard to search for, because an unrelated meaning of source is "which servers or disks should apt check for packages when installing").
Ultimately, the sandbox warning was explained when I found that these four files had been downloaded into the current working directory (my home in this case), and that the intent was for the files to end up with my user's permissions. Calling apt with
sudo in this case didn't help: the commands ran as root, and, as the warning says, the unprivileged _apt user could not access the downloaded file there, so the download could not be sandboxed and the files ended up owned by root.
What I learned:
- call apt source in the directory where you want the source tree to reside.
- don't call apt source as root if you want to access the sources as a user.
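Those two lessons can be folded into a small helper. This is only a sketch: fetch_source is a name I made up, and the default destination of ~/src/<package> is an arbitrary choice. It also assumes that deb-src entries are enabled in the apt sources, which apt source needs in order to find anything.

```shell
#!/bin/sh
# fetch_source PACKAGE [DIRECTORY]
# Hypothetical helper: download and unpack a Debian source package into a
# directory of its own, as the invoking user (no sudo).
fetch_source() {
    pkg=$1
    dest=${2:-"$HOME/src/$pkg"}   # default location is an arbitrary choice

    mkdir -p "$dest" || return 1

    # apt source writes into the current directory, so change into the
    # destination first. The subshell leaves the caller's directory unchanged.
    ( cd "$dest" && apt source "$pkg" )
}
```

Running fetch_source coreutils as a regular user should then leave the downloaded files and the unpacked tree under ~/src/coreutils, owned by that user.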
Oh, and the initial question about wc? It compares each byte directly against the
\n character, as documented and as shown in the source. Also, wc is way more complicated than I imagined it would be (because of multibyte characters in one case, and an optimization that uses CPU vector extensions when present in another), but the main logic looks like it only increments the lines counter when it encounters a newline character (which is what I set out to confirm):
case '\n': lines++;
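As a cross-check from outside the source, wc -l should agree with a plain count of 0x0a bytes. The sketch below builds a deliberately awkward sample (a multibyte character, a CRLF ending, a bare carriage return, and an unterminated last line, all invented for this check) and counts the newline bytes a second way, by deleting every other byte with tr and measuring what is left.

```shell
#!/bin/sh
# Sample input invented for this check:
#   "café" (é as two UTF-8 bytes) + CRLF, a bare CR, one LF, no final newline.
sample=$(mktemp)
printf 'caf\303\251\r\nold\rstyle\nlast' > "$sample"

# What wc reports as "lines".
wc_lines=$(wc -l < "$sample" | tr -d ' ')

# Independent count: keep only 0x0a bytes, then count the bytes that remain.
lf_bytes=$(tr -cd '\n' < "$sample" | wc -c | tr -d ' ')

echo "wc -l:      $wc_lines"
echo "0x0a bytes: $lf_bytes"

rm -f "$sample"
```

Both numbers come out as 2: the bare carriage return and the unterminated last line add nothing, which is consistent with the case '\n' branch being the only place the counter moves.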