Why Many Unix Structs Have Prefixes

#c #unix

One thing I’ve long been curious about is why the fields of many Unix C structs share a common abbreviated prefix. For example, for the sockaddr_in struct:

struct sockaddr_in {
    short           sin_family;
    unsigned short  sin_port;
    struct in_addr  sin_addr;
};

all fields are prefixed by sin_, an abbreviation for sockaddr_in. There are several other examples as well, e.g., struct stat, whose fields are prefixed by st_.

I originally thought it was perhaps just a style quirk of the original Unix authors. I tried searching for things like “early C style,” but never found anything.

I’ve also thought that perhaps they named the fields that way to allow the use of short pointer names like p. (We are after all talking about guys who named commands cp, rm, etc.) For example, given:

p->st_mode   // You know 'p' is a stat*.
q->mode      // You have no idea what this is.

The st_ prefix lets you know the type of p just by looking at the expression, whereas you’d have to find the declaration of q to know what it means. Non-prefixed fields would require putting a mnemonic in the pointer name itself:

pstat->mode  // Clear, but more keystrokes.

Prefixes also make fields easily grepable. Both rationales seem reasonable, right?

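However, I finally stumbled across the real answer: early C compilers kept all struct member names in a single, global symbol table, so a field name could not be safely reused across different structs, and programmers added prefixes to struct fields to avoid collisions. 😮 Once C compilers gave each struct its own member namespace, this style largely faded away.†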
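To make the collision concrete, here is a minimal sketch. The structs file_info and tty_info and their helper functions are hypothetical names invented for illustration; they come from no real header. Both structs declare a member called mode, but at different offsets. As I understand it, an early compiler that mapped each member name to one global offset could not have accepted this, whereas any modern C compiler resolves p->mode from the type of p:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical structs for illustration only; not from any real header. */
struct file_info {
    unsigned mode;   /* first member, so it sits at offset 0 */
    long     size;
};

struct tty_info {
    int      speed;
    unsigned mode;   /* same name as above, but at a nonzero offset */
};

/* Each struct has its own member namespace in modern C, so the
 * compiler picks the right 'mode' from the pointer's type. */
unsigned file_mode(const struct file_info *p) { return p->mode; }
unsigned tty_mode(const struct tty_info *p)   { return p->mode; }

/* The two 'mode' members really do live at different offsets, which is
 * exactly what a single global table of member names could not express. */
void check_mode_offsets(void) {
    assert(offsetof(struct file_info, mode) == 0);
    assert(offsetof(struct tty_info, mode) != offsetof(struct file_info, mode));
}
```

With prefixes, the same two members would have been spelled something like fi_mode and ti_mode (again, made-up names), which sidesteps the clash without any help from the compiler.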

† Though I've since learned that Solaris’ internal style guide still recommends this style to this day, even in new code:

“Systematic prefix conventions for ... structure or union member names can be useful.”

But it doesn't elaborate as to why, specifically.

Top comments (2)

Nikola Stojaković • Jan 8 '22

This is very interesting. Who knows how many things we take for granted today are actually relics of the past. One of the most popular is the 80-character limit per line, which is in fact still pretty useful today (even though displays are much bigger now): it makes reading code much easier, and you can set two editors side by side.

Paul J. Lucas • Jan 8 '22

I also still limit my lines to 80 characters. Over the years, I've also gradually reduced my spaces-per-indent. Waaaay back, I started out at parity (8 spaces, or a hard tab, per indent); at some point, I reduced that (4 spaces per indent); and finally again maybe a decade ago (2 spaces per indent). The obvious advantage is fewer lines that have to be wrapped so they still fit ≤ 80 characters. Some might think 2 is too few, but it's still quite readable, IMHO.