DEV Community

Max Pixel
Extrinsic Role Headers: Standardizing Member Order for Faster and Deeper Comprehension

"Organize your class like shown in this link. You might think it's a hassle at first, but when you get used to it, it's rly nice :)"

Reading this was one of my proudest moments as a software developer. Not top 5, but maybe top 10. It was a code-review comment, posted by one of my teammates at the time, on a pull-request from a new hire. It was heart-warming, and relieving, to learn that the unconventional standards I had imposed on my team ("this link") were not only appreciated, but enjoyed so much that my colleague practiced them as a matter of preference, and even went on to spread the good word and pitch them to others.

Today, I'd like to share that particular set of standards with you, too. It's called Extrinsic Role Headers, and it's a methodology for organizing and labeling class members that focuses on improving comprehension speed while also reinforcing clean code practices. If I had to summarize it in one sentence, I would say: Extrinsic Role Headers answers the question "am I going to break any preconditions by interacting with this member in a particular way from a particular place?" to the extent that a class-member organization methodology can do so.

Backstory

When I was less than a year into learning C#, I remember coming across an article about the "proper" order in which to organize members within a class. I don't remember the exact one at this point, or all of the rules, but it definitely called for putting constants before fields, and fields before functions, for example. As a novice programmer, it's exciting to come across rote best-practices like this. It gives you something easy to do to feel in control of your code, and to feel more professional and in-the-know. Those "benefits" are, of course, the siren song of software cargo-cults, but there is legitimate utility to standardizing member order. In addition to the many innate benefits of style cohesion within a team, standardized member order solves the following problems, among others:

  • A lack of consistent organization means either spending more time re-reading files from top to bottom, or missing details.
  • Using a different methodology over time, or a different methodology from others on your team, eventually results in nearly the same situation as not using any methodology at all.
  • Using thoughtful consideration without the help of a rigid framework takes more time, slowing you down every time you add a member to a class.

Later, though, I began to question the wisdom of the specific methodology I had first encountered. At first, I just wanted to keep backing-fields near their properties. It felt more natural, and was legitimately more convenient:

class Foo
{
    public int Bar => _bar;
    private int _bar;

    public int Baz => _baz;
    private int _baz;
}

Working on games in Unity, in particular, led me to begin bending the rules even further, as game developers (as a product of culture, affinity, and systemic pressures) tend to be relatively bad about adhering to the single-responsibility principle (accepting even that it's not a steadfast principle, as "single" and "responsibility" leave a lot of room for interpretation). Organizing the class into its constituent topics, like "health" and "attacks", was really convenient! And while doing this was incompatible with the rote methodology I had originally read about, it wasn't beyond standardization. In some cases, I remember sticking to constants-fields-functions order at the macro level, and then grouping by topic within each of these, and further by accessibility within each topic. Some of my earliest professional work as a game developer had three tiers of section headers, utilizing a range of underline styles from ###### to ===== to -----.

At some point, I began to realize that my categorization approach was a code-smell. And as I got better at using design patterns and downsizing classes, the need for topical grouping dissipated. But application-specific topics weren't the only headers I had picked up at that point. For example, I had been filing serialized configuration and transient state under different headers. Most Unity and Unreal developers group serialized fields together by implicit convention; I was just making it explicit. These headers weren't so smelly, so I kept them.

As I continued to refine my approach to organizing class members across many projects, both personal and professional, it evolved from a volatile habit to a thoroughly considered methodology. A few years ago, while making a concerted effort to resolve some inconsistencies in my headers, and to figure out how to reconcile differences across the various languages and frameworks that I used it in, I discovered a particular core principle I could use to ground my class organization methodology in a way that made it cohesive and novelly useful. You'll find this principle explained in the below "Methodology" section.

Extrinsic Role Headers (that's what I've decided to call it) is at a point now where I feel there's little left to refine, apart from minor project or team-specific tweaks. I'm no longer questioning and revising it whenever I start a new codebase, so it's time to document it publicly and call it version 1.0.

Interestingly, just a few days after finishing the first draft of this article, while performing a completely unrelated search for a completely unrelated purpose, I happened across a software agency's style guide for Objective-C. It happens to exhibit some similar ideas, and even contains some identical terminology. Clearly, others are having the same idea: there's something else we should be self-documenting in our code.

Methodology

Principles

As a class organization methodology, Extrinsic Role Headers is designed to be predictable and easy to follow, to provide consistency no matter who is contributing to the code. This means that it must be complete and discrete: There must be no ambiguity between categories, and the categories must account for all possible class members that anyone would ever conceive of (except for unquestionably bad decisions). However, it is designed to be these things for an experienced developer - for novices, it may be a bit challenging at first, but will force them to begin thinking in ways that make them better at programming.

It applies to any programming language that has classes. Perhaps it could also apply to languages like C and Go - I just haven't tried that yet. In fact, a guiding principle for avoiding ambiguity is that porting your code from one language to another should never cause any class member to require re-categorization. If it ever does seem to require as much, this implies that you're thinking about either category at the wrong level of abstraction, involving technicalities where you should be thinking about interface contract instead.

Beyond "complete and discrete", which I think is a baseline requirement for any class organization methodology (like the one I started with), Extrinsic Role Headers defines and differentiates itself by striving to communicate the relationship/contract (role) between the member and its class and "the outside world" (extrinsic), which are otherwise not apparent except as can be gleaned from subtle patterns or entire function implementations both within and beyond the host class.

In other words, it focuses on answering the questions that are asked most often while debugging and implementing, and yet are among the most time-consuming to answer: what is modifying this value, what is calling this function, how are data and execution expected to flow, am I going to break any preconditions by interacting with this member in a particular way from a particular place?

Headers

I've been saying "Extrinsic Role Headers" all this time, not "Extrinsic Role Grouping" or "Extrinsic Role Order". Each section must have a header, and there is a strict enumeration of header names available. Headers help with comprehension speed, by concisely expressing important, nuanced, standardized concepts. They help with spatial navigation by being visually distinct.

Headers are comments with a visually distinct style. The exact style doesn't matter, except that you should be consistent about which style you're using throughout any given project (and, ideally, throughout any given organization). This is the one I've been using lately:

/* Section Name *\
\****************/

Here's another one I've used before:

// Section Name //
// ============ //

Alternatively, you can use structs in place of headers (at least, for fields - not for functions):

class Foo
{
    public struct SectionName
    {
        public int Bar;
        public string Baz;
    }
    private readonly SectionName _sectionName;
}

Whether the struct approach is "better" than comment-based headers depends on the situation (language, framework, application architecture, personal temperament), and the pros/cons vary by section (I've used this approach for Parameters only).

The order of field-headers and of function-headers is intended to flow from least to most *innate*, starting with the class' relationship to the outside world, ending with the class' hidden implementation details. This ordering is applied independently to fields and functions, with fields coming first, partly to avoid obfuscating the class' data layout, and partly because trying to compare the "innateness" of a field and a function is too much of an apples-to-oranges endeavor.

That said, the division of groups is ultimately the most important part of the methodology. The order prescribed here is less important. The order prescribed in your own team's documentation is important as far as consistency is important. For that consistency to stick, you had better write down solid reasoning for any difference in order. My reasoning is this: The more extrinsic (less innate) a member is, the more relevant it will be to colleagues who did not write the class but must understand and use it.

Here is the enumeration of headers:

Configuration

This header communicates: The source of truth exists outside of the program runtime, and therefore these fields are constant throughout the instance's lifetime.

In game engines like Unity and Unreal, it's where you put your [SerializeField] and UPROPERTY(EditAnywhere, Config) fields. More broadly, it's where you put compile-time constants, design-time constants, and configuration-time constants. In all cases, these fields are to be loaded only once from disk or network. This loading is done "auto-magically" by the framework, unless there are technical or practical reasons you just had to settle for doing it explicitly.

Parameters

This header communicates: The world outside the class specifies these when creating an instance, and they remain unchanged throughout the instance's lifetime.

More often than not, Parameters come exclusively through the constructor. In some frameworks (my background in game engines is showing again here), custom constructors are simply not an option, so parameters must come through a conventional "initializer" member function.
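As a rough sketch of both situations, here is a constructor-supplied Parameter next to an initializer-supplied one. The `Projectile` and `SpawnedProjectile` types are hypothetical, and filing `Initialize` under Lifecycle is an assumption made for this example rather than a rule stated above.

```csharp
#nullable enable
using System;

// Hypothetical types for illustration; not taken from any real framework.

// Case 1: a constructor is available, so the Parameter can be readonly.
public sealed class Projectile
{
    /* Parameters *\
    \**************/
    private readonly string _ownerName;

    /* Lifecycle *\
    \*************/
    public Projectile(string ownerName)
    {
        _ownerName = ownerName ?? throw new ArgumentNullException(nameof(ownerName));
    }

    /* Queries *\
    \***********/
    public string OwnerName => _ownerName;
}

// Case 2: the framework constructs the object itself, so the Parameter
// arrives through an initializer that enforces "assigned exactly once".
public sealed class SpawnedProjectile
{
    /* Parameters *\
    \**************/
    private string? _ownerName; // not readonly, but written only by Initialize

    /* Lifecycle *\
    \*************/
    public void Initialize(string ownerName)
    {
        if (_ownerName != null)
            throw new InvalidOperationException("Initialize may only be called once.");
        _ownerName = ownerName ?? throw new ArgumentNullException(nameof(ownerName));
    }

    /* Queries *\
    \***********/
    public string OwnerName =>
        _ownerName ?? throw new InvalidOperationException("Initialize has not been called yet.");
}
```

In the second case the compiler can no longer enforce immutability, so the header and the one-shot guard carry that contract instead.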

Events

This header communicates: The outside world binds additional workloads to these activities, and those workloads may change over time.

Whether native to the language or framework, imported from a library, or home-made, "Events" covers any field that serves as means by which other classes can "subscribe" to changes or other occurrences within the host class, thereby extending it with additional side-effects.

Composition

This header communicates: These objects are owned by this class, their lifetimes are constrained by its lifetime.

This section is where "composition over inheritance" manifests. It's the filled-in diamonds in your UML diagram. The UML comparison is particularly useful: the difference between Composition-section object-references and State-section object-references is analogous to the difference between composite & aggregate association connectors.

Some frameworks use "Component" to refer to a specific feature, which is why "Composition" was chosen over "Components" (you may have components that are not Components).

If components are spawned & despawned throughout the object's lifetime, they still belong in this section. The useful distinction here isn't "objects that have exactly the same lifetime", but rather, "objects that are created and destroyed exclusively by this one".

Handles

This header communicates: This is not the value itself, but grants control over the value, and the value's lifetime is coupled to both the value's host and the handle's host.

"Handle" refers to an object that allows modification of another object's data. A well-known example is the return value of setTimeout in JavaScript. They are often serial numbers, or other pointer-like constructs.

Handles represent temporary contracts for proxied/delegated ownership of some data. In this way, unlike other design patterns (queue, state machine, etc.), they are fundamental and inescapable within the context of "extrinsic role".
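To make the idea concrete, here is a minimal sketch loosely modeled on the setTimeout example. `Scheduler` and `Tooltip` are hypothetical classes invented for illustration. The integer returned by `Schedule` is the handle: it is not the callback itself, only the means to cancel it.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical scheduler, loosely analogous to setTimeout/clearTimeout.
public sealed class Scheduler
{
    /* State *\
    \*********/
    private readonly Dictionary<int, Action> _pending = new Dictionary<int, Action>();
    private int _nextId = 1;

    /* Queries *\
    \***********/
    public int PendingCount => _pending.Count;

    /* Methods *\
    \***********/
    public int Schedule(Action callback)
    {
        int id = _nextId++;
        _pending.Add(id, callback);
        return id; // the handle
    }

    public bool Cancel(int handle) => _pending.Remove(handle);

    public void RunAll()
    {
        // Copy first, so callbacks may schedule new work while we iterate.
        var due = new List<Action>(_pending.Values);
        _pending.Clear();
        foreach (var callback in due) callback();
    }
}

public sealed class Tooltip : IDisposable
{
    /* Parameters *\
    \**************/
    private readonly Scheduler _scheduler;

    /* Handles *\
    \***********/
    private int? _showTimer; // null when nothing is scheduled

    /* State *\
    \*********/
    private bool _visible;

    /* Lifecycle *\
    \*************/
    public Tooltip(Scheduler scheduler) { _scheduler = scheduler; }

    public void Dispose() => CancelPendingShow();

    /* Queries *\
    \***********/
    public bool IsVisible => _visible;

    /* Methods *\
    \***********/
    public void HoverBegin()
    {
        CancelPendingShow();
        _showTimer = _scheduler.Schedule(OnShowTimerElapsed);
    }

    public void HoverEnd()
    {
        CancelPendingShow();
        _visible = false;
    }

    /* Event Response *\
    \******************/
    private void OnShowTimerElapsed()
    {
        _showTimer = null; // the scheduler has already dropped the entry
        _visible = true;
    }

    /* Subroutines *\
    \***************/
    private void CancelPendingShow()
    {
        if (_showTimer.HasValue)
        {
            _scheduler.Cancel(_showTimer.Value);
            _showTimer = null;
        }
    }
}
```

Seeing `_showTimer` under Handles tells a reader that it must be released or cleared on every path that ends the contract, which is why both `Dispose` and the event response touch it.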

Cache

This header communicates: This information is derived or copied from an external source of truth - we could get away with removing this field, but having it provides optimization or convenience benefits.

Placing a variable in this section implicates it as needing invalidation.
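Here is one way that invalidation obligation might look in practice. This is a sketch under assumptions: `Settings` and `Greeter` are hypothetical, and the `Changed` event stands in for whatever change notification the external source of truth actually offers.

```csharp
#nullable enable
using System;

// Hypothetical external source of truth.
public sealed class Settings
{
    /* Events *\
    \**********/
    public event Action? Changed;

    /* State *\
    \*********/
    private string _language = "en";

    /* Queries *\
    \***********/
    public string Language => _language;

    /* Methods *\
    \***********/
    public void SetLanguage(string language)
    {
        _language = language;
        Changed?.Invoke();
    }
}

public sealed class Greeter : IDisposable
{
    /* Parameters *\
    \**************/
    private readonly Settings _settings;

    /* Cache *\
    \*********/
    private string? _greeting; // derived from _settings.Language; null means stale

    /* Lifecycle *\
    \*************/
    public Greeter(Settings settings)
    {
        _settings = settings;
        _settings.Changed += OnSettingsChanged;
    }

    public void Dispose() => _settings.Changed -= OnSettingsChanged;

    /* Queries *\
    \***********/
    public string Greeting => _greeting ??= BuildGreeting();

    /* Event Response *\
    \******************/
    private void OnSettingsChanged() => _greeting = null; // invalidate, rebuild lazily

    /* Subroutines *\
    \***************/
    private string BuildGreeting() => _settings.Language == "fr" ? "Bonjour" : "Hello";
}
```

Deleting `_greeting` and calling `BuildGreeting()` on every read would leave behavior unchanged, which is the test for whether a field belongs under Cache rather than State.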

State

This header communicates: This information is proprietary to this object and coupled to its lifetime, and may change over time.

When a field doesn't belong in any other section, the remaining conclusion is that its value over time is a product of its enclosing class' methods and/or event-responses. Placing a variable in this section implicates its involvement in serialization and undo history, if either is applicable to the class.

State and Composition are more similar than any other two categories. Ultimately, though, they are worth segregating as a way to communicate whether the field's type provides a layer of abstraction (State) or separation of concerns (Composition). The distinction in UML between Composite Association and Class Attribute is a useful analogue.

Lifecycle

This header communicates: These functions are called by the application's framework, and should never be called by application-specific code.

This section is for the constructor, destructor (or finalizer), and any other functions that are not supposed to be called during the object's lifetime. In a very simple application, there wouldn't be anything here apart from constructor and destructor. When using an application framework, however, you'll almost always have functions, either implemented from abstract base-classes or called "auto-magically" through reflection, that are used to coordinate transitions between distinct phases that imply how your object is supposed to behave or what it can expect from injections and annotations (loaded, serialized, shutting down). These functions are Methods or Subroutines from the framework's perspective, but from your perspective, they are significantly different: while it may technically be possible to compile code that invokes them, doing so will always result in runtime errors or illegal behavior.

Queries

This header communicates: This is the read-only portion of the class' interface.

This is home for all functions that return information without causing side-effects. In C++, all query functions would be const.

Methods

This header communicates: These are the means by which the outside world can imperatively control this object.

Some languages officially define "method" as simply meaning "member function", and some definitions go on to include static functions as also being methods. The Extrinsic Role Headers methodology embraces the stricter and therefore more useful definition, which at the time of writing is reflected on the Wikipedia entry for the term: they specify "how the object may be used", and are parameterized "by a user". In essence, this means any public member function that doesn't qualify as a Query or Lifecycle function.

Methods implicate reentrancy and parameter validation.
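Both concerns can be sketched in a few lines. `Mailbox` is a hypothetical class, and the queue-and-flag guard shown here is just one of several ways to handle reentrancy. Because a Method can be called by anyone, including a subscriber that is in the middle of handling this object's own event, it validates its input and tolerates being re-entered.

```csharp
#nullable enable
using System;
using System.Collections.Generic;

// Hypothetical class for illustration.
public sealed class Mailbox
{
    /* Events *\
    \**********/
    public event Action<string>? MessageDelivered;

    /* State *\
    \*********/
    private readonly Queue<string> _pending = new Queue<string>();
    private bool _delivering;

    /* Queries *\
    \***********/
    public int PendingCount => _pending.Count;

    /* Methods *\
    \***********/
    public void Send(string message)
    {
        // Parameter validation: the caller is outside our control.
        if (string.IsNullOrEmpty(message))
            throw new ArgumentException("Message must not be empty.", nameof(message));

        _pending.Enqueue(message);

        // Reentrancy: a subscriber called Send from inside MessageDelivered.
        // The outer call's loop will deliver the message we just queued.
        if (_delivering) return;

        _delivering = true;
        try
        {
            while (_pending.Count > 0)
                MessageDelivered?.Invoke(_pending.Dequeue());
        }
        finally
        {
            _delivering = false;
        }
    }
}
```

A Subroutine would not need either check, since its only callers are the class's own functions and their preconditions are already known.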

Event Response

This header communicates: These functions are bound to events on, or are passed as delegates to, other objects.

Like Methods, these functions are invoked from outside your class. But unlike Methods, your class controls which objects can call these functions, and retains the ability to grant and revoke access. Unlike Subroutines, you will never call these directly, and unlike Lifecycle functions, their semantics are application-specific.

Subroutines

This header communicates: These functions are only ever called by this class and its subclasses.

Subroutines are your primary means of de-duplication and aesthetic compartmentalization. This is how you accomplish "don't mix levels of abstraction in a single function". They are protected and private member functions that are never bound directly to events or delegates.

Inner-Types

This header communicates: This is where nested classes and structs go.

They aren't fields. They aren't functions. They need to go somewhere, if they exist at all. So, here's where they go.

In C++, it is sometimes necessary to define inner-types before certain other members. So, in C++ projects, it's best to standardize that this section always goes first, rather than last.

class FancyCollection
{
    /* Inner-Types *\
    \***************/
    public class MutableHandle : IDisposable
    {
        // ...
    }
}

Cheat Sheet

/* Configuration *\
\*****************/
[LoadFromIni]
private ulong _speedLimit;

/* Parameters *\
\**************/
private readonly Name _suppliedViaConstructor;

/* Events *\
\**********/
public event Action QueueExhausted;

/* Composition *\
\***************/
private readonly NetworkSocket _owned = new();

/* Handles *\
\***********/
private CancellationTokenSource? _cancelsExternalCoroutine;

/* Cache *\
\*********/
private TimerSubsystem _globallyAccessibleButExpensiveToFind;

/* State *\
\*********/
[Serializable]
private Queue<Message> _pending;

private ulong _bitsPerSecond;

/* Queries *\
\***********/
public ulong BitsPerSecond => _bitsPerSecond;

public int GetPendingCount() => _pending.Count;

/* Lifecycle *\
\*************/
public Eponymous(Name name) { ... }
public void Dispose()
{
    _owned.Dispose();
}

/* Methods *\
\***********/
public SendOperation Send(Message message) { ... }

/* Event Response *\
\******************/
private void OnTimerIntervalElapsed(ulong ticks) { ... }

/* Subroutines *\
\***************/
private void WriteHeader() { ... }

/* Inner-Types *\
\***************/
public class SendOperation { ... }

FAQ

Q: How does this facilitate "clean" code practices?

A: Here are a few representative examples: being forced to choose between Query and Method should cause a developer who is about to write a function that both answers a question and modifies state to think twice and reconsider their approach. The distinction between Method, Event Response, and Subroutine reminds me of buggy code I've reviewed where "On" was haphazardly used to prefix functions of all three flows.
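The first of those examples might play out as follows. This is a sketch with a hypothetical `TicketCounter` class. The commented-out function fits neither header, and splitting it gives one member that is safe to call freely and one that visibly changes state.

```csharp
// Hypothetical class for illustration.
public sealed class TicketCounter
{
    /* State *\
    \*********/
    private int _next = 1;

    // Before: one function that both answers a question and changes state.
    // It belongs under neither Queries nor Methods, which is the cue to
    // reconsider it.
    //
    //     public int GetNextTicket() => _next++;

    /* Queries *\
    \***********/
    public int NextTicket => _next; // no side-effects, call it as often as you like

    /* Methods *\
    \***********/
    public void IssueTicket() => _next++; // the only way outside code advances the counter
}
```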


Q: This is anything but clean! Classes should be small enough that you can see them all at once on a single page with so few members that anything beyond rote, mechanical ordering should be sufficient.

A: The main problem with that idea is that it assumes a direct, uncomplicated, causal relationship between organizational granularity and both comprehension and quality. The other problem is that it assumes going beyond "sufficient" is somehow a waste of effort.

Even with classes that do fit on a single screen, even when there are only two or three sections, Extrinsic Role Headers still bring something to the table. They are self-documenting. Yes, you can see that a group of functions are public, and that another group is prefixed with the word "On", and derive meaning from that without much effort, but utilizing just a few extra lines of text can reduce that effort and, more importantly, create an environment in which new and novice team members will be led to contribute better-organized code with less oversight.

Extrinsic Role Headers are even more compelling when classes are large or complex, yes. Sometimes, maybe not in your particular corner of the industry, "large" and "complex" classes are actually the ideal solution and not just a concession of technical debt. That topic ("when is big better") could take an entire book to fully interrogate. For now, it should suffice to say: Unreal Engine, the Linux kernel, and many other world-class feats of human ingenuity prove that complexity worth building sometimes calls for classes ranging from 5 to 5,000 lines long. If they were all under 30 or so lines each, you'd have formless code soup. Many software developers deal only in projects that are so small in scope that they never need "big" classes, which makes it hard for them to imagine that it's ever a good idea, but it's important to recognize that the virtue of such dogmas varies significantly by context.


Q: Why isn't there a section for Dependencies? Aren't those different from Parameters?

A: In software engineering discourse, "parameters" can imply immutable and structured information (strings, numbers, etc.), where "dependencies" implies objects that supply a service. But this distinction exists at a separate level of abstraction from the one in which Extrinsic Role Headers segregates field categories: As much as Foo may be a dependency, "which specific Foo instance should I use" is a parameter in the same way that "which file path should I write to" is a parameter. Dependencies can change over time, as well. Putting both constant and variable dependencies in the same section would muddle the distinction between Parameters and State.


Q: Where would you put functions that compute their return value rather than returning a State as-is?

A: These functions fit the given definition of Query. Any and every member function that is "pure" (whether through a strict language mechanism like const or by implied contract) can confidently be placed under the Query header. Whether that data is a simple copy or a complex derivation, is irrelevant to the fact that it's a Query. This is why the term "Query" was chosen instead of "Getter".


Q: Where would you put coroutines?

A: Functions that are async or IEnumerable or similar should be treated as if they were synchronous. Technically, yes, the compiler is splitting it up into several functions, some of which will be called by the framework. But the presence of control inversion alone isn't enough to justify a special header: consider that pauses, control hand-offs, and continuations occur mid-function even in functions that aren't coroutines, when the kernel gives a different process time on the core. Language-level coroutines exist at a significantly higher level of abstraction, manifesting in your syntax with keywords like await or yield, but remember that the scope of Extrinsic Role Headers is to consistently organize member declarations - intra-function syntax and generated functions are out of its purview. If you were to port your code to a language that doesn't support syntax-native coroutines, you would surely need to introduce additional "Event Response" functions - but in that case, the entry function would no longer be a coroutine and therefore would unambiguously match one of Method, Query, Event Response, or Subroutine. Following the rule that language-porting shouldn't cause any recategorizations, then, provides sufficient guidance on how to categorize coroutines.


Q: Where would you put Properties with a getter that can modify internal state or GetX functions that can modify internal state? These are methods, but they're often named like queries and treated as queries even in very professional code bases.

A: Field mutation in a "getter" is usually an anti-practice, one that this methodology hopes to discourage, but truly professional code bases will indeed do it for actually ideal reasons. This is why I chose to define the Queries header in terms of "side-effects" rather than only citing field mutation. The important level of abstraction for this consideration is interface contract, not CPU instructions and memory spans. It's the same thought process behind the appropriateness of C++'s mutable keyword, which denotes fields that can be modified from within const functions. Incrementing & decrementing a concurrent read counter for thread safety is the only example I can think of that clearly justifies this sort of code. Cached results may also be appropriate, depending on your temperament (e.g. lazy-loading getters). In these cases, you would still categorize the function as a Query despite the fact that it technically modifies a field or two. Thread safety counters and caches are invisible from an extrinsic point of view (resulting latency differences don't count as "visible" - they're indistinguishable from OS-level performance snags).
