In the C++ programming language, the keyword auto may be used in place of an explicit type in a declaration with an initializer, such as auto x = 5; wherein auto resolves to int at compile time. There's much to be said about auto, and there will probably always be strong disagreements regarding when and why it should or shouldn't be used. This post includes my general guidelines for auto, but has been written primarily for the purpose of discussing auto's relationship with the pointer declarator, *.
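As a minimal sketch of that deduction, the snippet below checks at compile time which type auto resolves to. The function name DeducedIntExample is purely illustrative, and the sketch assumes a C++17 compiler (for std::is_same_v).

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical helper for illustration only: shows what auto deduces
// from two different initializers.
inline int DeducedIntExample() {
    auto x = 5;    // integer literal: auto resolves to int
    static_assert(std::is_same_v<decltype(x), int>, "x should be int");

    auto y = 5.0;  // floating-point literal: auto resolves to double
    static_assert(std::is_same_v<decltype(y), double>, "y should be double");
    (void)y;       // silence unused-variable warnings

    assert(x == 5);
    return x;
}
```

The static_assert lines cost nothing at run time; they only document, and enforce, what the compiler deduced.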
If you're not yet familiar with auto and the arguments that surround it, check out these links:
- https://docs.microsoft.com/en-us/cpp/cpp/auto-cpp?view=msvc-160
- https://herbsutter.com/2013/08/12/gotw-94-solution-aaa-style-almost-always-auto/
- http://www.randomprogramming.com/2014/10/auto/
Nuance in auto Consideration
I was introduced to programming through BASIC, ECMAScript (ActionScript & JavaScript), and later PHP, all of which lacked support for explicitly typed declarations entirely[1]. Although they use different keywords and sigils (let, var, $), they're effectively all-auto all the time. When I got into C#, then, it was more natural to use var wherever possible; this appeared to be the norm in most of the examples that I found, and even seemed to be recommended by my IDE[2]. The next major language I learned was C++, using Scott Meyers' Effective Modern C++, which suggests "prefer auto to explicit type declarations", so I continued my preference toward implicitly typed declarations.
Nowadays, I do most of my work in Unreal Engine, a source-available game engine in which auto is scarce. For much of the work that I do in Unreal Engine, particularly modifications to the engine itself, it's best that I conform as closely as possible to Epic's own code style instead of following my own preferences. Naturally, this has required me to write a lot of explicit type declarations.
Eventually, I started to appreciate this approach, and now prefer the explicit style more often in my own code. It's particularly useful when using a code-review system like Swarm that, unlike an IDE, doesn't provide an easy way of discovering the type that is implied by auto. The extra information that type declarations provide often tells a meaningful story about data transformations, which may be less apparent from reading the corresponding function calls and other statements.
// type declarations alone can communicate a story
database_connection ... = ...
collection<thing> ... = ...
thing& ... = ...
rating ... = ...
bool ... = ...
Thinking about the debate between auto and explicit, I'm reminded of this famous remark from Torvalds[3]: "I'm a huge proponent of designing your code around the data, rather than the other way around... Bad programmers worry about the code. Good programmers worry about data structures and their relationships."
I haven't swung entirely to the "never use auto" camp, however. The matter is too nuanced to reasonably conclude anything less than a multifaceted and situationally informed set of guidelines. Those guidelines are likely to be different for a render pipeline than for a container library. In any case, if you're working in C++, chances are you're working on system architecture to some extent (rather than simple procedural logic), and if you're working on system architecture, you're likely to run into situations that even the most auto-averse can't deny are good use cases for the keyword:
// UE4 style:
TSharedRef<TMap<TEntity<EntityType>::FToken, TTimestamped<EntityType>::FData>> HistoryByToken = MakeShared<TMap<TEntity<EntityType>::FToken, TTimestamped<EntityType>::FData>>();
// stdlib style:
shared_ptr<map<entity<entity_type>::token, timestamped<entity<entity_type>::data>>> history_by_token = make_shared<map<entity<entity_type>::token, timestamped<entity<entity_type>::data>>>();
That's an eyesore, despite being a perfectly reasonable structure to declare. I think we can all agree to prefer
auto HistoryByToken = MakeShared<TMap<TEntity<EntityType>::FToken, TTimestamped<EntityType>::FData>>();
or, if you've managed to upgrade from an 800x600 terminal to a ≥ 24" monitor in the past two decades, you might even prefer
auto HistoryByToken = MakeShared
<
    TMap
    <
        TEntity<EntityType>::FToken,
        TTimestamped<EntityType>::FData
    >
>();
So, if you're writing C++, you can count on using auto at least occasionally. Where you ought to use it depends on the type of programming that you're doing. Chances are, there will be at least one scenario where you use it for a pointer type.
auto and Meta-Typing
The Case
I tend to use auto whenever the exact type that it resolves to is explicit and prominent elsewhere in the statement. This usually implies using casts or factories:
const auto DamageProfile = FDamageProfile::Factory(DeltaSeconds, Velocity, Shape);
const auto Enemy = Cast<AEnemy>(OverlappedActor);
auto EnemyDamageEvent = MakeUnique<FDamageEvent>(Enemy, DamageProfile);
However, I've recently noticed a few shortcomings in the above approach's ability to communicate important details. In fact, in addition to presenting a clarity problem, this style also introduces mechanical problems: using auto in the way that I have on the second line effectively prohibits me from achieving the full extent of potential const-ness.
Can you spot the problem?
Here's how I would prefer to write that same code now:
const auto DamageProfile = FDamageProfile::Factory(DeltaSeconds, Velocity, Shape);
const auto* const Enemy = Cast<AEnemy>(OverlappedActor);
/*TUniquePtr*/auto EnemyDamageEvent = MakeUnique<FDamageEvent>(Enemy, DamageProfile);
This new preference facilitates two distinct goals:
Meta-Type Clarity
In C++, it's often very important to know when you're dealing with a local object, a reference, or a pointer (the type of the type, or "meta-type", if you will). auto has the ability to obscure this important information.
While it's possible to conclude that Enemy in the above example is a pointer from knowing what Cast does, other examples can be far less obvious. Consider:
auto Related = GetRelated<FConnection>(Primary);
Is Related an FConnection, an FConnection*, or a TSharedPtr<FConnection>? Concluding that Enemy or Related is a pointer relies on off-screen information, whereas the fact that they represent AEnemy and FConnection respectively can be determined entirely from on-screen information[4].
This additional clarity, gained by including *, is further enhanced by the fact that our code editors nowadays will display auto, *, Enemy, and AEnemy in different colors, helping us to visually scan for types and meta-types.
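To make the meta-type point concrete, here is a rough sketch in standard C++ rather than UE4. The FConnection struct and the three Get... functions are hypothetical stand-ins, not real engine APIs, and std::shared_ptr stands in for TSharedPtr. It assumes C++17.

```cpp
#include <cassert>
#include <memory>
#include <type_traits>

struct FConnection { int Id = 0; };  // hypothetical stand-in type

// Three hypothetical lookups that differ only in their meta-type.
inline FConnection GetByValue() { return FConnection{1}; }
inline FConnection* GetByPointer() { static FConnection C{2}; return &C; }
inline std::shared_ptr<FConnection> GetShared() {
    return std::make_shared<FConnection>(FConnection{3});
}

inline int MetaTypeDemo() {
    auto  A = GetByValue();    // FConnection: a local copy
    auto  B = GetByPointer();  // FConnection*: the * is hidden inside auto
    auto* C = GetByPointer();  // FConnection*: the * is visible in the declaration
    auto  D = GetShared();     // std::shared_ptr<FConnection>

    static_assert(std::is_same_v<decltype(A), FConnection>);
    static_assert(std::is_same_v<decltype(B), FConnection*>);
    static_assert(std::is_same_v<decltype(C), FConnection*>);
    static_assert(std::is_same_v<decltype(D), std::shared_ptr<FConnection>>);

    // auto* E = GetByValue();  // would not compile: the initializer is not a pointer
    // auto* F = GetShared();   // would not compile: a smart pointer is not a raw pointer

    assert(B == C);  // both point at the same static object
    return A.Id + B->Id + C->Id + D->Id;
}
```

B and C have the same type, but only the declaration of C tells the reader so without looking up GetByPointer. As a side benefit, auto* acts as a compile-time check: if the function later changes to return a value or a smart pointer, the declaration stops compiling instead of silently changing meaning.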
Const and Order
More compelling yet, const auto is counter-intuitive, even misleading, when auto resolves to a pointer:
const auto VarA = Type{};
const Type VarB = Type{};
// these are effectively the same.
const auto VarC = new Type{};
const Type* VarD = new Type{};
// these are substantially different in their meaning.
Assuming that Type has a public member field int X, VarC->X = 5 will compile successfully, but VarD->X = 5 will fail to compile.
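The difference between VarC and VarD can be verified at compile time. The following is a small sketch under a couple of assumptions: Type is a trivial struct defined here for illustration, the pointers refer to a local object rather than a new expression (to avoid leaking in the example), and the compiler supports C++17.

```cpp
#include <cassert>
#include <type_traits>

struct Type { int X = 0; };  // illustrative type with a public int member

inline int ConstAutoDemo() {
    Type Object{};

    const auto  VarC = &Object;  // Type* const: the pointer is const, the data is not
    const Type* VarD = &Object;  // const Type*: the data is const, the pointer is not

    static_assert(std::is_same_v<decltype(VarC), Type* const>);
    static_assert(std::is_same_v<decltype(VarD), const Type*>);

    VarC->X = 5;     // compiles: writes through a const pointer to mutable data
    // VarD->X = 5;  // would not compile: VarD points to const data

    assert(Object.X == 5);
    return VarD->X;  // reading through VarD is fine
}
```

With auto absorbing the *, the leading const applies to the whole deduced type (the pointer), which is why it behaves like a trailing const in the explicit form.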
In fact, once you've allowed auto to encapsulate the *, the declaration can no longer say "the pointer itself and the pointed-to data are both const": a const placed on either side of auto applies only to the pointer. const auto* const Var is the only way to say it, apart from giving up on auto altogether (const Type* const Var).
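As a sketch of what const auto* const buys you, the example below contrasts it with const auto*. The names Type, First, Second, and Rebindable are invented for illustration, and C++17 is assumed.

```cpp
#include <cassert>
#include <type_traits>

struct Type { int X = 0; };  // illustrative type

inline int FullyConstDemo() {
    Type First{1};
    Type Second{2};

    // Pointer and pointed-to data are both const.
    const auto* const Var = &First;
    static_assert(std::is_same_v<decltype(Var), const Type* const>);

    // Data is const, but the pointer may still be reseated.
    const auto* Rebindable = &First;
    static_assert(std::is_same_v<decltype(Rebindable), const Type*>);
    Rebindable = &Second;

    // Var = &Second;  // would not compile: the pointer itself is const
    // Var->X = 9;     // would not compile: the pointed-to data is const

    assert(Rebindable == &Second);
    return Var->X * 10 + Rebindable->X;
}
```

Each const reads the same way it would with an explicit type: the one to the left of auto* qualifies the data, and the one to the right qualifies the pointer.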
I would go as far as to argue that it would be beneficial to strictly prohibit yourself and your team from writing statements that cause auto to encapsulate a declaration's *(s), and that this is true for all codebases whatsoever. As soon as there's a single case in your code in which you place const to the right of the type to say "the pointer itself will not change", any case where const to the left of the type says "the pointer itself will not change" presents a potentially misleading inconsistency.
Wrap-Up
C++ has a lot of syntactical ambiguity. Compared to other languages, it's slower for both humans and machines to understand, due in part to a common reason: keywords and common symbols can mean completely different things based on potentially far-flung context. Whether you're bearish or bullish about auto, there will probably be a few scenarios where you use auto to declare a pointer variable. In such cases, you can avoid further burdening yourself and your peers with ambiguity by keeping pointer-declaring * characters out of auto's "grasp" by writing auto* instead.
Footnotes
1: Some of these languages now support explicit type declarations one way or another, but that wasn't true at the time I was personally focused on them.
2: It did seem like a "recommendation" at the time, but now I recognize that the indication that ReSharper applies to explicit type declarations is actually a "hint", a "you can do this" rather than a "you should probably do this".
3: Ironically, of course, Torvalds' own Linux tends to violate this idea flagrantly and thoroughly, compared to its competitors. Granted, he inherited those problems from UNIX. I suspect that the Linux Kernel's C API does indeed have well-defined data structures, in contrast to its string-obsessed console environment and core tools.
4: The confidence with which you can assume this is equal to the confidence you have in your team to not violate the guideline, "use auto whenever the exact type that it resolves to is explicit and prominent elsewhere in the statement".
Top comments (2)
Hello
auto never deduces a reference type. Hence, Related can either be a FConnection or a pointer to FConnection, nothing else.
I must admit that this sample tricked me... But it may sound quite logical when you decompose the statement:
auto VarC = new Type{}; => VarC is obviously a pointer.
const => you have a constant pointer.
The reasoning is not symmetrical with explicit type, I grant you that...
I think you're right. Hiding pointers behind typedefs is often considered bad practice in C. We may consider hiding pointers behind auto a bad practice too. I will pay attention to it in the next months, and I will see if auto* improves readability in my code.
One might argue that you should have avoided raw pointers in the first place, but as an embedded system developer, I know that pointers always find their way out ;)
Oops! Good catch regarding the reference type. I'll have to fix that in the article.
Thanks for adding the point about that best-practice in C. You're right, it's effectively the same concern. I've always found pointer typedefs to be inconvenient. Certain UE4 subsystems use them a lot.