DEV Community


When Hungarian notation lies hidden in plain sight

scottshipp ・3 min read

Like all historical artifacts, Hungarian notation once made sense. Namely, it made sense in the absence of type checking. Start with what you think is an integer and somehow end up with a float array. Scary! Programmers adopted the sensible idea of putting a short prefix for a variable's type in front of its name, as a way to make programs easier to reason about and to prevent bugs.

In most programming environments today, this need is obsolete and Hungarian notation has the opposite effect. Languages and IDEs together make it (at least nearly) impossible to use a variable of a given type incorrectly. The benefits disappear and only the negatives remain: code is cluttered with seemingly random letters.

So far so good. Why then do similar patterns still live on at the class level? Take the most obvious example, the practice of prefixing interfaces with "I" that is found in some Java codebases:

public interface IDataSource {
  // ...
}

It would be better if the interface were just named DataSource, because there's no need to emphasize to calling code that it is using an interface. Standard practice in an object-oriented language like Java is to prefer interfaces due to their many advantages over classes. If you're interested in deeper learning on why, check out Item 20 of Joshua Bloch's book, Effective Java.

Of course, making such a change may present you with another problem. If the existing code has both IDataSource, the interface, and DataSource, an implementing class, then what do you do? I have seen code where the prefix is just swapped for a suffix on the class: DataSource becomes the interface and DataSourceImpl the implementing class. This approach lacks elegance. When I am faced with such a situation, I try to find a more specific name for the class.

If you would like some examples of good naming, look no further than standard Java library packages like java.util, home of the Collections Framework. There you will find interfaces like List and implementations like ArrayList or LinkedList. Almost always, a little thought will turn up a similarly specific name for the implementing class you're dealing with. In the DataSource example, even a name as generic as DBDataSource would still be better than DataSourceImpl.
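The renaming described above might look roughly like the following sketch. The fetch method, the in-memory map standing in for a real database, and the ReportPrinter caller are all hypothetical, added only to make the example self-contained.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// The interface gets the plain name; callers never need to know it is an interface.
interface DataSource {
    Optional<String> fetch(String key); // hypothetical method, for illustration only
}

// The implementation is named for what makes it specific, not "DataSourceImpl".
final class DBDataSource implements DataSource {
    // Stand-in for a real database connection (an assumption of this sketch).
    private final Map<String, String> rows = new HashMap<>();

    DBDataSource(Map<String, String> initialRows) {
        rows.putAll(initialRows);
    }

    @Override
    public Optional<String> fetch(String key) {
        return Optional.ofNullable(rows.get(key));
    }
}

// Calling code depends only on the interface name.
final class ReportPrinter {
    private final DataSource source;

    ReportPrinter(DataSource source) {
        this.source = source;
    }

    String describe(String key) {
        return source.fetch(key).orElse("(missing)");
    }
}
```

ReportPrinter never mentions DBDataSource, so a second implementation with its own specific name could be passed in without touching the caller.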

Another related practice has been to include a variable's type in its name. For example, you might see String userJson or int countInt. Thankfully, this is less common, but still unfortunate. I think this happens most often when an editor doesn't have a mechanism for showing a variable's type. If you don't have an IDE that shows you the type of a variable with a hover, ctrl+click, or whatever, it's pretty easy to find one. I have found such mechanisms in Visual Studio Code, Atom, and IntelliJ to name just a few.

A more worrisome underlying cause for such a practice is that the variable's declaration is far away from its use. If you can't see where userJson is declared on the same screen as it's used, then consider refactoring to bring the two closer together and removing the json portion of the name.
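A small before-and-after sketch of that refactoring follows. The UserExport class, its toJson helper, and the BEGIN/END framing are hypothetical, and the JSON is built by hand only to avoid depending on a library.

```java
final class UserExport {
    // Hypothetical helper for this sketch; hand-rolled to avoid a JSON library.
    static String toJson(String name) {
        return "{\"name\":\"" + name + "\"}";
    }

    // Before: declared at the top and used much later, so the suffix acts as a reminder.
    static String exportBefore(String name) {
        String userJson = toJson(name);
        StringBuilder out = new StringBuilder();
        out.append("BEGIN\n");
        // ... imagine many unrelated lines here ...
        out.append(userJson).append('\n');
        out.append("END");
        return out.toString();
    }

    // After: declared on the line before its use, so the suffix is dropped.
    static String exportAfter(String name) {
        StringBuilder out = new StringBuilder();
        out.append("BEGIN\n");
        String user = toJson(name);
        out.append(user).append('\n');
        out.append("END");
        return out.toString();
    }
}
```

Both methods produce the same output; the only change is where the variable lives and what it is called.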

A strongly typed language will help obviate the need for such "type suffixes" by providing helpful language mechanisms. In Scala, for instance, there is a type alias feature which allows for substitute names for types in a given scope. You might find a benefit in the ability to declare type PhoneNumber = String, for instance, and write userCell: PhoneNumber in a given block of code instead of the much more confusing userCell: String.
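Java has no type aliases, so the nearest analogue there is a small wrapper class. This is a different technique from the Scala alias, not a translation of it: the wrapper is a distinct type, so the compiler rejects a plain String where a PhoneNumber is expected. The sketch below assumes a hypothetical digits-only normalization rule and a hypothetical Contacts caller.

```java
// Hypothetical wrapper type; Java has no direct equivalent of Scala's `type PhoneNumber = String`.
final class PhoneNumber {
    private final String digits;

    PhoneNumber(String raw) {
        // Keep only the digits; this normalization rule is an assumption of the sketch.
        this.digits = raw.replaceAll("\\D", "");
    }

    String digits() {
        return digits;
    }
}

final class Contacts {
    // The parameter type says what the value is, so the name needs no type suffix.
    static String dialString(PhoneNumber userCell) {
        return "tel:" + userCell.digits();
    }
}
```

With the type carrying the meaning, a name like userCell stays short, and a call such as Contacts.dialString("555-0100") fails to compile rather than failing at runtime.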

As always, careful consideration of both names and types will yield more readable programs. The real test is whether you can take some code to someone with no context and they can still mostly understand it. Code review with a new developer or someone from another team often yields valuable insights that are otherwise not possible, since we're all frogs in boiling water, so to speak. We slowly develop blind spots to confusing terms because we're so used to them. I like to always ask myself something like, "ten years from now, if I am no longer working in this codebase, and not even around, will someone be able to open this code and enhance it or fix a bug with a minimum of fuss?"

I always consider it the highest compliment if other developers like working in my code.


Jason C. McDonald (codemouse92)

We definitely need to bring back Apps Hungarian!

The mangled twin, Systems Hungarian (i.e. intCount or boolIsRaining), is the madness that most people object to. That's where you use the data type as the prefix, and it helps no one. Please, for the love of all things digital, don't use Systems Hungarian!

Incidentally, I do use an odd little version of Hungarian Notation in GUI design. Every time I create a widget, I use a prefix relating to the widget type. It's something of a hybrid between Apps and Systems Hungarian, but it helps me avoid confusing widgets when writing code. For example, chkSubscribe would be a "Subscribe To Emails" checkbox, whereas btnSubscribe would be a "Subscribe" button - which might both have uses in a single interface (albeit a rather weird one, but hey, it's a rough example). Honestly, that's the only place I even come close to Systems Hungarian.

jamietwells

Whenever I think about Hungarian Notation I'm always reminded of this post:

It's an excellent explanation of what the initial concept of Hungarian Notation was supposed to be.

Basically, I think about it as a way to describe some information that the compiler can't check. If you are in a language like F#, for example, where you can make aliases for simple types, then there's virtually no need for it. Simply make an alias for a JSON string and you no longer need to say userJson; you can give it the type JSON and the compiler will make sure you don't use it somewhere that isn't expecting JSON. Back in C# you can't do that, so userJson becomes more useful.