null is something… to care about
In case the title sounds familiar, you’re right: I’ve shamelessly stolen it from Sandi Metz’s talk “Nothing is Something”.
I recently found myself in a situation in which using null was a bad decision, so I decided to research ideas about how to avoid it and model the problem with a better approach. This article by Arho Huttunen about avoiding *null* checks helped a lot and also provided me with some useful links, as did this other article by Yegor Bugayenko.
The null reference was introduced by Tony Hoare in 1965 when he was developing the type system for ALGOL W. And he regretted it:
This has led to innumerable errors, vulnerabilities, and system crashes, which have probably caused a billion dollars of pain and damage in the last forty years. (Tony Hoare (2009): Null References: The Billion Dollar Mistake, QCon, 2009)
So, in this post we will try to discuss why null can be a really big problem, and how to avoid it.
What is null anyway?
It is a strange fate that we should suffer so much fear and doubt over so small a thing. (Boromir in The Fellowship of the Ring, J.R.R. Tolkien)
What is null? null is a pointer that doesn’t refer to a valid object. This is similar to saying that null points to nothing. Or, in simpler words: null is nothing.
For example, if we consider a linked list, in which every item keeps a pointer to the next one, null is used to mark the end of the list. When an item points to nothing, the list ends.
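To make the linked list idea concrete, here is a minimal sketch in Python, where `None` plays the role of null. The `Node` class and the `values` function are names made up for this illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    value: str
    next: Optional["Node"] = None  # None marks the end of the list


def values(head: Optional[Node]) -> List[str]:
    """Walk the list until an item points to nothing."""
    result = []
    current = head
    while current is not None:
        result.append(current.value)
        current = current.next
    return result


# Three items; the last one points to nothing.
head = Node("a", Node("b", Node("c")))
```

Note that the `None` never leaves the data structure: whoever calls `values` only ever sees a regular list.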
Another typical use of null is to signal that an object has not been initialized yet and is waiting for a value. We have to put something there at some moment, and the sooner, the better. For example, when we define the properties of a class, we usually initialize them at construction time. Some type systems help here by failing when we try to use an uninitialized property.
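As a rough illustration in Python (an assumption on my side, since other languages handle this differently): an attribute that is only declared with a type annotation has no value at all, and reading it fails instead of silently giving us a null. `Student` and `Draft` are hypothetical classes.

```python
class Student:
    name: str  # declared here, but it has no value yet

    def __init__(self, name: str) -> None:
        self.name = name  # initialized at construction time


class Draft:
    title: str  # declared and never initialized


student = Student("Ada")
draft = Draft()
# Reading draft.title raises AttributeError, because nothing was ever assigned.
```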
Those were situations in which null is nothing. Also, they have some traits in common:
- Usually, we don’t explicitly set them as null.
- Those nulls are internal to the object or data structure. The outer world doesn’t need to know about them.
But, sometimes null is something. And that’s very important when communicating with other objects in the system.
Null as “no result”
One frequent use of null is to represent that something is not found. This can happen in several situations.
Let’s consider a repository of entities. There are two main ways to get an entity from such a repository:
- by its identifier: we get one entity providing its identifier.
- by specification: we can get a set of entities that match certain criteria.
To be or not to be
In the first way, we can expect that an entity with that identifier either exists or not. If it exists, there is no problem at all: we retrieve it and do whatever we need.
But if the entity doesn’t exist, we get nothing. It makes sense. If the entity with that identifier was never stored, then you will never find it. There are two main possible reasons: something happened that prevented the entity from being stored in the first place, or someone is trying to fool the system by using a non-existent identifier.
So, we get nothing and the temptation to represent it as null is strong.
But it is also wrong, because it forces us to check whether there is something in the response. It seems kind of ridiculous to verify whether my partner has given me something I asked her for. If she cannot give it to me, she will simply say “I don't have it”. I don’t need to check my hand to be sure that I don’t have the damn thing.
The repository should fail with an exception, because the absence of an entity that we expect to exist is something that shouldn't happen. The consumer only has to deal with that problem, or pass it to a higher level until the exception finds a proper handler.
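A minimal sketch of such a repository in Python could look like this. `StudentNotFound` and `InMemoryStudentRepository` are hypothetical names, and a plain dictionary stands in for real storage.

```python
class StudentNotFound(Exception):
    """Raised when no student is stored under the requested identifier."""


class InMemoryStudentRepository:
    def __init__(self) -> None:
        self._students = {}  # identifier -> student name

    def store(self, student_id: str, name: str) -> None:
        self._students[student_id] = name

    def get_by_id(self, student_id: str) -> str:
        # Fail loudly instead of answering with None.
        if student_id not in self._students:
            raise StudentNotFound(f"No student with id {student_id}")
        return self._students[student_id]


repository = InMemoryStudentRepository()
repository.store("s-1", "Ada")
```

The consumer can use the result of `get_by_id` right away, with no check in between. If the identifier is unknown, the exception travels up until something knows how to handle it.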
Sorry, we don’t find any of those
In the specification way, the problem is slightly different. In this case, we can expect that there are no entities that fulfill the specification, thus the result can be nothing.
If there are entities that meet the conditions of the specification, then you will get a set of entities. Maybe in the form of an array, a set, a collection, or a similar container data structure. Possibly you will get the set and iterate over it to provide some functionality.
Here, nothing means empty. And that’s because this time nothing is something: the result is a set or collection that happens to be empty. If you iterate an empty collection, nothing will happen, but also, nothing would break. You simply inform the consumer that no results were found.
So, you don’t usually need to check if the result set is empty.
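Here is a small sketch of the specification case in Python, assuming a specification is just a function that says whether an entity matches. The `find_matching` function and the sample data are made up for the example.

```python
from typing import Callable, List

STUDENTS = ["Ada", "Grace", "Linus"]


def find_matching(specification: Callable[[str], bool]) -> List[str]:
    """Return every student that satisfies the specification, possibly none."""
    return [student for student in STUDENTS if specification(student)]


greeted = []
# No student name starts with "Z", so the loop body never runs and nothing breaks.
for student in find_matching(lambda name: name.startswith("Z")):
    greeted.append(f"Hello, {student}")
```

The empty list is still a list, so the consumer iterates over it exactly as it would over a list with results.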
Null as “maybe…”
Sometimes we use null to express that something is optional and it can be there or not. Consider this situation:
We have a student assessment system that allows us to assess students' tasks. A grade is mandatory, but you can also add feedback with a customizable message and one standard label.
Ok. Feedback is optional, so we can have an Assessment class with a mandatory Grade and nullable Feedback property. It sounds reasonable, but it leads to several problems.
Most of the time you have to check whether Feedback is present. If you forget, you will end up sending a message to nothing, and the system will complain by throwing a null pointer exception or its equivalent.
For example, if you want to show a list view of the assessments for a student, you will have to add a conditional to check if the Feedback exists and render something in the affirmative case and render something different in the opposite one.
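A sketch of that nullable design in Python might look like the following. The field names and the `render` function are assumptions for illustration only.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Feedback:
    message: str
    label: str


@dataclass
class Assessment:
    grade: int
    feedback: Optional[Feedback] = None  # nullable: it may be absent


def render(assessment: Assessment) -> str:
    # Every consumer has to remember this check.
    if assessment.feedback is None:
        return f"{assessment.grade}: (no feedback)"
    feedback = assessment.feedback
    return f"{assessment.grade}: [{feedback.label}] {feedback.message}"
```

Without the check, `assessment.feedback.message` on an assessment that has no feedback raises `AttributeError`, which is Python's equivalent of the null pointer exception.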
The problem here is that No Feedback is also a Feedback and null doesn’t represent that properly, because null is nothing, and not having feedback is something.
You can avoid that by introducing the Null Object pattern instead. A Null Object is an object with the same interface that does nothing or provides some kind of default behavior. The Null Object can even fail with exceptions if it makes sense for your domain at some point.
Let’s see an example. We define an interface for Feedback with two implementations: the standard feedback object and the Null Object. This way, you can communicate with the Null Object the same way you do with any standard object of the same type, sending the same messages and expecting a proper response from it, without having to worry about checking for null.
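A minimal Python sketch of the Feedback example could look like this. The class names `WrittenFeedback` and `NoFeedback`, the two methods, and the default texts are my assumptions, not a canonical implementation.

```python
from abc import ABC, abstractmethod


class Feedback(ABC):
    """The interface shared by real feedback and the Null Object."""

    @abstractmethod
    def message(self) -> str:
        ...

    @abstractmethod
    def label(self) -> str:
        ...


class WrittenFeedback(Feedback):
    """The standard feedback object."""

    def __init__(self, message: str, label: str) -> None:
        self._message = message
        self._label = label

    def message(self) -> str:
        return self._message

    def label(self) -> str:
        return self._label


class NoFeedback(Feedback):
    """The Null Object: same interface, neutral default answers."""

    def message(self) -> str:
        return "No feedback provided"

    def label(self) -> str:
        return "none"


class Assessment:
    def __init__(self, grade: int, feedback: Feedback) -> None:
        self.grade = grade
        self.feedback = feedback


def render(assessment: Assessment) -> str:
    # No conditional: every Feedback answers the same messages.
    feedback = assessment.feedback
    return f"{assessment.grade}: [{feedback.label()}] {feedback.message()}"
```

Here `render(Assessment(7, NoFeedback()))` and `render(Assessment(9, WrittenFeedback("Great work", "excellent")))` go through exactly the same code path.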
Null as “I can’t do this in that way”
Related to the previous topic, sometimes you query an object for some data to perform an operation with the object itself.
The point here is that if the data we asked for is null, the consumer has to decide how to manage the situation, for example by asking the object for other data and calling the same method again, or a different one.
This problem comes from a code smell known as an anemic object. An anemic object is mostly a container of data with little or no behavior at all. In OOP we expect that objects combine data and processes, in such a way that you don’t need to ask them for data. You tell them to do things, instead.
A typical example is sending notifications to the users of our systems. They usually have several ways to be notified (email, SMS, etc.), but they may not have defined a destination for one or several of them. So, you need to ask the User or Customer object about every possible communication channel and decide whether you can use one or another.
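Picture an anemic Customer model that only exposes nullable contact data, and a notification service suffering the consequences: it has to check every field for null before deciding which channel to use.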
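Sketched in Python, under the assumption that the customer only has an email and a phone, it could look like this. The `sent` list is a stand-in for real delivery, and all names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Customer:
    """Anemic model: nothing but data, some of it nullable."""

    name: str
    email: Optional[str] = None
    phone: Optional[str] = None


class NotificationService:
    def __init__(self) -> None:
        self.sent = []  # stands in for real delivery

    def notify(self, customer: Customer, text: str) -> None:
        # The service has to ask for each piece of data and check it for None.
        if customer.email is not None:
            self.sent.append(f"email to {customer.email}: {text}")
        elif customer.phone is not None:
            self.sent.append(f"sms to {customer.phone}: {text}")
        # Otherwise there is nowhere to send the message.
```

Every new channel adds another branch to `notify`, and every other consumer of `Customer` has to repeat the same checks.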
One of the possible solutions is to apply the Tell, don’t ask principle. This way, you can tell the User object to perform the notification and it will do it using the preferred or configured channel.
This way, the problem of managing the nulls is hidden from other components. NotificationService can then rely on double dispatch: it passes itself to the User object, and the User object calls back the sending method that fits its configured channel.
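A possible sketch of that double dispatch in Python follows. It keeps the `Customer` name from the anemic example in place of User, and the method names `notify_through`, `send_email` and `send_sms` are assumptions.

```python
from typing import Optional


class NotificationService:
    def __init__(self) -> None:
        self.sent = []  # stands in for real delivery

    def notify(self, customer: "Customer", text: str) -> None:
        # Tell the customer to get notified instead of asking for its data.
        customer.notify_through(self, text)

    def send_email(self, address: str, text: str) -> None:
        self.sent.append(f"email to {address}: {text}")

    def send_sms(self, number: str, text: str) -> None:
        self.sent.append(f"sms to {number}: {text}")


class Customer:
    def __init__(
        self,
        name: str,
        email: Optional[str] = None,
        phone: Optional[str] = None,
    ) -> None:
        self._name = name
        self._email = email  # the None values stay private to the object
        self._phone = phone

    def notify_through(self, service: NotificationService, text: str) -> None:
        # The customer calls back the sending method that fits its data.
        if self._email is not None:
            service.send_email(self._email, text)
        elif self._phone is not None:
            service.send_sms(self._phone, text)
```

The conditionals have not disappeared, but they now live inside the only object that knows why a value may be missing.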
Another solution is to re-think the problem and the solution design. If the User object cannot manage the communication, because it is the responsibility of another object, then the User object can provide the preferred communication channel or a Null Object if none is available. This way, the notification service won’t need to worry about checking if they are valid or not.
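This second design could be sketched in Python as follows, again using `Customer` in place of User. The `Channel` hierarchy, `NoChannel`, and the `outbox` list are hypothetical names chosen for the example.

```python
from abc import ABC, abstractmethod
from typing import List, Optional


class Channel(ABC):
    @abstractmethod
    def send(self, text: str, outbox: List[str]) -> None:
        ...


class EmailChannel(Channel):
    def __init__(self, address: str) -> None:
        self._address = address

    def send(self, text: str, outbox: List[str]) -> None:
        outbox.append(f"email to {self._address}: {text}")


class SmsChannel(Channel):
    def __init__(self, number: str) -> None:
        self._number = number

    def send(self, text: str, outbox: List[str]) -> None:
        outbox.append(f"sms to {self._number}: {text}")


class NoChannel(Channel):
    """Null Object: accepts the message and delivers nothing."""

    def send(self, text: str, outbox: List[str]) -> None:
        pass


class Customer:
    def __init__(
        self,
        name: str,
        email: Optional[str] = None,
        phone: Optional[str] = None,
    ) -> None:
        self._name = name
        self._email = email
        self._phone = phone

    def preferred_channel(self) -> Channel:
        # Always answers with a usable Channel, never with None.
        if self._email is not None:
            return EmailChannel(self._email)
        if self._phone is not None:
            return SmsChannel(self._phone)
        return NoChannel()


class NotificationService:
    def __init__(self) -> None:
        self.outbox: List[str] = []  # stands in for real delivery

    def notify(self, customer: Customer, text: str) -> None:
        customer.preferred_channel().send(text, self.outbox)
```

`NotificationService.notify` has no conditionals left. Whether silently dropping the message is acceptable, or `NoChannel` should fail with an exception, depends on your domain.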
Trust-driven programming
While I was writing this post I started to think about defensive programming techniques. In general, defensive programming considers all inputs as dangerous, so you shouldn’t trust them. This way, you need to validate all inputs to your system from the outside, but also to your functions, applying sanitization and validation rules to accept them.
Instead, object-oriented programming proposes something that we could call trust-driven programming. Objects in the system are created valid and complete, and we should expect them to provide valid data in the form of valid objects, and to manage their own concerns properly. Objects cooperate for the benefit of others.
When an object answers with a null, it introduces mistrust into the system. Mistrust leads to fear, and fear leads to the dark side… And bad programming.
Don’t force checking for null
Most of the time, the problem with null is the need to check whether some object's response is null. That means that you cannot trust that response, and the lack of trust is always a problem.
Having to check for null requires a conditional clause that increases cyclomatic complexity and introduces risks. What will happen if you forget the check for null? Or worse: what can happen if you pass the null along a chain of indirections?
So, as a general rule, you should not return null. Instead, try to manage it inside the objects using one or several of these techniques:
- Fail with an exception. This is the way to go when it is possible that the response doesn’t exist at all.
- Return Null Objects. This allows the consumer to interact with an object that can have some kind of neutral or default behavior. Empty containers, such as collections, can be considered null objects in the sense that they represent empty sets of data.
- Apply the “Tell, don’t ask” principle. Sometimes, having to deal with null is a design flaw. If you have to ask an object for some information to do something with the same object, chances are that you can tell the object to do the whole process itself, providing adequate behavior in the case that something is not there.
- Re-think your design. Maybe “Tell, don’t ask” doesn’t apply, but you can consider other options or patterns to redesign the code.