Patrick Kelly

Posted on Oct 14, 2020

Avoiding Nulls with Extension Methods

#csharp #dotnet

I've programmed in Ada more, and for longer, than in C#. Something that always bugged me about .NET, or really any heavily object-oriented language, was how your rate of null dereferences shoots through the roof. Considering that I can write a non-trivial Ada program with probably no null dereferences at all, it was particularly weird hearing .NET programmers complain about this so much, and as I used C# more, it annoyed me as well.

But the problem isn't the languages themselves, pedantic as that distinction may sound. Rather, it's a result of the design patterns those languages encourage.

Let's show off a little about how things are done, and the problem should start to become clear.

class Person {
    public void Move() { /* ... */ }
}

Person p = null; //never given a real instance
p.Move();

The problem is you didn't initialize p!

Yes, in this specific, extremely trivial case, and in the most pedantic sense, that is the problem. The point is how we're calling Move(). Say p was actually an iteration variable over a collection of people. You queried a database and got back an IEnumerable<Person>. Or it could be a deserialized object; that should really throw an exception instead, but hey, most serializers aren't written well. We're getting p somehow, and code flow analysis alone can't say it's not null.
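To make that scenario concrete, here is a minimal sketch in which a null element slips into a sequence. Roster, Load and MoveAll are hypothetical names invented for illustration, Load merely stands in for a database query or a sloppy deserializer, and the Position property is an assumption added so that Move() has an observable effect.

```csharp
using System;
using System.Collections.Generic;

public class Person {
    // Position is an assumption for illustration, so Move() has a visible effect.
    public int Position { get; private set; }
    public void Move() => Position++;
}

public static class Roster {
    // Hypothetical stand-in for a query result that contains a missing row.
    public static IEnumerable<Person> Load() =>
        new Person[] { new Person(), null, new Person() };

    // Moves each person and returns how many were moved.
    // Throws NullReferenceException at the first null element.
    public static int MoveAll(IEnumerable<Person> people) {
        int moved = 0;
        foreach (Person p in people) {
            p.Move(); // nothing at this callsite says p may be null
            moved++;
        }
        return moved;
    }
}
```

Calling Roster.MoveAll(Roster.Load()) moves the first person and then throws on the second element, even though the loop itself looks perfectly innocent.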

That's just the way it is, right? This is why the null-conditional operator exists. What I really should be doing is p?.Move(), obviously!

But this doesn't sit right with me. As a developer primarily writing libraries for others, I try to make the callsites as easy and error-proof as possible. Forgetting to use ?. instead of . is just too easy. It's not easy to grep for the problem. And putting ?. everywhere is actually problematic for its own reasons.

But that's just the way it is, right? No!

type Person is tagged null record; --fieldless "class"

procedure Move(Self : in out Person);

p : Person;
p.Move; --object.Method() style
Move(p); --Procedure(object) style

Neither of these will throw a null dereference exception!

Now a huge part of this has to do with language semantics. In Ada, declaring p : Person creates the object itself rather than a reference to one, even without an explicit initial value, so there is nothing that could be null; analyzers will generally still report the missing initialization with a warning or error. Obviously we can't change C# semantics. So while this is great, we can't do much about it.

There's a solution and good lord is it obtuse. If you're cranking out products for clients then this isn't going to be the least bit useful for you. But if you're a library author this may be of interest.

Notice one thing about Ada's function/procedure declaration. Even if it's dispatching off a type, we're still explicitly declaring that parameter. Do we have something like this in C#? Yes! The extension method. So let's try this again.

class Person {
    public void Move() { /* ... */ }
}

static class Operations {
    public static void Move(this Person person) => person.Move();
}

Person p = null;
p.Move(); //This is still Person.Move, not Operations.Move

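Uh oh. Problem is, instance methods are preferentially resolved; they are checked before extension methods. We could call Operations.Move(p) everywhere (or make Move a plain static method, since using static Operations doesn't import extension methods for unqualified calls, and then write Move(p)), but that's very anti-C#. There's a solution though, and it's a simple visibility change: make the instance method internal, so code outside the library's assembly can no longer see it.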

public class Person {
    internal void Move() { /* ... */ }
}

public static class Operations {
    public static void Move(this Person person) => person.Move();
}

Person p = null;
p.Move(); //From outside the library's assembly, this is Operations.Move now!

Fantastic, although we're only redirecting a call at this point. We'll still get a null dereference exception. How do we gracefully handle null?

That's not actually a joke. There's a simple calculus to explain this, and it's so simple I don't even need to get all theoretical, just a bit metaphysical.

If we move nothing, what happens? Well, nothing happens. What is null? A pointer/reference to nothing. So an operation upon nothing does nothing. We can codify this like so:

public static void Move(this Person person) {
    if (person is null) {
        return; //moving nothing does nothing
    }
    person.Move();
}

Now a call to p.Move() will either move p as described by Person.Move() or will do nothing.
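Putting the pieces together, a minimal single-file sketch might look like the following. The Steps property and the Demo class are assumptions added purely so the behavior can be observed. Because everything here lives in one assembly, the demo calls the extension in its static form; from a consuming assembly, where the internal instance method is invisible, the same call could be written as p.Move().

```csharp
public class Person {
    // Steps is an assumption for illustration, so a move can be observed.
    public int Steps { get; private set; }

    // internal: invisible to callers outside this assembly
    internal void Move() => Steps++;
}

public static class Operations {
    public static void Move(this Person person) {
        if (person is null) {
            return; // moving nothing does nothing
        }
        person.Move(); // binds to the instance method, which is accessible here
    }
}

public static class Demo {
    // Returns the step count after moving a real person and a missing one.
    public static int Run() {
        Person present = new Person();
        Person missing = null;
        // Static call form, because inside this assembly present.Move()
        // would still bind to the internal instance method.
        Operations.Move(present);
        Operations.Move(missing); // no exception
        return present.Steps;
    }
}
```

The recursion worry is unfounded: inside Operations the instance method is accessible, so person.Move() resolves to it rather than back to the extension.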

If you have a return type, it's a little different. But the same line of metaphysical thinking reveals the answers. If you're counting occurrences of something in a collection, and the collection is empty, how many occurrences are there? Well, 0, obviously. More tangibly, if you're counting the number of apples you have in a bag, and the bag is empty, how many apples are there? 0. If you're counting the number of apples in a bag, and you don't have the bag, how many apples are there in the bag you don't have? 0. Remember, it's a bag that doesn't exist, not a bag that you don't own.
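The apple-counting argument can be sketched the same way. Bag, BagOperations, Add and CountApples are hypothetical names chosen for this illustration; the only point is that the extension method internalizes the default value, so a missing bag and an empty bag both answer 0.

```csharp
using System.Collections.Generic;

public class Bag {
    private readonly List<string> apples = new List<string>();

    public void Add(string apple) => apples.Add(apple);

    // internal, so callers outside the assembly reach the extension method instead
    internal int CountApples() => apples.Count;
}

public static class BagOperations {
    // A bag that doesn't exist holds zero apples.
    public static int CountApples(this Bag bag) {
        if (bag is null) {
            return 0;
        }
        return bag.CountApples(); // instance method, accessible inside the assembly
    }
}
```

With this in place, BagOperations.CountApples(null) evaluates to 0, exactly as it does for a freshly created, empty Bag, and the caller never has to pick a default.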

Do I see this as useful for most people? No, absolutely not. More than anything, this is a neat curiosity. But it's certainly something for library authors to keep in mind. I can definitely see this being interesting for things which focus very highly on usability, as it helps avoid many kinds of errors. But do note that not every method should be "null proof" like this. There are methods which absolutely should be called off the instance itself and need to throw a null dereference exception as part of their semantics. This is just a neat tool for those looking to further harden their APIs.

Top comments (2)

Zohar Peled • Oct 15 '20 • Edited

Of course, your trivial extension method can be simplified to a single line using the ?. operator - which of course comes from the same way of thinking - void operations on null references are no-ops...

Patrick Kelly • Oct 15 '20

?. isn't a no-op. It's an expression which returns null. Now, you've got a null expression sitting there which either needs to be discarded in the case of an intended no-op, or downstream needs to put the default value rather than the method internalizing the default value.

Is this syntactic sugar for things that can already be done? Yes. But then every language construct is syntactic sugar for things that can already be done.