July 2022 – Colour Coding

So my friend Steve wrote a long article on why he doesn’t like nulls in C#, and I promised him a good fisking so, of course, the first thing I did last night was to fire up the Ms Marvel finale. It’s very good. This was followed by a mini-marathon of black-ish episodes (it’s also very good) and honestly I highly recommend watching those in preference to reading this or Steve’s original article. However, because I’m the living embodiment of XKCD 386, sooner or later I was going to write something on the subject.

Let’s start with his first code example:

```csharp
Person p = null;
Console.WriteLine(p.Name);
```

He’s correct in saying that this code would blow up at runtime with no compile-time warnings, and that using nullable reference types would catch this error. But he fails to include an example of how this would look using Functional.Maybe:

```csharp
Maybe<Person> p = Maybe<Person>.Nothing;
Console.WriteLine(p.Value.Name);
```

So, what’s different? Well, now C# 9 won’t catch the error and you’ll get a runtime exception. So the Maybe solution is, out of the gate, worse than the straightforward regular C# version, in that it can’t even catch the simplest example in the blog post.
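For comparison, here is a minimal sketch of the first example with nullable reference types switched on. The `Person` record and the `Describe` helper are made up for illustration, and the warning codes in the comments are the ones I would expect the compiler to report rather than output copied from a build.

```csharp
#nullable enable

// Hypothetical stand-in for the Person type used in the examples.
public record Person(string Name);

public static class FirstExample
{
    public static string Describe(Person? p)
    {
        // With nullable reference types enabled, both of these get flagged at compile time:
        //   Person q = null;   // warning CS8600: converting null literal to a non-nullable type
        //   return p.Name;     // warning CS8602: dereference of a possibly null reference
        // Checking for null first satisfies the compiler's flow analysis.
        return p is null ? "no person" : p.Name;
    }
}
```

The point of the sketch is only that the mistake surfaces before the code runs; how you handle the null case once you’ve been warned is up to you.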

Now, I’m being somewhat disingenuous in that there are other constructions of Maybe that don’t have this problem, and indeed the code Steve employs later on in the post doesn’t display it. There are other issues, but describing everything wrong with language-ext would take me much longer than I plan to spend and, in any event, I’m a little sick of hearing myself talk about it at length. Let’s move on to the second example:

```csharp
public void ProcessPerson(Person p) {
    Console.WriteLine(p.Name);
}
```

Here the contention is that, even in modern C#, you could still be called with a null person. The proffered solution is to insist that we check things for null even when we’ve declared them as not null. This begs the question: why not just declare them as nullable if you think this is a real scenario? Which in turn begs the question: how often does this actually come up in code-bases that use pretty much pure `#nullable enable`? Having used it fairly heavily for three years, I can answer that question quite definitively: the only time I encounter this kind of problem, where something is declared as not-null but is in fact null, is when I’ve forgotten to implement a mock, i.e. I’ve only really encountered it whilst writing tests. The issue was never in the code base proper and, moreover, the very tests I was writing caught the problem. Code merged down to main? I honestly can’t think of a single instance where this has been a problem.
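To make the forgotten-mock scenario concrete, here is a minimal sketch. Everything in it (`IPersonSource`, `UnconfiguredSource`, `Greeter`) is hypothetical; the stub is hand-written and uses `null!` to stand in for a mock that was never set up, so it doesn’t reflect any particular mocking library’s behaviour.

```csharp
#nullable enable

public record Person(string Name);

public interface IPersonSource
{
    // Declared as non-nullable: callers are entitled to assume a real Person.
    Person Load(int id);
}

// Stands in for a mock whose setup was forgotten.
public sealed class UnconfiguredSource : IPersonSource
{
    // The "!" suppresses the compiler warning, so the null slips through silently.
    public Person Load(int id) => null!;
}

public static class Greeter
{
    public static string Greet(IPersonSource source, int id)
    {
        Person p = source.Load(id);
        // No compiler warning here, but this throws NullReferenceException if p is null.
        return "Hello, " + p.Name;
    }
}
```

Calling `Greeter.Greet(new UnconfiguredSource(), 1)` fails the first time a test exercises it, which is the sense in which the tests themselves catch the problem.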

In theory, yes, programmers can make mistakes and send you invalid values. However, this isn’t a problem unique to null handling and we handle this by being explicit about our expectations. The difficulty with C# 7 and earlier, where you just didn’t know if something was meant to be nullable or not, is already gone.

The acute will observe that the proposed solution, Maybe, doesn’t actually address the problem outlined in the code example, because it’s meant to be employed in circumstances where there might not be a value, and this scenario requires a value. What solutions do exist?

Only use language constructs that cannot be null. (In practice, I don’t think anyone’s going to adopt “everything is a struct” as a convention in C# outside of one finance shop who will remain nameless).
Use a language that doesn’t have these problems. There’s exactly one, it’s called Rust and I highly recommend you check it out. Learning Rust probably isn’t going to help you with your C# coding issues, though.
Develop programming practices to reduce the impact of the problem.

Now, the last one is fine, but the most effective practice I’ve found to deal with this is pervasive use of nullable reference types.

Semantics, Shmemantics

Now we get onto one of the big ideas in Steve’s post:

The fundamental thing to observe here is that we don’t know why we were given a null value.
Steve, obviously

A more fundamental thing to observe is that this is true of every value we ever process. We don’t know why we got sent 5 as a security id either. There are programming techniques you can employ where you pass around values with their derivation, but Maybe is not one of them. The vagueness complained about holds just as true regardless of your representation of missing values.

The Meaninglessness of Null

null means nothing, so what intent can be applied to it?
Steve, again

The phrase “null means nothing” caught my eye because it reminded me of how unnecessarily difficult mathematicians used to make their lives before finally accepting the existence of zero. It also leads me to another fundamental observation about all modelling: what the code means is entirely in our own heads and has nothing to do with the code or what it actually does. In this particular case, there’s a mental model that’s helpful and a mental model that’s unhelpful. Thinking of null as “nothing” isn’t a helpful mental model. A better way of understanding null in regular code is “This value is not present” or “This value was not supplied”. Going back to the intent, unless you have a model for the intent of “This value is present”, you’re basically tying yourself up in knots over nothing. (Or Nothing.)

I’ll restate this point because it’s extremely important: if you have two things that are operationally identical and you are treating them as semantically different, you have a problem in your modelling. Platonism, where things have meaning independent of their behaviours, has been discarded in philosophical circles for centuries. The other major popular expression of this idea is the transubstantiation of the Eucharist, an idea which, whilst by no means discredited, has at the very least sparked actual wars between subscribers and non-subscribers to the notion.

It is worth mentioning the one operational difference between nullable reference types and Maybe constructs: you can have a “Maybe<Maybe<Person>>” but you can’t have a “Person??”. This can matter if you’re doing heavy levels of functional programming, but in C# I can honestly say it hasn’t come up.
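A minimal sketch of where that nesting can matter: distinguishing “the key isn’t there” from “the key is there but has no value”. The `Maybe<T>` below is a hypothetical, cut-down type written for this sketch only (it is not the API of language-ext or Functional.Maybe), and `Settings` is an invented example.

```csharp
#nullable enable
using System;
using System.Collections.Generic;

// Hypothetical minimal Maybe<T>, for illustration only.
public readonly struct Maybe<T>
{
    private readonly T _value;
    public bool HasValue { get; }
    private Maybe(T value) { _value = value; HasValue = true; }
    public static Maybe<T> Just(T value) => new Maybe<T>(value);
    public static Maybe<T> Nothing => default;
    public TResult Match<TResult>(Func<T, TResult> just, Func<TResult> nothing)
        => HasValue ? just(_value) : nothing();
}

public static class Settings
{
    // Outer Nothing: the key is missing. Inner Nothing: the key exists but holds null.
    public static Maybe<Maybe<string>> Lookup(
        IReadOnlyDictionary<string, string?> map, string key)
        => map.TryGetValue(key, out var v)
            ? Maybe<Maybe<string>>.Just(
                v is null ? Maybe<string>.Nothing : Maybe<string>.Just(v))
            : Maybe<Maybe<string>>.Nothing;

    public static string Describe(Maybe<Maybe<string>> m)
        => m.Match(
            inner => inner.Match(s => "set to " + s, () => "present but empty"),
            () => "missing");
}
```

With `string?` alone the two “empty” cases collapse into a single null, which is why the usual C# idiom carries the extra bit separately (as `TryGetValue` does with its `bool` return value).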

Call Me Maybe

There follows a lucid description of how to use Maybe culminating in the following example:

```csharp
public static void ProcessPerson(Maybe<Person> p) {
    p.Match(
        p => Console.WriteLine(p.Name),
        () => Console.WriteLine("Nothing to process."));
}
```

Let’s rewrite that back to regular idiomatic C# code:

```csharp
public static void ProcessPerson(Person? p) {
    Console.WriteLine(p?.Name ?? "Nothing to process");
}
```

An obvious objection is “Well, this is just example code, what does it look like in real world scenarios at scale?” The answer, sadly, is much, much worse. If you’ve got three Maybes, you’ve now got a chain of three nested callback functions. Not even the fancy syntax provided by language-ext is going to save you from that horror.
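Here is a rough sketch of what the three-Maybe case looks like next to its nullable equivalent. The `Maybe<T>` type is again a hypothetical minimal one written for this sketch, not any real library’s API, and the `Label` functions are invented purely to have three inputs to combine.

```csharp
#nullable enable
using System;

// Hypothetical minimal Maybe<T>, for illustration only.
public readonly struct Maybe<T>
{
    private readonly T _value;
    public bool HasValue { get; }
    private Maybe(T value) { _value = value; HasValue = true; }
    public static Maybe<T> Just(T value) => new Maybe<T>(value);
    public static Maybe<T> Nothing => default;
    public TResult Match<TResult>(Func<T, TResult> just, Func<TResult> nothing)
        => HasValue ? just(_value) : nothing();
}

public static class MaybeStyle
{
    // Three inputs means three levels of nested callbacks.
    public static string Label(Maybe<string> city, Maybe<string> region, Maybe<string> country)
        => city.Match(
            c => region.Match(
                r => country.Match(
                    n => $"{c}, {r}, {n}",
                    () => "incomplete"),
                () => "incomplete"),
            () => "incomplete");
}

public static class NullableStyle
{
    // The same logic with nullable reference types: one flat expression.
    public static string Label(string? city, string? region, string? country)
        => city is null || region is null || country is null
            ? "incomplete"
            : $"{city}, {region}, {country}";
}
```

Both versions return the same strings for the same inputs; the difference is purely in how much structure you have to read through to see that.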

Let’s double down on the point though: in the Maybe example, what happens if p.Name is null? The answer is, of course, that it prints a blank line. Is that the right behaviour? It’s an arbitrary example so who knows, but it probably isn’t. Another observation is that this could happen even without nulls if Name was an empty string or, even worse, a string of whitespace characters. But this goes back to my fundamental observation: Maybe or Option or whatever you want to call it isn’t going to save you from bad input data. Only good programming practice will do that.
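As a small sketch of the kind of practice I mean, the check below treats null, empty and whitespace-only names identically. The `Person` record with a nullable `Name` and the `Display` helper are assumptions made up for this example; whether falling back to a message is the right behaviour depends entirely on the real requirements.

```csharp
#nullable enable

// Hypothetical Person whose Name may be missing in the raw input.
public record Person(string? Name);

public static class Display
{
    public static string NameOrFallback(Person? p)
    {
        string? name = p?.Name;
        // Null, "" and whitespace-only names are all treated as unusable input.
        return string.IsNullOrWhiteSpace(name) ? "Nothing to process" : name;
    }
}
```

Note that the validation is the same whichever way the missing person is represented: swapping `Person?` for a Maybe wouldn’t remove the need for it.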

You might be asking yourself at this point: so how do languages like Haskell handle this, given that they not only don’t have nulls but don’t even have control flow? The answer is “do notation”, which is way beyond the scope of this post, but suffice it to say, C# doesn’t have it and isn’t likely to any time soon. A related observation is that while Haskell doesn’t have “null”, it most definitely has “undefined” or “bottom” which is, if anything, worse. Even weirder, it turns out that language-ext went to all the effort to introduce it to C#, which is a design decision I would love to discuss with the creator. So how does Haskell deal with this? In short, it doesn’t. There’s even a widely-cited paper explaining how you can, for the most part, just ignore the fact it exists.

Nulls That Map Database Tables

Steve then goes on to give an example of some code that could be remodelled to avoid the use of nulls. Whilst it is certainly the case that the class can be restructured as he describes for that particular purpose, what if the class is actually needed for multiple purposes and some of those permit nulls? What if, moreover, you don’t know the final purpose when you construct the object? These questions I raise are not just theoretical. A lot of the time, the best modelling of a database table is just to convert the table to its C# equivalent.
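A minimal sketch of what “just convert the table” can look like with nullable reference types. The `customer` table in the comment and the `CustomerRow` record are hypothetical, invented for this example rather than taken from Steve’s article.

```csharp
#nullable enable
using System;

// Hypothetical table:
//   customer(id INT NOT NULL, first_name TEXT NOT NULL, middle_name TEXT NULL,
//            last_name TEXT NOT NULL, closed_on DATE NULL)
// Each NULL-able column becomes a nullable member; NOT NULL columns stay non-nullable.
public record CustomerRow(
    int Id,
    string FirstName,
    string? MiddleName,
    string LastName,
    DateTime? ClosedOn)
{
    // Different consumers can interpret the missing values as they need to.
    public bool IsOpen => ClosedOn is null;

    public string FullName => MiddleName is null
        ? $"{FirstName} {LastName}"
        : $"{FirstName} {MiddleName} {LastName}";
}
```

The type says no more and no less than the schema does, which is often all you know at the point the row is loaded.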

With all of that said, I’ll concede that yes, sometimes it’s possible to reconfigure things so that you don’t need to deal with missing values. However, you also have to consider the trade-off of whether or not the effort is worth it and if the use of a Maybe or null is actually a problem or if it’s just offending your sense of neatness.

What Have We Seen and Unseen?

Hopefully I’ve made the point that this is a nuanced question. As Steve mentions, Tony Hoare regards introducing pervasive null references as a billion dollar mistake and I’m not going to argue with him. But the fact remains that it’s a mistake we live with, unless you’re a Rust programmer (and don’t use unsafe code). We’ve also seen that an undesired null reference is actually just a small subset of a much larger problem of receiving undesirable inputs and that none of the coding strategies laid out in this or the original article will address this in general.

So really, we’re only left with two questions. What’s the best available technology for representing values that should not be null, despite the runtime supporting nulls? And how should we represent values that may be missing, where we have no further information about why they’re missing? Ideally, the answers to both questions should satisfy the following requirements:

It should be idiomatic C# code.
A corollary of this is that it shouldn’t rely on a third-party library.
It should impose a low run-time overhead in high-performance scenarios.
It should have compile-time support.

The good news is, we have a solution that satisfies all of these criteria: nullable reference types.


Nullable Reference Types are FINE, and here’s why