Null pointer exception hell
Introduction
The null pointer exception (NPE) goes by different names in different programming languages: null pointer dereference in C/C++, NullPointerException in Java, NullReferenceException in C# and .NET, and various other names in scripting languages, such as JavaScript’s “undefined is not a function”.
This error ranks among the most common programming bugs of all time. It is a plague present in nearly every application because of how popular programming languages and programs work from the ground up. As a software developer you either deal with null pointer exceptions every day, or they lie in the code like mines, silently waiting for a user to step on them. The null pointer exception is a “billion dollar mistake”:
This has led to innumerable errors, vulnerabilities, and system crashes, which have probably caused a billion dollars of pain and damage in the last forty years.
Tony Hoare, from the talk “Null References: The Billion Dollar Mistake”
Examples
The history of the null pointer dereference started a long time ago, even before the UNIX/C era. To give a taste of null pointer exceptions, the examples below are written in the most widely used programming languages of the moment.
For the sake of simplicity the null (or undefined) variables are set explicitly. In real programs the null value comes out of an operation performed by other code: another part of the program, a third-party library, a framework or the OS. To make it worse, the value can be null (or undefined) seemingly at random, depending on user input, database state, device state or other environmental conditions.
C/C++
Consider this code:
QTimer *timer = nullptr;
timer->start(); // crash
Put into a console application, this code typically crashes the process as soon as the second line runs.
Java
String name = null;
name.toLowerCase(); // crash
Inside a JSP page powered by Jetty, this throws a NullPointerException.
C# .NET
string name = null;
name.ToLower(); // crash
This results in the infamous “yellow screen of death” in ASP.NET.
Null pointer error in dynamic programming languages
Popular dynamic scripting languages don’t have a concept of a pointer, but they have references. Dereferencing a null reference usually produces the same result: terminal failure. Moreover, these programming languages have a concept of an “undefined” (or “unset”) value. Technically speaking, null and undefined are separate things, but in practice undefined values produce almost the same class of “null pointer exceptions” as true null values. Thus dynamic languages have even more null exceptions, because null and undefined are friends.
JavaScript
var element = null;
element.getAttribute("id");
// crash
var element = { getAttribute:null }
element.getAttribute("id");
// crash
Both snippets produce a TypeError in the browser console.
PHP
$name = null;
$name->getMessage(); // crash
Python
name = None
name.lower() # crash
In a Django view this raises an AttributeError, which makes for a lovely error page.
Ruby
name = nil
name.downcase # crash
Tested in a Rails controller, this raises a NoMethodError.
Perl
my $person;
print $person->name; # crash
Chart
Let’s summarize the results of those obviously wrong little programs presented so far in a table:
Example | Compiles* | Crashes |
---|---|---|
C++ | YES | YES |
Java | YES | YES |
C# | YES | YES |
JavaScript | YES | YES |
PHP | YES | YES |
Python | YES | YES |
Ruby | YES | YES |
Perl | YES | YES |
*For dynamic languages “compiles” means that the syntax checker (or a bytecode compiler) is not able to detect the problem ahead of time.
Solutions
Manual null-checks
As a workaround programmers insert null-checks into the programs like so:
if (name != null)
name.toLowerCase();
else
... // handle an error case
Such ad-hoc checks are not a general solution to the null pointer exception problem. It is simply impossible to add them everywhere. They are a burden on the programmer, and placing them by hand is prone to human error.
Sometimes you are overconfident that a null value is impossible in a particular place, skip the check, and much later get trapped by a null pointer dereference. On the flip side, after experiencing lots of null pointer errors, people fall into defensive programming and add tons of checks whether they are needed or not.
Static analyzers
Static analysis with tools like Lint might help to find potential null pointer exceptions. Building such tools is a very complicated endeavour, because they must be smarter than the compiler (or parser) itself and infer missing information from the source code. Since the languages allow null values everywhere, static analyzers tend to produce lots of false positives, which makes fixing the reported errors time-consuming and daunting. In addition, a static analyzer is a separate tool, so it does not prevent you from shipping bad code. Forcing warning-free passes upon developers can be a hard sell when the analysis is so fuzzy.
NonNull annotation
As an afterthought, some programming languages introduced additional syntax (annotations or attributes) that tells the compiler whether something can be null or not. Java has @NotNull in various forms, C# has NotNullAttribute, Objective-C has nonnull.
Those annotations can help a bit, but they are not a silver bullet. Most existing code was written without NonNull, and it will take ages to standardize the annotations and add them everywhere. If a programming language uses null as a default value, the annotation approach is an opt-out solution: every value stays nullable until someone marks it otherwise. This means that most cases are going to be left unpatched.
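As a rough illustration of how advisory such hints are, here is a minimal sketch in Python, whose type hints are not one of the annotation systems named above but behave in a similar opt-in spirit. The functions shout, find_name and try_shout are hypothetical names invented for this example. The hint on shout promises a non-null string, yet nothing enforces it at runtime; only a separate type checker would be expected to complain.

```python
from typing import Optional


def shout(name: str) -> str:
    # The hint says "never None", but the interpreter does not enforce it.
    return name.upper()


def find_name(user_id: int) -> Optional[str]:
    # Hypothetical lookup: only user 1 exists, everyone else yields None.
    return "alice" if user_id == 1 else None


def try_shout(user_id: int) -> str:
    try:
        # A static type checker would be expected to flag this call,
        # because Optional[str] is passed where str is required.
        return shout(find_name(user_id))
    except AttributeError as error:
        return f"crash: {error}"


if __name__ == "__main__":
    print(try_shout(1))
    print(try_shout(2))
```

The annotated code still fails for the missing user; the annotation only pays off if a checker is actually run and its warnings are acted upon.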
Option type
Option types are known as Nullable<T>, Optional<T>, Option<T> or Maybe T in different programming languages (where “T” denotes a data type). Such a type lets you express an explicit intent that a value can potentially be null and requires an explicit cast if you want to get to a contained non-null value.
The option type protection works solidly in languages like Haskell or F#, where a cast to get the non-null value requires adding an “if-else” clause that deals with the null case. In the languages presented above, by contrast, it is very easy to grab the underlying value without a check and be trapped by a null pointer exception. In addition, for those languages the nullable type solution is still opt-out, just like the NonNull annotations.
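To make the “if-else on every access” idea concrete, here is a minimal sketch of an option type in Python. The names Some, Nothing, fold and lower_or_default are assumptions made up for this illustration, not a standard library API. The intended way to reach the wrapped value is fold, which cannot be called without supplying a branch for the empty case.

```python
from dataclasses import dataclass
from typing import Callable, Generic, TypeVar, Union

T = TypeVar("T")
R = TypeVar("R")


@dataclass(frozen=True)
class Some(Generic[T]):
    """Wraps a value that is present."""
    value: T


@dataclass(frozen=True)
class Nothing:
    """Marks the absence of a value."""


def fold(
    maybe: Union[Some[T], Nothing],
    if_some: Callable[[T], R],
    if_nothing: Callable[[], R],
) -> R:
    # The caller has to provide both branches to get a result.
    if isinstance(maybe, Some):
        return if_some(maybe.value)
    return if_nothing()


def lower_or_default(name: Union[Some[str], Nothing]) -> str:
    return fold(name, lambda s: s.lower(), lambda: "<anonymous>")


if __name__ == "__main__":
    print(lower_or_default(Some("Bob")))
    print(lower_or_default(Nothing()))
```

Note that Python does not stop anyone from reading the value field directly and skipping fold, which is the same escape hatch described above for languages where the check is not enforced.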
Ignoring null pointer exceptions
The programming language examples discussed above sit at one extreme: calling a method on a null pointer produces a runtime crash, and you have to opt out case by case to recover. At the other extreme is the idea of ignoring a null dereference and continuing execution. This is an opt-in solution, where you have to raise an exception yourself if you need one.
For a programmer who is used to dealing with null pointer exceptions this might sound crazy, but the idea has proven itself in production. The Objective-C runtime, and thus most Mac OS and iOS applications, work this way. Several things are great about this strategy:
- A program can recover from an unexpected error if higher-level code handles the null condition. Otherwise an operation finishes as if nothing has happened, because later calls to null are ignored too.
- Disgusting null pointer exception errors are not thrown at the user.
- The app stays alive and might continue doing other useful stuff.
- Null-checks can be avoided, producing code that is more compact and readable while still correct.
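The strategy in the list above can be imitated in other languages with a “null object” that absorbs every call. The following Python sketch is only an approximation of the idea, not how the Objective-C runtime is implemented; the names Nil, NIL and normalize are hypothetical.

```python
class Nil:
    """Stand-in for a nil that silently ignores any method call."""

    def __getattr__(self, name):
        # Called only for attributes that do not exist on the object:
        # return a function that accepts anything and yields this Nil again.
        return lambda *args, **kwargs: self

    def __bool__(self):
        # Lets higher-level code test the result with a plain "if".
        return False

    def __repr__(self):
        return "NIL"


NIL = Nil()


def normalize(name):
    # No null-check here: the chain either runs on a real string
    # or is ignored step by step when name is NIL.
    return name.strip().lower()


if __name__ == "__main__":
    print(normalize("  Bob "))
    result = normalize(NIL)
    if not result:
        print("no name given, carrying on")
```

Higher-level code decides whether the empty result matters, so the check moves from every call site to the one place that cares about it.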
More examples
Objective-C
Objective-C avoids null pointer exceptions by simply ignoring a call made on a null value. For example:
NSString *name = nil;
[[name lowercaseString] characterAtIndex:0];
This chain of calls first calls the method lowercaseString and then calls the method characterAtIndex on the return value. Since “name” is nil, the first call is ignored and produces nil. Then the second call is ignored as well.
Objective-C is not an angel. Firstly, it is built on the C language foundation, which abounds in null exceptions. Secondly, collections like NSArray and NSDictionary can’t contain null values. A null value that slips in by accident produces an “Attempt to insert nil object” exception, which is pretty close to a null pointer exception:
NSString *name = nil;
NSArray *names = @[ name ];
// crash
Go
Go favours value types (such as structs) over pointer types. When value types are used a null pointer exception is not possible, for example:
var moment time.Time
moment.Day()
The moment variable is never explicitly set, but it automatically gets a default (zero) value, so it is possible to call a method on it. Unfortunately nothing checks whether this was intentional or not.
Go is still vulnerable to null pointer exception errors when pointers are used. Unfortunately even simple API functions are unsafe:
var timer *time.Timer = nil
timer.Reset(10) // crash
var timer2 time.Timer
timer2.Reset(15) // crash
The first example is a true null pointer exception. The second example is able to call into the Reset function, but crashes on a check inside it, which reports that the timer2 variable is not initialized.
Swift
Just like Go, Swift favours value types. It is built on top of the Objective-C foundation, and the majority of its APIs return nullable option types. Swift offers a safe, idiomatic way to cast them with an if let statement that forces an explicit check. In addition, the “?.” operator provides a shortcut for when you want to ignore the null case in the spirit of Objective-C:
let name : String? = nil
name?.lowercaseString.endIndex
// ignored
On the downside, Swift has an exclamation mark operator “!” and a similar “!.” operator, which ruin the deal:
let name : String? = nil
name!.lowercaseString.endIndex
// crash
This operator produces the very null pointer exception that Swift tries so hard to avoid.
Conclusion
All popular, widely used modern programming languages are vulnerable to null pointer exception errors: C++, Java, JavaScript, C#, Python, Ruby, PHP, Perl, Go, Objective-C and Swift. Some languages try harder than others to avoid them, in particular Go, Objective-C and Swift. This is “null pointer exception hell”, a reality that programmers will have to deal with for years to come.
Bonus: exercise
Could you break SQL?
- An error should not be detected until execution time and should be related to NULL or undefined values
- Try to do it without using stored procedures
- Any popular SQL implementation is fine.
Cover image by mikalsl under CC BY.