Nullability analyzer doesn't seem to work (and how to fix it)

Hi,

I’m working on a much simpler replacement for Clang’s nullability qualifiers. This proposal is a bit tangential to that, hence the new topic.

I found the existing nullability checker to be so limited when applied to top-level functions that a casual observer might assume it doesn’t work at all:

void test1(int *_Nullable x)
{
  *x = 5; // no warning!
}

int test2(int *_Nullable x)
{
  return *x; // no warning!
}

int *_Nonnull test3(int *_Nullable x)
{
  return x; // no warning!
}

int *_Nonnull test4(int *_Nullable x)
{
  int *_Nonnull w = x; // no warning!
  return w;
}

I was able to improve the analysis by adding a NullabilityChecker::checkBeginFunction method which honours any _Nullable annotations on parameters in a top-level call.

However, I then faced a new problem: if a function with a _Nullable parameter has at least one caller, the checker behaves as though no other parameter value is possible. This is because the static analyzer doesn’t re-analyse functions as top-level (with the exception of C++ copy and move operators, and Objective-C methods). I was able to fix this by modifying shouldSkipFunction to return false.

After my changes, clang-tidy even emits different warnings for the same line of code where appropriate: “Nullable pointer is dereferenced” for a (simulated) top-level call, and “Dereference of null pointer” for a call to the same function with a known null value.

Are these changes likely to be acceptable to the maintainers, and can anyone shed light on the intended behaviour?


Hi, sorry for the delayed reply, I was distracted! That’s an interesting question with a somewhat frustrating answer.

Nullability qualifiers are very important in their current niche, but unfortunately they ended up very different from the naive dream of “let’s annotate which pointers can be null”. These annotations were primarily built for Objective-C, with the goal of getting adopted by a massive number of existing Apple APIs and frameworks, and the ultimate goal of Swift interoperability, where _Nullable-annotated pointers are represented as Optionals. Swift also has in-language support for checking optionals (if let, guard let and such), which enables the convention of always handling the absence of a value before every use. There was also relatively little legacy code written in Swift back then (this is an “I was born at a very young age” statement), so they could easily force all newly written code to properly check every optional.

In C languages there’s an enormous amount of legacy code that doesn’t follow the convention, so checking of _Nullable ended up being extremely limited as the warnings needed to be useful on existing code without being too loud to address.

Note that _Nonnull is very different from _Nullable, it offers very concrete and actionable information that the compiler can take advantage of. Even then, it doesn’t actually mean “the pointer can never be null”, more like “it’s not supposed to be null”. In particular, the code below is considered to be perfectly valid:

int * _Nonnull myPublicFunction(int *_Nonnull p) {
  if (p == nullptr) // this check is redundant, right?
    return nullptr; // this should never be allowed, right?
  ...
}

(Justified by: “Well, we really think they shouldn’t pass a null pointer into myPublicFunction(), but a lot of existing binary applications that link against our Objective-C library already pass null, and they won’t stop doing that just because we added an annotation, at least not until they recompile their binaries; so let’s keep this check around for that situation, and keep doing what they expected us to do so that we don’t break them; we really need to do this, and the compiler has no grounds to stop us.”)

The situation is much worse with _Nullable. The whole idea of _Nullable is “this pointer can definitely be null sometimes”, but it doesn’t specify what “sometimes” means. It could mean “unpredictably” (e.g., the standard function gets() may return null when it fails with an input/output error, and there’s no way the caller can predict that), or it could mean “well, it depends on the input” (for example,

URL * _Nullable makeURL(const char *address);

may fail when the address is not a valid URL, but the programmer can also be sure that makeURL("http://example.com") will never return null), or it could mean a million other possible underlying contracts. Examples like this are a fairly good indication that proper nullability annotations in C might be an unreachable utopia.
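To make the makeURL contract concrete, here is a toy stand-in (the name and behaviour are invented for illustration, not a real API). Under Clang the return type would be annotated `const char * _Nullable`; plain C is used here so the sketch compiles anywhere:

```c
#include <stddef.h>
#include <string.h>

/* Toy stand-in for the hypothetical makeURL() above. Returns its argument
 * when it "parses" as a URL, and NULL otherwise -- the "sometimes null"
 * part of the contract depends entirely on the input. */
const char *makeURL(const char *address) {
    if (strncmp(address, "http://", 7) == 0 ||
        strncmp(address, "https://", 8) == 0)
        return address;  /* "parsing" succeeded */
    return NULL;         /* not a valid URL: the nullable case */
}
```

The caller “knows” that makeURL("http://example.com") can never return null, but a checker that trusts only the _Nullable annotation would still flag any unguarded use of the result.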

This is usually somewhat less of a problem when we’re talking about path-insensitive analysis, such as the one performed in compiler warnings. And indeed, note that tests 3 and 4 receive a compiler warning -Wnullable-to-nonnull-conversion:

<source>:13:10: warning: implicit conversion from nullable pointer 'int * _Nullable' to non-nullable pointer type 'int * _Nonnull' [-Wnullable-to-nonnull-conversion]
  return x; // no warning!
         ^
<source>:18:21: warning: implicit conversion from nullable pointer 'int * _Nullable' to non-nullable pointer type 'int * _Nonnull' [-Wnullable-to-nonnull-conversion]
  int *_Nonnull w = x; // no warning!
                    ^

(It’s definitely a valid question why don’t tests 1 and 2 cause similar warnings, I don’t know the answer to that.)

However, in the static analyzer situation, we’re talking about the same question asked on a per-path basis. In this situation the information “this pointer can be null” is particularly useless because it doesn’t tell us anything about whether the pointer can be null on the current execution path. And in a lot of cases, that’s where the programmer has additional domain-specific knowledge that the pointer “actually” cannot be null. This turns such static analyzer warnings into outright false positives when a path-insensitive warning could have been acceptable as a “coding convention violation” warning.

So basically, back in 2015 the static analyzer people realized that _Nullable and path-sensitive analysis aren’t meant for each other. This is why only _Nonnull-related checks are enabled by default in the static analyzer, while _Nullable-related checks are in permanent alpha and will probably never be productized. You can try to make these toy examples work, but you’ll eventually run into path-sensitive checks being impractical for this problem, as they’ll lead to undesirable results every other time the code has branches. It’s definitely somewhat interesting to see why these experimental _Nullable-related checks fail to cover such simple examples, but it’s unlikely that fixing them will be your biggest problem.


So if I were to suggest a way forward for more direct and straightforward nullability annotation checking, I’d recommend going in a very different direction.

We already have Clang attributes that offer harder guarantees than _Nonnull, namely __attribute__((nonnull)) and __attribute__((returns_nonnull)). Passing a null pointer through them invokes undefined behavior, which the optimizer is allowed to take advantage of, so this is as “direct” as it gets. The static analyzer can also take advantage of them (and partially already does; I think returns_nonnull still needs to be implemented, though).

Maybe it makes sense to supplement them with a new attribute, __attribute__((needs_null_check)), meaning “this value should be visibly checked for null before every use”. Such an attribute wouldn’t have CodeGen implications; it’s still entirely about a coding convention. Then you could build a path-insensitive compiler warning that fires every time the null check isn’t obvious enough.

Such an attribute would still suffer from some of the problems described above, such as the makeURL problem. You’d also need to make sure that your warning accepts assert() as a valid null check (which may be tricky: assert() isn’t a first-class citizen, and won’t be until the Contracts proposal is accepted; instead it’s a very convoluted combination of preprocessor directives and nested control-flow constructs, which may be entirely preprocessed out depending on the user’s build configuration).

But if you find enough people who want to adopt such a coding convention, this could be a way forward. And I cannot stress this enough: anything you do should start with people. It doesn’t make sense to create a major compiler feature until you find enough people who will use it, and they’ll need to understand the makeURL problem and be fine with it.


Hi, thanks for your reply.

Nullability qualifiers are very important in their current niche, but unfortunately they ended up very different from the naive dream of “let’s annotate which pointers can be null”.

I don’t think it’s necessarily a naive dream – however, it does require programmers to accept the trade-off that they may need to write code more defensively when using a stronger type system. This doesn’t seem to be an issue for Python programmers using MyPy, for example, and Python previously had even less type safety than C.

In C languages there’s an enormous amount of legacy code that doesn’t follow the convention, so checking of _Nullable ended up being extremely limited as the warnings needed to be useful on existing code without being too loud to address.

Sorry, but I don’t quite understand/accept the point you are making here. Presumably existing code which hasn’t been updated to use _Nullable doesn’t have a problem, by definition. This must have been some kind of compromise driven by users unwilling to update their code.

Note that _Nonnull is very different from _Nullable, it offers very concrete and actionable information that the compiler can take advantage of. Even then, it doesn’t actually mean “the pointer can never be null”, more like “it’s not supposed to be null”.

Thank goodness! :slight_smile: I’ve no interest in making software more broken.

And indeed, note that tests 3 and 4 receive a compiler warning -Wnullable-to-nonnull-conversion

Ah! That’s a bit better. I didn’t know that warning needed to be explicitly enabled. This is a sure recipe for newcomers to assume that the whole mechanism is completely broken, as opposed to only partly broken.

(It’s definitely a valid question why don’t tests 1 and 2 cause similar warnings, I don’t know the answer to that.)

I fixed it with these two patches: ⚙ D142743 Fix nullability checking of top-level functions and ⚙ D142744 Re-analyze functions as top-level

However, in the static analyzer situation, we’re talking about the same question asked on a per-path basis. In this situation the information “this pointer can be null” is particularly useless because it doesn’t tell us anything about whether the pointer can be null on the current execution path.

I think this highlights a terrible flaw in the way that Clang’s static analyzer works. I don’t want to know whether a pointer can be null for a given execution path within a particular program, I want to know whether it can be null given the abstract interface(s) specified by the programmer for the function(s) being analyzed.

Treating all functions which have at least one caller as though those caller(s) are the only caller(s) which could possibly exist seems fundamentally broken.

And in a lot of cases, that’s where the programmer has additional domain-specific knowledge that the pointer “actually” cannot be null. This turns such static analyzer warnings into outright false positives when a path-insensitive warning could have been acceptable as a “coding convention violation” warning.

I’m aware of that issue, but like I said, it doesn’t seem to bother Python programmers. I want the analyzer to do what it can with the information available, not make dangerous assumptions that code is safe just because it doesn’t have the domain-specific knowledge of a human.

(Sorry for my late reply, by the way.)