TL;DR: proposal to allow _Nonnull attributes on C++ smart pointers: unique_ptr, shared_ptr, and anything marked [[gsl::Pointer]]. Draft patch
@DougGregor as author of _Nonnull and friends. @erichkeane as code-owner for attributes.
Background: nullability attributes
It’s 2024, and null pointer crashes are still everywhere, often due to confusion about whether a pointer is allowed to be null. Clang supports the nullability attributes _Nullable and _Nonnull [1] to explicitly specify whether null is considered a valid value:
void Print(Document* _Nonnull doc, Font* _Nullable custom_style);
These serve several purposes:
- they document the intended contract of an API in a regular way
- clang statically detects simple violations (-Wnonnull, -Wnullable-to-nonnull-conversion)
- UBSan dynamically detects violations (-fsanitize=nullability)
- clang-based tools consume the annotations to better understand the API (e.g. Swift/Objective-C interop, crubit nullability checker)
Nullability and smart pointers
Today these attributes are only allowed on built-in “raw” pointer types. In C++, many of the pointer types in APIs are instead instances of smart-pointer classes.
std::shared_ptr<Scheduler> _Nonnull getDefaultScheduler();
// error: nullability specifier `_Nonnull` cannot be applied to non-pointer type 'std::shared_ptr<Scheduler>'
We’d like to allow these attributes there too, since all the benefits of annotating raw pointers apply equally to smart pointers. The basic design of _Nonnull as a type attribute works just as well for smart pointers.
Identifying smart pointer types
Not every type is nullable: int _Nullable x; is nonsense that we should continue to diagnose. So we need to identify the pointer-like types we do support.
The most important smart pointers are unique_ptr and shared_ptr from the C++ standard library. We would hard-code these names. I’m not sure about std::function and would conservatively leave it out.
User-defined smart pointers are common too. Clang already knows about the [[gsl::Pointer]] attribute for pointer-like objects, and I suggest we accept all classes marked with that as pointers.
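The classification rule above can be mimicked in library code as a type trait. This is only an analogy, since the real decision would be made inside Clang. AcceptsNullability, IsUserSmartPointer, Handle, and the EX_GSL_POINTER macro are hypothetical names, and because library code cannot query [[gsl::Pointer]], the opt-in trait stands in for the attribute.

```cpp
#include <memory>
#include <type_traits>

// Apply [[gsl::Pointer]] only where the compiler recognizes it.
#if defined(__has_cpp_attribute)
#if __has_cpp_attribute(gsl::Pointer)
#define EX_GSL_POINTER [[gsl::Pointer]]
#endif
#endif
#ifndef EX_GSL_POINTER
#define EX_GSL_POINTER
#endif

// Hypothetical opt-in hook standing in for the attribute check.
template <typename T>
struct IsUserSmartPointer : std::false_type {};

// Raw pointers and opted-in user types accept nullability by default.
template <typename T>
struct AcceptsNullability
    : std::integral_constant<bool, std::is_pointer<T>::value ||
                                       IsUserSmartPointer<T>::value> {};

// The two standard smart pointers are "hard-coded" via specializations.
template <typename T, typename D>
struct AcceptsNullability<std::unique_ptr<T, D>> : std::true_type {};

template <typename T>
struct AcceptsNullability<std::shared_ptr<T>> : std::true_type {};

// A hypothetical user-defined pointer-like class, marked as suggested.
template <typename T>
class EX_GSL_POINTER Handle {
 public:
  explicit Handle(T* target = nullptr) : target_(target) {}
  explicit operator bool() const { return target_ != nullptr; }
  T& operator*() const { return *target_; }

 private:
  T* target_;
};

template <typename T>
struct IsUserSmartPointer<Handle<T>> : std::true_type {};
```

With these definitions the trait is true for int*, std::unique_ptr<int>, std::shared_ptr<int>, and Handle<int>, and false for int, which matches the int _Nullable x; case that should keep being diagnosed.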
Details: UBSan
To have -fsanitize=nullability check smart pointer arguments and return values, we need to teach it how to check the nullness of a smart pointer value.
The most obvious thing is to find operator bool() on the pointer class and synthesize a call to it. If such an operator doesn’t exist, we wouldn’t do any checking.
I’d leave UBSan support out of the initial scope though, to keep things small and because it needs slightly different expertise to write/review.
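The intended behaviour of that check can be approximated in ordinary C++: test the value through its conversion to bool when one exists, and skip the check otherwise. This is a library-level sketch of the semantics, not how UBSan instrumentation would actually be emitted. CheckNonnull, NullCheck, and OpaqueHandle are hypothetical names.

```cpp
#include <cassert>
#include <memory>
#include <type_traits>

enum class NullCheck { kNonNull, kNull, kUnchecked };

// The type converts to bool (including via explicit operator bool):
// evaluate the conversion, as a synthesized operator bool() call would.
template <typename P>
NullCheck CheckImpl(const P& p, std::true_type) {
  return static_cast<bool>(p) ? NullCheck::kNonNull : NullCheck::kNull;
}

// No usable conversion to bool: do not check anything.
template <typename P>
NullCheck CheckImpl(const P&, std::false_type) {
  return NullCheck::kUnchecked;
}

// Dispatches on whether bool can be direct-initialized from the pointer.
template <typename P>
NullCheck CheckNonnull(const P& p) {
  return CheckImpl(p, std::is_constructible<bool, const P&>{});
}

// Hypothetical pointer-like class without operator bool().
struct OpaqueHandle { int id; };
```

A default-constructed std::unique_ptr<int> reports kNull, a std::shared_ptr from std::make_shared reports kNonNull, and OpaqueHandle reports kUnchecked because there is nothing to call.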
Details: pragma and completeness
Ideally, all pointers are annotated, most pointers are non-null, and we should limit the noise added by annotations. Clang has features to support this:
- The #pragma clang assume_nonnull begin / end directives mark pointers as non-null by default within a region of code.
- When only a subset of pointers in a header are annotated, -Wnullability-completeness will warn.
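For reference, this is roughly how the pragma reads on raw pointers today. The function names and bodies are hypothetical. The __clang__ guards and the _Nullable shim are assumptions added only so the snippet also builds on other compilers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Example-only shim: _Nullable is a Clang extension.
#if !defined(__clang__)
#define _Nullable
#endif

#if defined(__clang__)
#pragma clang assume_nonnull begin
#endif

// Inside the region, unannotated pointers are treated as _Nonnull.
std::size_t NameLength(const char* name) { return std::strlen(name); }

// Pointers that may be null still need an explicit annotation.
const char* DisplayName(const char* _Nullable nickname,
                        const char* full_name) {
  return nickname ? nickname : full_name;
}

#if defined(__clang__)
#pragma clang assume_nonnull end
#endif
```

Only the nickname parameter carries an annotation here; every other pointer in the region picks up the non-null default.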
However I don’t think we should apply these features to smart pointers, at least for now:
First: because it would change the meaning of existing programs.
Second: it’s unclear how useful the pragma is for C++ code. It attaches _Nonnull to parts of declarations in a helpful but somewhat ad hoc way, and ignores typedefs. This makes sense in C/Obj-C, but in C++ regularity and composition of types become more important.
We’d like to leave this out for now, continue to experiment with pragmas out-of-tree, and revisit later.
The draft patch allows the attribute on the chosen smart pointer types, and extends both the on-by-default nullptr => non-null warnings and the off-by-default nullable => non-null warnings to work with smart pointers.
Background on our work
We’ve been working on a static nullability checker and inference tool based on the clang dataflow framework. This has happened outside of llvm-project in part so we can first get deployment experience to confirm the approach works.
We’re rolling this out to our internal C++ codebase, where we’ve chosen to spell these types absl::Nonnull<int*> etc. Because of the lack of smart pointer support, these aliases currently expand to approximately[2] int * [[clang::annotate("nonnull")]] rather than int * _Nonnull.[3]
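A transparent alias of this kind can be sketched as follows. This is an illustration of the general shape, not the actual Abseil definition. Nonnull, Load, and EX_ANNOTATE_NONNULL are hypothetical names, and the attribute is applied only where the compiler reports support for it.

```cpp
#include <cassert>
#include <type_traits>

// Use [[clang::annotate]] only where it is recognized.
#if defined(__has_cpp_attribute)
#if __has_cpp_attribute(clang::annotate)
#define EX_ANNOTATE_NONNULL [[clang::annotate("nonnull")]]
#endif
#endif
#ifndef EX_ANNOTATE_NONNULL
#define EX_ANNOTATE_NONNULL
#endif

// Transparent alias: Nonnull<T> is exactly T. The annotation sits on the
// alias declaration, where Clang-based tools can find it.
template <typename T>
using Nonnull EX_ANNOTATE_NONNULL = T;

// Because the alias is transparent, T is still deduced from the argument.
template <typename T>
T Load(Nonnull<const T*> p) {
  return *p;
}
```

Since Nonnull<int*> is the same type as int*, a call such as Load(&x) deduces T without help. Routing raw pointers through a specialized class template instead would put T in a non-deduced context, which is the constraint footnote [3] describes.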
If these attributes are allowed on smart pointers, we can switch to them. This means we can benefit from nullability support in Sema and UBSan, and help extend UBSan to sanitize smart pointers. It also removes barriers to upstreaming our checker.
(“We” is @bazuzi @gribozavr @martinboehme @ymand and myself).
[1] There’s also _Null_unspecified and _Nullable_result, which are similar but less interesting, and [[gnu::nonnull]] which I’m not proposing to touch.
[2] In fact annotate goes on the type alias rather than the type itself; this just saves a few AST nodes.
[3] We can’t include _Nonnull for raw pointers only, because specializing absl::Nonnull<T> for different T inhibits template argument deduction.