This RFC is based on a suggestion by Martin Brænne (@martinboehme)
TL;DR
I propose a new Clang annotation [[clang::null_after_move]] to mark user-defined smart-pointer-like types that make guarantees on the state of a moved-from object, leaving it in a valid and specified state where it is safe to use but not dereference. The primary motivation is to avoid false positives in the clang-tidy check bugprone-use-after-move.
Motivation
It is desirable to detect use-after-move bugs through static analysis, and clang-tidy has a check for this: bugprone-use-after-move. However, there are cases in which an object may legitimately be used after it has been moved from, and the automatic check cannot readily distinguish them from genuine bugs.
Notably, user-defined smart-pointer-like types typically make the same guarantees about the state of a moved-from object as standard smart pointer types (e.g. std::unique_ptr): Moving from such a smart pointer leaves it in a valid and specified state that matches the default constructed state, nullptr. In that state it is safe to perform many operations on the pointer, such as comparing it against nullptr or other pointers; the only operations that are not allowed are the ones that dereference the pointer, e.g. operator*, operator-> or operator[].
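To make the guarantee concrete, here is a minimal sketch using std::unique_ptr. The function names are only illustrative and are not part of the proposal. It exercises the operations described above as safe (comparison, boolean test, get(), reassignment) and dereferences only pointers known to be non-null.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Returns true if a moved-from std::unique_ptr is observably null.
bool MovedFromUniquePtrIsNull() {
  auto source = std::make_unique<int>(42);
  auto destination = std::move(source);
  // Safe: comparison, boolean test and get() do not dereference the pointer.
  return source == nullptr && !source && source.get() == nullptr &&
         destination != nullptr && *destination == 42;
}

// Re-seats a moved-from pointer and returns the sum of both pointees.
int SumAfterReseat() {
  auto source = std::make_unique<int>(1);
  auto destination = std::move(source);
  // Safe: assignment gives the moved-from pointer a new value.
  source = std::make_unique<int>(7);
  // Safe now: both pointers are non-null again.
  return *source + *destination;
}
```

Dereferencing source between the move and the reassignment (for example with operator* or operator->) would be undefined behavior, which is the case the check should keep flagging.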
The bugprone-use-after-move check already recognizes the semantics of smart pointers defined in the standard and allows using moved-from smart pointers as long as they are not dereferenced. The set of types that this applies to is currently hardcoded. This RFC proposes an attribute to annotate user-defined smart-pointer types so that they receive the same treatment.
Example
As a real-world example, consider Chromium’s HeapArray, a replacement for std::unique_ptr<T[]> that keeps track of its size. Its moved-from semantics are specified: a move leaves the moved-from object empty and containing no allocated memory. Despite this, the following example:
void TestFunction() {
  auto buffer = base::HeapArray<int>::WithSize(20);
  auto moved_to = std::move(buffer);
  std::cout << "moved from buffer size: " << buffer.size() << "\n";  // moved from buffer size: 0
}
will generate a false-positive use-after-move warning:
test.cc:4:46: warning: 'buffer' used after it was moved [bugprone-use-after-move]
    4 |   std::cout << "moved from buffer size: " << buffer.size() << "\n";
      |                                              ^
test.cc:3:19: note: move occurred here
    3 |   auto moved_to = std::move(buffer);
      |                   ^
Proposal
This RFC proposes a new attribute, [[clang::null_after_move]], for type declarations of smart-pointer-like classes with well-defined moved-from semantics compatible with standard smart pointers.
Because this attribute is designed to be flexible, it requires only that the logical state of the object is equivalent to nullptr after move. This allows the attribute to support types with multiple internal fields or encapsulated states that do not explicitly expose a nullptr interface, such as Chromium’s HeapArray referenced in the “Motivation” section.
Placing the attribute on a class signifies that a moved-from object is left in a valid and specified state that is safe to use as long as it is not dereferenced. As with standard smart pointers, only operator*, operator-> and operator[] are considered unsafe accesses, as they would dereference a nullptr.
Once the attribute is available, it can be used by the bugprone-use-after-move check to avoid false positives on benign uses of smart-pointer-like types.
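As a rough illustration of how a type might opt in, the sketch below annotates a simplified buffer class. IntBuffer, the NULL_AFTER_MOVE macro and SizeAfterMove are hypothetical names made up for this example. The macro expands to the proposed attribute only when the compiler reports support for it and to nothing otherwise, so the snippet also compiles on toolchains that do not know the attribute.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>

// Hypothetical guard macro: use the proposed attribute only if the compiler
// reports support for it; otherwise expand to nothing.
#if defined(__has_cpp_attribute)
#if __has_cpp_attribute(clang::null_after_move)
#define NULL_AFTER_MOVE [[clang::null_after_move]]
#endif
#endif
#ifndef NULL_AFTER_MOVE
#define NULL_AFTER_MOVE
#endif

// IntBuffer is a hypothetical, simplified stand-in for a type like HeapArray.
class NULL_AFTER_MOVE IntBuffer {
 public:
  IntBuffer() = default;
  explicit IntBuffer(std::size_t size)
      : data_(std::make_unique<int[]>(size)), size_(size) {}

  // Moving resets the source to the empty (default-constructed) state.
  IntBuffer(IntBuffer&& other) noexcept
      : data_(std::move(other.data_)), size_(std::exchange(other.size_, 0)) {}
  IntBuffer& operator=(IntBuffer&& other) noexcept {
    data_ = std::move(other.data_);
    size_ = std::exchange(other.size_, 0);
    return *this;
  }

  std::size_t size() const { return size_; }
  bool empty() const { return size_ == 0; }
  // A dereferencing operation: unsafe on a moved-from object.
  int& operator[](std::size_t i) { return data_[i]; }

 private:
  std::unique_ptr<int[]> data_;
  std::size_t size_ = 0;
};

// Returns the size observed on the moved-from buffer.
std::size_t SizeAfterMove() {
  IntBuffer buffer(20);
  IntBuffer moved_to = std::move(buffer);
  assert(moved_to.size() == 20);
  // Not a dereference: under the proposal this call would not be flagged,
  // whereas buffer[0] here still would be.
  return buffer.size();
}
```

With the annotation in place, the intent is that the check would treat buffer.size() and buffer.empty() like the corresponding calls on a standard smart pointer, while still reporting buffer[0] after the move.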
Alternatives considered
Provide a new tidy option
Provide a new tidy option (e.g. NullAfterMoveTypes) for bugprone-use-after-move, where the user can configure classes with specified move-from semantics that are compatible with standard smart pointers.
This option has been discarded because it doesn’t scale well. For big codebases, it’s not a great user experience to have to add the type to a central location that may be far away from the code that actually implements the type. It seems a better solution to add the annotation directly on the type and have it be visible there.
Teach the static analyzer the smart pointer semantics
The check bugprone-use-after-move could deduce through static analysis that the object is default-constructed after the move. This analysis may be error-prone (leading to false negatives) and would likely be too costly at runtime for a clang-tidy check.
Moreover, even if static analysis can deduce that the object is default-constructed after the move, it cannot know whether this is an intentional part of the API contract that users of the class can rely on, or an implementation detail subject to change. In the latter case use-after-move should still be flagged, even if the object happens to be left in a default-constructed state at the time of the check.
Use a Function attribute
The current proposal is to annotate type declarations, although there has been discussion about whether annotating individual operations could be a better option.
Annotating operations raises the question of which ones to annotate: the operations that are safe, or the ones that aren’t? Both options come with some difficulty:
- Annotating the operations that are safe means annotating quite a lot of operations, because the majority of operations on a moved-from smart pointer are expected to be safe.
- Annotating the operations that aren’t safe requires fewer annotations, but it’s a bit strange in that the default is for an operation to be unsafe after a move.
At this point, the most straightforward approach still seems to be to annotate the class.
Use two attributes
Use a combination of 2 attributes to overcome the difficulties of using just a function attribute:
- A class attribute (e.g. [[clang::specified-after-move]]) to mark user-defined types as left in a specified state after move, and therefore allowed to be used.
- A function attribute (e.g. [[clang::unsafe-after-move]]) to overrule the class attribute and mark methods as unsafe to use after move. Note that the attribute would have no effect in types without the first attribute.
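A sketch of how this combination might read in user code follows. Neither attribute exists, so the SPECIFIED_AFTER_MOVE and UNSAFE_AFTER_MOVE macros below are hypothetical placeholders that expand to nothing, and Handle is an invented type. The point is only to show where each annotation would sit.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical placeholders for the two attributes discussed above.
// Neither attribute exists, so both macros expand to nothing.
#define SPECIFIED_AFTER_MOVE
#define UNSAFE_AFTER_MOVE

// Hypothetical type whose moved-from state is specified (it holds no value).
class SPECIFIED_AFTER_MOVE Handle {
 public:
  explicit Handle(int value) : value_(std::make_unique<int>(value)) {}
  Handle(Handle&&) noexcept = default;
  Handle& operator=(Handle&&) noexcept = default;

  // Unmarked: would be allowed after a move because of the class attribute.
  bool has_value() const { return value_ != nullptr; }

  // Marked: would still be reported when called on a moved-from object.
  UNSAFE_AFTER_MOVE int& value() { return *value_; }

 private:
  std::unique_ptr<int> value_;
};

// Returns whether the moved-from handle still reports a value.
bool HasValueAfterMove() {
  Handle source(5);
  Handle destination = std::move(source);
  assert(destination.value() == 5);
  return source.has_value();  // Benign use of the moved-from object.
}
```

Here the unsafe access goes through a named method rather than one of the well-known operators, which is the kind of case a class-only attribute would not cover.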
The two-attribute solution is clearer on the semantics and more flexible, allowing annotation of types beyond smart-pointer-like types. But it comes with the disadvantages of adding 2 new attributes to Clang, whose interaction must be understood, and of requiring more annotations in user code.
The disadvantages outweigh the flexibility of the solution for the problem this RFC is trying to solve, which is specific to smart-pointer-like user-defined types, where unsafe dereferences are expected to happen through well-known operators in most cases.
Name of the attribute
There has been an open discussion regarding the name of the attribute, with several options in the mix:
- [[clang::default_constructed_after_move]]: The originally proposed name, intended for smart-pointer-like types that generally leave the object default-constructed after a move. Discarded because it fails to clearly communicate the specified state of the pointer type (nullptr).
- [[clang::usable_after_move]]: Considered too vague, as any moved-from object is arguably usable, in that it is in a valid but unspecified state.
- [[clang::specified_after_move]]: An alternative that conveys more clearly that the moved-from state is not just guaranteed to be valid but is precisely defined. However, it fails to communicate what that specified state is, leaving it unclear whether the tidy check can warn on an attempt to dereference the smart pointer.