Abstract
absl::StatusOr<T> is a widely-used type that optionally contains a value (of type T). If it does not contain a value, it contains an absl::Status. It is similar to C++23 std::expected, but with a hardcoded unexpected type of absl::Status and a different API. It is also (to a lesser degree) similar to std::optional in that it optionally contains a value.
It is invalid to deference an absl::StatusOr that does not contain a value. The value of an absl::StatusOr<T> object called sor can be accessed by
*sor or sor.value()to get a T&sor->foo()to call T::foo.
There are other APIs related to absl::StatusOr, e.g. CHECK_OK to crash the program if there is no value. In this RFC, we propose a check for code that accesses absl::StatusOr values without ensuring they contain a value. The check also aims to support related APIs like CHECK_OK.
Example code that will be flagged by this check
bool foo(absl::StatusOr<int> bar) {
return *bar < 5; // NOT OK
}
Example code that will not be flagged this check:
bool foo(absl::StatusOr<int> bar) {
if (!bar.ok()) return false;
return *bar < 5; // OK
}
Proposal
Use the dataflow framework to add a new ClangTidy check, abseil-unchecked-statusor-use, that detects expressions in which the value is accessed, but where we cannot prove that the access is safe from the control flow. We include calls to absl::StatusOr::value which are defined to throw an exception, because we empirically have found that it is used interchangeably with the deference operators (which lead to undefined behavior if called when no value is present), not because the developer intended to crash the program.
We (Google) have deployed a downstream check to various projects, with good success. I would like to upstream this check for external users of Abseil to also benefit from it. It is similar to the existing bugprone-unchecked-optional-access check, and shares some of the code (mostly around return value caching). The API surface of absl::StatusOr is sufficiently different to prevent a common parametrized check.
By using the dataflow framework, we are able to reason about control flow
bool safe = false;
if (sor.ok() && SomeOtherCondition()) {
safe = true;
}
// ... more code...
if (safe) {
use(*sor); // OK
}
The check is strictly intraprocedural. This is both preferable from an API design perspective, and also makes implementation of the check much easier. For interprocedural reasoning, the type system should be used to communicate whether an absl::StatusOr contains a value or not. If a absl::StatusOr<T> always contains a value, a T& should be used instead.
That being said, the check models the return value of zero-argument const methods (as a proxy for getter methods) as being stable if no non-const method is called on the same object between them. This is to support common patterns like
class Foo {
public:
absl::StatusOr<int> get_bar() const;
void clear_bar();
};
Foo foo = get_foo();
if (foo.get_bar().ok())
return *foo.get_bar(); // OK
if (foo.get_bar().ok()) {
foo.clear_bar();
return *foo.get_bar(); // NOT OK
}
While nothing requires this assumption to be true, we have found this a good tradeoff to make the check easier to deploy. We have not encountered code that violates this assumption.
Limitations
There are two major limitations in the current check, which I expect to be present in the initial upstreaming. The same limitations exist in the bugprone-unchecked-optional-access check, which is already part of clang-tidy. These limitations have not been a significant hurdle to our adoption of this check.
We do not model accesses to containers, so we incorrectly flag the following code
if (sors[0].ok()) {
use(*sors[0]); // false positive
}
We do not accurately model captured variables, so we incorrectly flag the following code:
if (sor.ok()) {
[&sor]() {
use(*sor); // false positive
}
}
The container code could be rewritten as follows, which also has the advantage of getting rid of the repeated container access, which is harder to optimize
if (auto& sor = sors[0]; sor.ok()) {
use(*sor); // OK
}
The lambda capture code as follows, which makes the types inside of the lambda clearer
if (sor.ok()) {
[&s = *sor]() {
use(s); // OK
}
}
Future Work
Follow up work from this would include
- Improve modelling of containers
- Improve modelling of lambdas
- Implement a similar check for std::expected