(This RFC was co-authored with @Sirraide @erichkeane.)
Clang supports __builtin_assume to inform the LLVM optimizer about assumptions it is allowed to make based on an expression passed to the builtin (https://clang.llvm.org/docs/LanguageExtensions.html#builtin-assume). C++23 added similar functionality in the form of the [[assume]] attribute (https://wg21.link/P1774R8). This RFC is about the optimization behavior we’d like to see in both the builtin and the attribute.
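For readers who haven't used either form, here is a minimal sketch of what the two spellings look like in user code. The function names are made up for illustration, and the preprocessor guards are only there so the snippet compiles on compilers that lack one of the two forms.

```cpp
// Both functions compute x / 8 and promise the optimizer that x is never
// negative. If a caller passes a negative x anyway, behavior is undefined.
// div8_builtin and div8_attribute are hypothetical names for this sketch.

int div8_builtin(int x) {
#if defined(__has_builtin)
#  if __has_builtin(__builtin_assume)
  __builtin_assume(x >= 0);  // Clang builtin
#  endif
#endif
  return x / 8;
}

int div8_attribute(int x) {
#if defined(__has_cpp_attribute)
#  if __has_cpp_attribute(assume)
  [[assume(x >= 0)]];        // C++23 standard attribute
#  endif
#endif
  return x / 8;
}
```

Whether the assumption actually changes the generated code depends on the optimizer, which is the subject of the rest of this RFC.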
There was a recent discussion (“@llvm.assume blocks optimization”) that observed that LLVM’s assumption capabilities (through llvm.assume; see the LLVM Language Reference Manual) can have a surprising negative impact on program performance that is likely to trip up programmers of all experience levels. As one example, the libc++ developers added use of __builtin_assume for conditions they assert in the STL and ended up removing that functionality (D153968, “[libc++] Stop using __builtin_assume in _LIBCPP_ASSERT”) because of surprising performance regressions. The consensus position on that discussion seemed to be that LLVM should be more aggressive about dropping assumption information (at least in the short term) with a longer-term plan in place (“[RFC] How to manifest information in LLVM-IR, or, revisiting llvm.assume”), but to our knowledge, nobody is actively working on either approach. There were concerns raised in that thread about what level of support we should have for [[assume]] in the presence of these unresolved issues.
The C++ standard gives implementations wide latitude in how to implement the optimization-related effects of this standard attribute, so we have options available to us in terms of how to proceed:
- Option #1: Have __has_cpp_attribute(assume) return 0 and not lower any assumption information into LLVM IR.
- Option #2: Have __has_cpp_attribute(assume) return 202207L and not lower any assumption information into LLVM IR.
- Option #3: Have __has_cpp_attribute(assume) return 202207L and lower assumption information into LLVM IR via llvm.assume.
Of the three options we see, we think Option #2 should be disregarded out of hand, as the effect is us telling users “we support this attribute” and then not doing anything useful with it in terms of optimization behavior. The standard expects 0 to be returned from __has_cpp_attribute for this attribute in that case ([dcl.attr]).
Option #1 is a conservative option; it doesn’t expose users to a footgun while still giving us a path forward to provide support for assumption attributes in the future once LLVM has improved support for llvm.assume (or provided different IR for us to lower to). The benefit to this option is that it gives us time and doesn’t give users a poor first impression of the feature that might dissuade them from trying it again in the future. We have time to improve the QoI of our offerings while still being fully conforming, and we have time to see how the attribute fares in other implementations (or whether it’s implemented in them at all; the feature was contentious with implementers within WG21). GCC implements the feature already while MSVC currently does not, which means we may need this option in -fms-compatibility mode regardless of what we decide on. The downside to this option is that it doesn’t meet user expectations when it comes to portability of the feature and it may give the impression that Clang is not fully implementing C++23 (despite being a conforming implementation choice). It may be worth thinking about whether “portability” is practically meaningful in this case given that different optimizers will likely react differently to the same assumptions, so the syntax is portable but the effects on optimizations may not be. Note, this option does not preclude us from checking assumption attributes in a constant expression context should we find that valuable.
Option #3 is a more direct implementation in that it exposes functionality we already support. The benefit to this option is that it provides users with a portable syntax for specifying assumptions across various C++ implementations. It’s also straightforward for us to implement in Clang (as easy as Option #1 would be to implement). The downside is that the functionality we are exposing is not necessarily going to be in line with user expectations (which tend to set a higher bar for standards features than for implementation extensions), and then we’ll get bug reports that aren’t actionable for Clang developers (it requires LLVM to change rather than Clang and there may be unavoidable tradeoffs due to the design of LLVM that might not be practical for them to change). We could help mitigate some of these downsides by issuing a cautionary diagnostic when using the attribute, along the lines of “assumption attributes may worsen performance rather than improve it; test the effects of this assumption under optimized builds”. It’s a bit obnoxious, but users can disable the diagnostic easily enough.
Additionally, we should decide whether we want __builtin_assume and [[assume]] to be synonymous or whether we want them to be different. For example, we could decide to not allow [[assume]] to have optimization impacts but continue to do so for __builtin_assume. Due to common usage patterns of putting attributes behind a macro for people who want highly portable code, this could perhaps be a sufficiently portable stopgap approach for users if we picked Option #1. However, due to that same usage pattern, if [[assume]] has no effect on optimization behavior, we may drive users writing highly portable code to use the builtin and they’ll hit the same footguns we’re trying to help them avoid. We propose to leave the optimization behavior of __builtin_assume as it is today.
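The macro pattern mentioned above typically looks something like the following sketch. MY_ASSUME and sum_first are hypothetical names, and real projects will differ in the fallbacks they choose (for example, some also map to MSVC's __assume).

```cpp
// Sketch of a portability macro; MY_ASSUME is a made-up name.
// Prefer the standard attribute when the compiler claims support for it.
#if defined(__has_cpp_attribute)
#  if __has_cpp_attribute(assume) >= 202207L
#    define MY_ASSUME(expr) [[assume(expr)]]
#  endif
#endif

// Otherwise fall back to the Clang builtin when it is available.
#if !defined(MY_ASSUME) && defined(__has_builtin)
#  if __has_builtin(__builtin_assume)
#    define MY_ASSUME(expr) __builtin_assume(expr)
#  endif
#endif

// Last resort: expand to a no-op so the code still compiles.
#if !defined(MY_ASSUME)
#  define MY_ASSUME(expr) ((void)0)
#endif

// Example use: the caller promises n is non-negative.
int sum_first(const int* data, int n) {
  MY_ASSUME(n >= 0);
  int total = 0;
  for (int i = 0; i < n; ++i) total += data[i];
  return total;
}
```

With a macro shaped like this, Option #1 would make Clang skip the first branch and land on __builtin_assume, which is exactly the situation where users could run into the existing footguns.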
To help get a sense of community sentiment, do you prefer:
- Option #1 & leave __builtin_assume alone
- Option #1 & make __builtin_assume a noop
- Option #3 & leave __builtin_assume alone
- Other (see comments)