I previously asked this question on StackOverflow, but I thought it might be better to ask here.
From what I have observed, LLVM does not appear to be performing optimizations based on range analysis. Or, if there is a pass for this, it is not being applied when running `rustc -C opt-level=3` or `clang -O3`. On the other hand, GCC has reliably performed this optimization in all of the tests I have done using Godbolt.
For example, take the following function (Godbolt). I would expect LLVM to optimize this function to a single `ret` instruction.
```c
#include <stdint.h>
#include <stdio.h>

void foo(uint64_t a, uint64_t b, uint64_t c) {
    // Some pre-condition that puts limits on a and b
    if (a > 2048 || b > 1024) {
        // Using __builtin_unreachable() should have the same effect
        return;
    }
    // Due to the previous condition (a <= 2048, b <= 1024), a + b cannot overflow
    if (a + b >= c) {
        return;
    }
    if (a >= c) {
        // Unreachable: if a >= c, then a + b >= c must also be true
        printf("foo");
    }
}
```
Going into this, I was not expecting it to be able to perform complex range analysis within loops. However, I expected this function to be fairly trivial for the compiler to optimize away; if someone had asked me beforehand, I would have been confident that LLVM could eliminate it entirely. It seems strange that it does not perform such a simple analysis within a single short function. Not having this seems particularly problematic for languages like Rust that rely on LLVM to optimize away unnecessary safety checks in release builds.
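To make the Rust concern concrete, here is a hedged C analogue of that pattern (the function name `sum_pair` and the explicit `abort()` are my own illustration, standing in for the panic on a failed slice bounds check). The final check is implied by the earlier ones, but only through relational reasoning rather than a syntactically identical comparison, so a compiler with the range analysis described above should fold the `abort()` branch away:

```c
#include <stdint.h>
#include <stdlib.h>

uint64_t sum_pair(const uint64_t *buf, uint64_t len, uint64_t i) {
    // Pre-conditions that bound i and len, so i + 1 cannot wrap around
    if (i > 1024 || len > 4096) {
        return 0;
    }
    // Past this check, i + 1 < len, which implies i < len
    if (i + 1 >= len) {
        return 0;
    }
    // Stand-in for the bounds check Rust emits for buf[i]: provably dead,
    // but eliminating it requires deriving i < len from i + 1 < len
    if (i >= len) {
        abort();
    }
    return buf[i] + buf[i + 1];
}
```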
With all of the work that has gone into LLVM, it seems highly unlikely that no one has attempted to implement this sort of optimization. Was this implemented but then disabled due to issues, or is there some other problem I am unaware of?
Edit: After some experimentation in Godbolt, none of the available x86-64 clang versions (x86-64 clang 3.0.0 through x86-64 clang 17.0.1) were able to perform this optimization. On the other hand, GCC was able to perform this optimization starting with x86-64 GCC 12.1.