There are cases where LLVM is able to detect some UB but clang is not.
For example,
unsigned int foo(unsigned int x) {
  int ret = 0;
  for (int i = 0; i <= 32; ++i)
    ret += x >> i;  /* i == 32 shifts by the full width of a 32-bit unsigned int: UB */
  return ret;
}
When the loop is unrolled, LLVM's InstructionSimplify sees the
out-of-range shift (x >> 32) and folds it to an undef value.
How can we make LLVM emit a warning to help developers correct the
error?
Or should we match GCC's behavior (e.g. x >> 32 returns 0)?
This would also save compiler engineers' effort: users complain that
it's a compiler bug because their code works with GCC or with an older
version of LLVM (where the loop was not unrolled). And it's really hard
to debug such UB in a large code base.
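For code that needs the "x >> 32 returns 0" result regardless of compiler or optimization level, the check can be written out in the source. The following is only a sketch: `shr_or_zero` and `foo_defined` are hypothetical names, not part of the original code or of any library.

```c
#include <limits.h>

/* Hypothetical helper: logical right shift that yields 0 when the shift
   count is >= the width of unsigned int, instead of being undefined. */
static unsigned int shr_or_zero(unsigned int x, unsigned int n) {
  const unsigned int width = sizeof(unsigned int) * CHAR_BIT;
  return n < width ? x >> n : 0u;
}

/* Same loop as foo() above, but every iteration has defined behavior. */
unsigned int foo_defined(unsigned int x) {
  unsigned int ret = 0;
  for (unsigned int i = 0; i <= 32; ++i)
    ret += shr_or_zero(x, i);
  return ret;
}
```

With this version, foo_defined(4) is 4 + 2 + 1 = 7 whether or not the loop is unrolled.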
Yes, I thought of UBSan. But in our case the target program runs on bare metal: it has a very tight code size restriction and no stderr.
Since LLVM has already caught the behavior during compilation, it should notify users about it.
First, without debug info all LLVM could realistically say is "there
might be undefined behaviour somewhere in this program". Even with
debug info, that location may or may not be accurate.
Second, there could be legitimate cases for an shl 32 to exist in a
program. As long as it's not actually executed it's fine. The usual
example is a template instantiation: they fairly often have generic
code that would be UB in some cases, but is guarded by checks to never
execute at runtime. There's no realistic way for LLVM to determine
this locally.
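The same pattern can show up in plain C once a guarded helper is inlined with a constant argument. A minimal sketch, where `low_bits_mask` is a hypothetical helper used only for illustration:

```c
#include <stdint.h>

/* Hypothetical helper: mask with the low n bits set, for n in [0, 32].
   The shift is only reached when n < 32, so it is always well defined. */
static inline uint32_t low_bits_mask(unsigned n) {
  if (n >= 32)
    return UINT32_MAX;           /* guard: never shift by the full width */
  return (UINT32_C(1) << n) - 1;
}

uint32_t all_bits(void) {
  /* After inlining with n == 32, the optimizer can briefly see a shift by
     32 in a branch that never executes. Warning on it would be a false
     positive. */
  return low_bits_mask(32);
}
```

Whether such a dead shift is visible to a given pass depends on pass ordering, which is part of why a warning from the optimizer would be unreliable.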
Third, we don't want the diagnostics Clang produces to depend on
optimization level.
If you want this kind of diagnostic at compile-time, it's probably a
job for the static analyzer. It doesn't currently catch this case
though, and I don't know enough about its inner workings to say how
feasible that would be.