We have a lot of test cases that have UB inadvertently. These tests cannot be verified with Alive2 because technically the compiler is allowed to produce whatever it wants. So we lose a lot of goodness. (as a reminder, Alive2 has found 100+ bugs just by running it over LLVM’s tests)
Moreover, in the future, unwanted optimizations that detect UB may kick in and we end up not testing what we want.
For example, we should avoid these in tests:
- div v, undef / poison / 0
- br undef / poison
- load ptr undef / poison
- store val, ptr undef / undef
- memcpy ptr undef / poison
- getelementptr undef/poison
- (any memory access operations with undef / poison pointers)
For example, right now we have 4970 occurrences of “br undef” in the tests directory. As an example, I’ve removed “br undef” from InstCombine tests today ([NFC] remove 'br undef' from InstCombine test cases · llvm/llvm-project@952e069 · GitHub).
Please make sure new tests at least are free of UB. Thank you!
If you want to help, please remove UB from your favorite tests! We also need to patch llvm-reduce to avoid introducing so many undefs. Usually adding a new argument to the function is a better replacement for a function than undef (doesn’t trigger UB straight away anywhere).
I think we may be able to use Alive2 to catch at least some of those instances (and other UB that may be a bit more difficult to grep for).
I put together a simple pass that replaces all functions with
unreachable and a script that checks if the result still verifies, in a way that can be used as drop-in replacement for
opt when running llvm-lit. The code is here: ⚙ D127607 Add pass to turn function into Unreachable and script to find UB. (WIP)
I went ahead and fixed a few cases that seemed straight-forward
I disagree. We still need tests that we handle and codegen these correctly. For codegen testcase reduction, replacement with undef is substantially more useful than adding arguments.
Codegen and IR tests are different things. I’m not advocating we should remove all instances of undef from the test suite (not until we get rid of undef in the IR).
Still, given that ‘br undef’ and ‘store ptr undef’ are UB, there’s no value in testing that they generate the “right” assembly, because there’s no such thing for UB. We need to make sure LLVM doesn’t crash, but that’s it.
Didn’t bugpoint (use to) replace branch conditions with undef as one of its strategies? That would explain the source of a lot of those. I know that we have a lot of
br undef in AMDGPU tests, which always struck me as a bit iffy but I never found the “activation energy” to dig into it. Most of these tests should just be converted into branching on a function argument.
Yep, both bugpoint and llvm-reduce introduce a lot of undefs in branching and as pointers.
In the past the thinking was that ‘br undef’ was a non-deterministic jump to either branch, but we’ve settled on UB now.
I totally agree that we should (have the option to) not emit static UB into test cases created by llvm-reduce/bugpoint. This came up over the years multiple times. For me, it matters as the Attributor will exploit UB in the future better and thereby make tests useless.