We have a lot of test cases that have UB inadvertently. These tests cannot be verified with Alive2 because technically the compiler is allowed to produce whatever it wants. So we lose a lot of goodness. (as a reminder, Alive2 has found 100+ bugs just by running it over LLVM’s tests)
Moreover, in the future, unwanted optimizations that detect UB may kick in and we end up not testing what we want.
For example, we should avoid these in tests:
div v, undef / poison / 0
br undef / poison
load ptr undef / poison
store val, ptr undef / undef
memcpy ptr undef / poison
getelementptr undef/poison
unreachable
(any memory access operations with undef / poison pointers)
Please make sure new tests at least are free of UB. Thank you!
If you want to help, please remove UB from your favorite tests! We also need to patch llvm-reduce to avoid introducing so many undefs. Usually adding a new argument to the function is a better replacement for a function than undef (doesn’t trigger UB straight away anywhere).
I disagree. We still need tests that we handle and codegen these correctly. For codegen testcase reduction, replacement with undef is substantially more useful than adding arguments.
Codegen and IR tests are different things. I’m not advocating we should remove all instances of undef from the test suite (not until we get rid of undef in the IR).
Still, given that ‘br undef’ and ‘store ptr undef’ are UB, there’s no value in testing that they generate the “right” assembly, because there’s no such thing for UB. We need to make sure LLVM doesn’t crash, but that’s it.
Didn’t bugpoint (use to) replace branch conditions with undef as one of its strategies? That would explain the source of a lot of those. I know that we have a lot of br undef in AMDGPU tests, which always struck me as a bit iffy but I never found the “activation energy” to dig into it. Most of these tests should just be converted into branching on a function argument.
Yep, both bugpoint and llvm-reduce introduce a lot of undefs in branching and as pointers.
In the past the thinking was that ‘br undef’ was a non-deterministic jump to either branch, but we’ve settled on UB now.
I totally agree that we should (have the option to) not emit static UB into test cases created by llvm-reduce/bugpoint. This came up over the years multiple times. For me, it matters as the Attributor will exploit UB in the future better and thereby make tests useless.