Please don't use "br undef" in tests (aka please avoid test cases with UB)

nlopes · June 10, 2022, 9:51pm

We have a lot of test cases that have UB inadvertently. These tests cannot be verified with Alive2 because technically the compiler is allowed to produce whatever it wants. So we lose a lot of goodness. (as a reminder, Alive2 has found 100+ bugs just by running it over LLVM’s tests)
Moreover, in the future, unwanted optimizations that detect UB may kick in and we end up not testing what we want.

For example, we should avoid these in tests:

div v, undef / poison / 0
br undef / poison
load ptr undef / poison
store val, ptr undef / undef
memcpy ptr undef / poison
getelementptr undef/poison
unreachable
(any memory access operations with undef / poison pointers)

For example, right now we have 4970 occurrences of “br undef” in the tests directory. As an example, I’ve removed “br undef” from InstCombine tests today ([NFC] remove 'br undef' from InstCombine test cases · llvm/llvm-project@952e069 · GitHub).

Please make sure new tests at least are free of UB. Thank you!

If you want to help, please remove UB from your favorite tests! We also need to patch llvm-reduce to avoid introducing so many undefs. Usually adding a new argument to the function is a better replacement for a function than undef (doesn’t trigger UB straight away anywhere).

fhahn · June 12, 2022, 9:35pm

+1

I think we may be able to use Alive2 to catch at least some of those instances (and other UB that may be a bit more difficult to grep for).

I put together a simple pass that replaces all functions with unreachable and a script that checks if the result still verifies, in a way that can be used as drop-in replacement for opt when running llvm-lit. The code is here: ⚙ D127607 Add pass to turn function into Unreachable and script to find UB. (WIP)

I went ahead and fixed a few cases that seemed straight-forward

https://reviews.llvm.org/rGb8d728a098b10f0f9afdc3fe8641b4678f57f1e6
https://reviews.llvm.org/rG684a82fbc5434c9673c396b53e09baf902259a54

arsenm · June 12, 2022, 10:51pm

I disagree. We still need tests that we handle and codegen these correctly. For codegen testcase reduction, replacement with undef is substantially more useful than adding arguments.

nlopes · June 13, 2022, 7:48am

Codegen and IR tests are different things. I’m not advocating we should remove all instances of undef from the test suite (not until we get rid of undef in the IR).
Still, given that ‘br undef’ and ‘store ptr undef’ are UB, there’s no value in testing that they generate the “right” assembly, because there’s no such thing for UB. We need to make sure LLVM doesn’t crash, but that’s it.

nhaehnle · June 13, 2022, 9:23am

Didn’t bugpoint (use to) replace branch conditions with undef as one of its strategies? That would explain the source of a lot of those. I know that we have a lot of br undef in AMDGPU tests, which always struck me as a bit iffy but I never found the “activation energy” to dig into it. Most of these tests should just be converted into branching on a function argument.

nlopes · June 13, 2022, 9:44am

Yep, both bugpoint and llvm-reduce introduce a lot of undefs in branching and as pointers.
In the past the thinking was that ‘br undef’ was a non-deterministic jump to either branch, but we’ve settled on UB now.

jdoerfert · June 13, 2022, 2:36pm

I totally agree that we should (have the option to) not emit static UB into test cases created by llvm-reduce/bugpoint. This came up over the years multiple times. For me, it matters as the Attributor will exploit UB in the future better and thereby make tests useless.

Topic		Replies	Views
Please don't use undef in tests (part 2) LLVM Project clang , llvm	22	1094	March 18, 2025
[GSoC 2024] Remove undefined behavior from tests GSoC gsoc2024	27	1311	September 26, 2024
beneficial optimization of undef examples needed LLVM Dev List Archives	7	87	June 19, 2017
The undef story LLVM Dev List Archives	61	204	July 6, 2017
[llvm-reduce] Reduction to undef/poison/null? LLVM Dev List Archives	9	206	September 3, 2021

Please don't use "br undef" in tests (aka please avoid test cases with UB)

Related topics