[llvm-reduce] Reduction to undef/poison/null?

Currently in llvm-reduce, when we try to remove instructions we RAUW them with undef. But it seems nicer to have the final IR use null values (e.g. 0, zeroinitializer, etc.) rather than undef. Should we attempt to use null values? Or, if not, is there a reason to prefer undef over poison?

Nicer because it’s less likely to introduce new UB? Or some other reason?

I /guess/ undef was chosen because it's more vague and shows, to some degree, that the value doesn't matter?

(might be worth CC’ing some of the other folks who’ve contributed patches/implemented llvm-reduce - not everyone reads llvm-dev or does so frequently enough to see relevant threads in a timely manner)

Using undef/poison is problematic, because there are multiple ways this could cause new UB (e.g. branch on undef, passing poison/undef to a function with a noundef argument).
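
For illustration, a minimal sketch (the function and values are made up, not from this thread) of how RAUW'ing a removed instruction with undef can create UB that wasn't there before:

```llvm
define i32 @example(i32 %x) {
  ; Suppose the reducer deletes %cond and RAUWs its uses with undef:
  %cond = icmp eq i32 %x, 0
  ; the branch then becomes `br i1 undef, ...`, a branch on undef, i.e. new UB
  br i1 %cond, label %a, label %b
a:
  ret i32 1
b:
  ret i32 2
}
```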

I’m not sure using zero will work well in all cases either, because it can introduce UB as well (e.g. a load from null, or passing null as a nonnull argument).
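
Again as a rough sketch (made-up names, current opaque-pointer syntax), zero/null can trip over other UB rules:

```llvm
declare void @use(ptr nonnull)

define i8 @widget(ptr %base) {
  ; If the reducer deletes %p and substitutes null for its uses:
  %p = getelementptr i8, ptr %base, i64 4
  %v = load i8, ptr %p              ; this becomes a load from null
  call void @use(ptr nonnull %p)    ; and this passes null as a nonnull argument
  ret i8 %v
}
```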

I think ideally we would have a way to materialise values we know nothing about but that are not undef. Perhaps we could add some kind of oracle function, but that would come with its own drawbacks.

Cheers,
Florian

I've been thinking we should be using `freeze poison`, but I don't think this question matters for the patch at hand; it should just stick to the current practice of using undef.

Roman.

What I usually end up doing when reducing by hand is to introduce a new function parameter to replace a deleted instruction, e.g.:

```
define @f(i32 %x) {
  %y = add i32 %x, …
}
```

=>

```
define @f(i32 %x, i32 %y) {
  <remove instr %y>
}
```

Though not all users will want this sort of reduction…

I agree with what others said about undef/poison: avoid them at all costs when reducing. It will just expose some other bug regarding the handling of undef in the optimizers instead of the bug you wanted to track down :)

Nuno

> I've been thinking we should be using `freeze poison`, but I don't think this question matters for the patch at hand; it should just stick to the current practice of using undef.

I like freeze poison. It conveys the idea w/o making things UB all the time.
It basically is an oracle w/o the side effects.
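
To make the idea concrete, a minimal sketch (made-up function) of what a `freeze poison` replacement could look like in the reduced IR:

```llvm
define i32 @widget(i32 %x) {
  ; %y = add i32 %x, 42             ; instruction removed by the reducer
  %y = freeze i32 poison            ; arbitrary-but-fixed stand-in; using it is not UB
  %r = mul i32 %y, 2
  ret i32 %r
}
```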

FWIW, when I ported tests to the Attributor, e.g., from ArgumentPromotion or IPSCCP,
I had to manually remove all the UB that made the test meaningless first. In general,
tests that contain statically provable UB are less likely to be meaningful over time
and/or be reusable.

~ Johannes

I frequently use llvm-reduce just to minimize a crash caused by some change and present that to the author of the change to look at. I don't think that having tons of `freeze poison`s in a repro file is nice to work with. If a crash repros with a `0` as opposed to a `freeze poison`, the `0` seems much more appealing to present.

We could add a flag to llvm-reduce to choose between the various options here if people have different needs.

Just want to add that my fork of llvm-reduce automates this trick and has a few other nice improvements:

https://github.com/regehr/llvm-project/tree/new-llvm-reduce-pass

I haven't yet done the "avoid undef/poison at all costs" thing, but will get around to it sometime hopefully.

John

> I frequently use llvm-reduce just to minimize a crash caused by some change and present that to the author of the change to look at. I don't think that having tons of `freeze poison`s in a repro file is nice to work with. If a crash repros with a `0` as opposed to a `freeze poison`, the `0` seems much more appealing to present.
>
> We could add a flag to llvm-reduce to choose between the various options here if people have different needs.

Yeah, maybe. I think it's clear undef is often not the best choice.

I agree with the meta point here: use of undef will make the behavior far less predictable, and therefore make reduction more annoying. I think it makes a lot of sense to use a deterministic zero value instead of undef.

Is there any _disadvantage_ to doing that?

-Chris