When working on a couple of recent changes in SCEV, I stumbled on a couple of gaps around how we optimize the @llvm.umin intrinsics. If I remember correctly, we added these because they needed different poison propagation semantics than select, and that memory triggered a thought.
The min/max intrinsics have the same poison semantics as their select forms. It’s the undef semantics that differ.
Can you add wording about poison to the LangRef for those? At the moment, it’s largely unspecified.
However, I believe that the primary motivation for having the intrinsics is that maintaining canonical SPF form is hard. Especially many InstCombine optimizations need to be very careful about not breaking canonical SPF form, because doing so will likely result in an infinite combine loop.
I’ve disagreed with this line of argument from the beginning, but fine. I’m not going to bother getting back into that.
I want to raise the question of whether we should support both possible poison propagation semantics for the select instruction. As a brief review, the two options are:
Poison propagates only through the selected operand. e.g. “select i1 true, i32 0, i32 poison” is not poison.
Poison propagates if any operand is poison. e.g. “select i1 true, i32 0, i32 poison” is poison.
Each of the two options has advantages. This was discussed in depth a while ago, and we’d picked (1). I don’t want to reopen that discussion; I want to raise the question of whether we should pick “both”.
If we were to add a flag “npo” (for “no poison operand”, bikeshedding welcome!) to the select instruction, we could represent both options. This has a couple of advantages:
- We can lower all of the umin/etc… intrinsics to selects and avoid having to duplicate folds. This would reduce the combinatoric space for pattern matching optimizations. This would probably help optimization results in practice.
min/max intrinsics currently don’t have full support in InstCombine yet, because we haven’t ported all transforms from SPF min/max. Once that is done and we canonicalize to the intrinsic form, we can drop the duplicate SPF folds.
And how long has that work been in flight? Also, JFYI, the transform I hit was not a SPF specific fold. It was a generic fold on selects which had to be manually duplicated for the new umin form.
From an optimization quality perspective, we’re not ready to canonicalize to intrinsics yet. Unfortunately, SCEVExpander has already been switched to use intrinsics before this work completed, which is presumably how you got here.
It is. Which means your preceding argument doesn’t really apply unless you want to revert the SCEVExpander change.
I do want to avoid tying the idea and this discussion too closely the umin intrinsic handling. It’s what sparked the thought, but it might be worth doing even if we leave umin intrinsic as the canonical form form for that operation.
- We can restore many of the removed select->arithmetic folds (only for the npo selects).
Which transforms do you have in mind here? The select → and/or transform is the only one I’m aware of, but in that case we’re using select specifically for it’s current poison semantics, to prevent that fold from happening where it is not legal.
“Not legal” given the semantics we chose (e.g. 1 above). That’s a key point.
I’d have to ask Juneyoung, but I’d had the impression he had to disable a couple others as well.
Yeah, as far as I remember the only transformation that had to be fixed was transforming ‘select i1 %a, i1 %b, i1 false’ into ‘and i1 %a, %b’ and ‘select i1 %a, i1 true, i1 %b’ into ‘or i1 %a, %b’.
The transformation is finally removed, with updates in different optimizations to recognize the select form of and/or as well.
They are updated to use m_LogicalAnd/m_LogicalOr matchers instead.
Also, even if not, “just” the and/or transform seems useful enough to possibly justify the IR extension if we could infer npo frequently. (Which, admittedly, is an unknown.)
- We can infer npo flag on a select in many cases. (e.g. “select i1 %c, i32 0, i32 57” can be trivially converted to “select npo i1 %c, i32 0, i32 57”) This also allows us to factor logic into testable pieces whereas current transforms which are npo dependent must include the no-poison inference.
The major downside is that transformation code would have to be careful to only apply transforms dependent on “no poison operand” if the flag is set.
Anyways, I don’t have time to actually work on this, so I’m mostly throwing out the idea in case anyone with time and motivation finds it interesting to pursue.
With the above notes in mind, I’m not sure where having the flag would really be advantageous.
That’s why we debate these things.
Limited to umin/umax, I think the benefit is not obvious; umin is (in the absence of undef) equivalent to ‘select (icmp ult %x, %y), %x, %y’, so the poison is propagated in an equivalent manner.
That being said, going one step further and marking that a value is neither undef nor poison would be of great help for everything.
If some value is never undef & poison, all complexities regarding the correctness of optimizations disappear.
Clang already has enable-noundef-analysis flag, which attaches noundef in all valid cases in C/C++.
D82317 is a patch that activates it by default, and I promised to take it & passed it to a GSoC student; it is being rebased and the updated version will be at D105169.