[RFC] Integer Intrinsics for abs, unsigned/signed min/max

Hello all.

This is a proposal to introduce 5 new integer intrinsics:
* absolute value
* signed min
* signed max
* unsigned min
* unsigned max
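
For readers who want the five operations spelled out, here is a minimal C++ sketch of their intended behaviour on 32-bit operands, written as the compare+select each one would stand for. The function names are illustrative and are not LLVM APIs. How abs treats the minimum signed value is an assumption here (it wraps); the authoritative wording is in the LangRef patch.

```cpp
#include <cassert>
#include <cstdint>

// Illustrative names only; these are not LLVM APIs.
constexpr int32_t smin32(int32_t a, int32_t b) { return a < b ? a : b; }
constexpr int32_t smax32(int32_t a, int32_t b) { return a > b ? a : b; }
constexpr uint32_t umin32(uint32_t a, uint32_t b) { return a < b ? a : b; }
constexpr uint32_t umax32(uint32_t a, uint32_t b) { return a > b ? a : b; }

// Assumption: abs(INT32_MIN) wraps back to INT32_MIN.
// The negation is done in unsigned arithmetic so the C++ has no signed overflow.
constexpr int32_t abs32(int32_t a) {
  return a < 0 ? static_cast<int32_t>(0u - static_cast<uint32_t>(a)) : a;
}

// The same bit pattern gives a different answer depending on signedness,
// which is why signed and unsigned min/max are separate operations.
inline void signedness_demo() {
  assert(smax32(-1, 1) == 1);
  assert(umax32(static_cast<uint32_t>(-1), 1u) == 0xFFFFFFFFu);
}
```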

This is motivated by the fact that we keep working around
the lack of these intrinsics, which constantly leads to
more workarounds and causes infinite combine loops.

Here's a (likely incomplete!) list of motivating bugs:

infinite loops:
46271 – InstCombine infinite loop / ⚙ D81698 [InstCombine] allow undef elements when comparing vector constants for min/max bailout
45539 – Instcombine enters an infinite loop when optimizing IR / rG01bcc3e93714
44835 – [InstCombine] Infinite loop in min/max load/store combine / ⚙ D74278 [InstCombine] Fix infinite loop in min/max load/store bitcast combine (PR44835)
⚙ D68408 [InstCombine] Negator - sink sinkable negations
⚙ D59378 [InstCombine] Prevent icmp transform that can cause inf loop if part of min/max
38915 – [InstCombine] Infinite loop by operating on same set of instructions in worklist / ⚙ D51964 [InstCombine] Fold (xor (min/max X, Y), -1) -> (max/min ~X, ~Y) when X and Y are freely invertible.
37526 – [InstCombine] MinMax patterns produce an infinite loop within InstCombine. / rG7c9ad0db3dc8

misc:
https://bugs.llvm.org/show_bug.cgi?id=44025
43310 – Failed to combine -smax(-x,-y) to smin(x,y) / rGeb8d39e11315
35607 – Missed optimization in math expression: max(min(a,b),max(a,b)) == max(a,b)
35642 – recognize min/max patterns as commutative / https://reviews.llvm.org/D41136
41083 – Assertion in ScopedHashTable during EarlyCSE / ⚙ D74285 [EarlyCSE] avoid crashing when detecting min/max/abs patterns (PR41083)
⚙ D70148 [SLP] fix miscompile on min/max reductions with extra uses (PR43948)
31751 – Compilation not completing for gnomon_engine.cpp of BlastC++ / ⚙ D26096 [ValueTracking] recognize obfuscated variants of umin/umax / rGfebcb9ce54e1
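
Two of the folds named in the lists above can be checked exhaustively on 8-bit values. The sketch below is only a sanity check of the arithmetic, not anything from the patches themselves. It assumes signed min/max (the bug titles do not say which), and the helper names are made up for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Illustrative helpers; not LLVM APIs. Signedness is an assumption.
constexpr int8_t smin8(int8_t a, int8_t b) { return a < b ? a : b; }
constexpr int8_t smax8(int8_t a, int8_t b) { return a > b ? a : b; }

// Checks, for every pair of i8 values:
//   ~smin(x, y) == smax(~x, ~y)                  (the D51964 fold)
//   smax(smin(a, b), smax(a, b)) == smax(a, b)   (the bug 35607 simplification)
inline bool folds_hold_for_all_i8() {
  for (int x = -128; x <= 127; ++x) {
    for (int y = -128; y <= 127; ++y) {
      const int8_t a = static_cast<int8_t>(x);
      const int8_t b = static_cast<int8_t>(y);
      const int8_t not_a = static_cast<int8_t>(~x);
      const int8_t not_b = static_cast<int8_t>(~y);
      if (static_cast<int8_t>(~smin8(a, b)) != smax8(not_a, not_b)) return false;
      if (smax8(smin8(a, b), smax8(a, b)) != smax8(a, b)) return false;
    }
  }
  return true;
}

// Usage: assert(folds_hold_for_all_i8());
```

With a dedicated operation, each of these is a one-step rewrite on a single instruction rather than a match over an icmp+select pair that other combines may have already reshaped.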

I believe we can do better than that if we stop just treating some IR patterns
as being canonical and desperately trying not to break/lose track of them,
but instead do the sensible thing and actually make them first-class citizens
by introducing intrinsics and using them throughout.

This has been previously discussed in:
https://lists.llvm.org/pipermail/llvm-dev/2016-November/106868.html

Proposed LangRef semantics: https://reviews.llvm.org/D81829
Proposed alive2 implementation: AliveToolkit/alive2 Pull Request #353 ({u,s}{min,max}, abs intrinsics modelling)

Roman.

Thanks for putting this together! I am strongly in favor of this proposal.

My informal proposal in the llvm-dev link from 2016 was at least the 2nd time this has come up. On each of the previous attempts, we decided that the cost of analysis for what is usually a 2-instruction icmp+select sequence was low enough that min/max intrinsics were not worth their weight.

But that has proven wrong over time - the corner-cases change with each fix, so we hit a new min/max infinite loop or missed optimization seemingly every month or so. As noted, the list of problems shown here is only a small sampling of the total.

We’ve shown that we can adapt IR analysis to use intrinsics for things like overflowing/saturating math, and min/max/abs should be about the same level of work. SelectionDAG already has equivalent nodes for these ops, so connecting those with IR intrinsics is trivial.

I’m also generally in favor of adding these as target independent intrinsics. One question though - how are these “reductions”? Why not use llvm.umax… instead of llvm.reduce.umax?

-Chris

I'll note that I was one of the folks previously skeptical of this idea. I've been following activity on this in the meantime, and while I'm not 100% convinced this is the right direction, I'm also nowhere near as sure as I was that it isn't. :)

So, not quite a +1 from me, but not a -1 either.

I think it's completely reasonable to try this approach. At worst, we decide it doesn't work either and simply canonicalize the new intrinsics to the existing IR patterns. :)

Philip

I’m also strongly in favor of this proposal.

Next to the issues already mentioned, this also fixes issues related to undef handling. For example, umax(%x, C) is not actually guaranteed to be >= C. That’s because the current umax representation has two uses of %x, which may take on independent values if %x is undef. This makes a number of “common sense” folds invalid. Having dedicated min/max intrinsics avoids that problem.
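
To make the undef point concrete, here is a small C++ model. It is a sketch under stated assumptions: undef is modelled as "each use may observe a different value", the select form is assumed to be `icmp ugt %x, C` followed by `select %c, %x, C`, and all function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Models the icmp+select form of umax(%x, C).
// The two parameters stand for the two uses of %x; if %x is undef,
// each use may observe a different value.
constexpr uint8_t umax_as_select(uint8_t x_at_icmp, uint8_t x_at_select, uint8_t c) {
  return x_at_icmp > c ? x_at_select : c;
}

// Models a dedicated intrinsic: a single use of %x, hence a single value.
constexpr uint8_t umax_as_intrinsic(uint8_t x, uint8_t c) {
  return x > c ? x : c;
}

// True if some choice of per-use values makes the select form return less than c.
inline bool select_form_can_go_below(uint8_t c) {
  for (int i = 0; i < 256; ++i)
    for (int j = 0; j < 256; ++j)
      if (umax_as_select(static_cast<uint8_t>(i), static_cast<uint8_t>(j), c) < c)
        return true;
  return false;
}

// True if some single value makes the intrinsic form return less than c.
inline bool intrinsic_form_can_go_below(uint8_t c) {
  for (int i = 0; i < 256; ++i)
    if (umax_as_intrinsic(static_cast<uint8_t>(i), c) < c) return true;
  return false;
}

// Usage: assert(select_form_can_go_below(10) && !intrinsic_form_can_go_below(10));
```

For instance, with C = 10 the compare may see 20 while the select sees 0, so the select form yields 0. The single-use form has no such pair to pick, so its result never drops below C.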

Regards,

Nikita

By popular demand, I've dropped the misleading "reduction"
wording/naming from them and updated https://reviews.llvm.org/D81829

So far all the responses are favorable to this proposal.

Roman.

Cool, thanks for driving this, Roman. I’d recommend splitting up the LangRef patch and landing each intrinsic along with its implementation. We’ll need verifier support, ISel legalization support (for targets that don’t implement it), etc. Adoption by targets doesn’t seem like a requirement of the first patch.

-Chris