# IR canonicalization: select or bool math?

I have another round of questions about IR select canonicalizations. For the purity of this quiz, please disregard prior knowledge of how this is handled by instcombine or how this is lowered by your favorite target…of course we’ll fix it. Some answers in the links below if you do want to know.

Which, if any, of these is canonical?

1. Is a zext simpler than a select?

a. define i32 @sel_1_or_0(i1 %a) {
%b = select i1 %a, i32 1, i32 0
ret i32 %b
}

b. define i32 @sel_1_or_0(i1 %a) {

%b = zext i1 %a to i32
ret i32 %b
}

1. What if we have to ‘not’ the bool?

a. define i32 @sel_0_or_1(i1 %a) {
%b = select i1 %a, i32 0, i32 1
ret i32 %b
}

b. define i32 @sel_0_or_1(i1 %a) {
%not.a = xor i1 %a, true
%b = zext i1 %not.a to i32
ret i32 %b
}

1. Is sext handled differently?

a. define i32 @sel_-1_or_0(i1 %a) {
%b = select i1 %a, i32 -1, i32 0
ret i32 %b
}

b. define i32 @sel_-1_or_0(i1 %a) {
%b = sext i1 %a to i32
ret i32 %b
}

1. What if the sext needs a ‘not’?
a. define i32 @sel_0_or_-1(i1 %a) {
%b = select i1 %a, i32 0, i32 -1
ret i32 %b
}

b. define i32 @sel_0_or_-1(i1 %a) {
%not.a = xor i1 %a, true
%b = sext i1 %not.a to i32
ret i32 %b
}

1. What if both constants are non-zero? Ie, the implicit add/sub of the earlier cases can’t be eliminated.

a. define i32 @sel_2_or_1(i1 %a) {
%b = select i1 %a, i32 2, i32 1
ret i32 %b
}

b. define i32 @sel_2_or_1(i1 %a) {
%b = zext i1 to i32 %a

%c = add i32 %b, 1

ret i32 %b
}

1. Does ‘sub’ make a difference?

a. define i32 @sel_1_or_2(i1 %a) {
%b = select i1 %a, i32 1, i32 2
ret i32 %b
}

b. define i32 @sel_1_or_2(i1 %a) {
%b = zext i1 %a to i32

%c = sub i32 2, %b

ret i32 %c

}

1. Choose between integers that are not consecutive?

a. define i32 @sel_0_or_2(i1 %a) {

%sel = select i1 %a, i32 2, i32 0

ret i32 %sel2
}

b. define i32 @sel_0_or_2(i1 %a) {

%zexta = zext i1 %a to i32

%add = add i32 %zexta, %zexta

ret i32 %add
}

1. Choose {0,1,2} based on 2 bools?

a. define i32 @sel_sel(i1 %a, i1 %b) {

%zexta = zext i1 %a to i32

%sel1 = select i1 %a, i32 2, i32 1

%sel2 = select i1 %b, i32 %sel1, %zexta

ret i32 %sel2
}

b. define i32 @sel_sel(i1 %a, i1 %b) {

%zexta = zext i1 %a to i32

%zextb = zext i1 %b to i32

%add = add i32 %zexta, %zextb

ret i32 %add
}

``````
``````

From: "Sanjay Patel" <spatel@rotateright.com>
To: "llvm-dev" <llvm-dev@lists.llvm.org>
Cc: "Eli Friedman" <efriedma@codeaurora.org>, "David Majnemer"
<david.majnemer@gmail.com>, "Michael Kuperstein"
<mkuper@google.com>, "Philip Reames" <listmail@philipreames.com>,
"Hal Finkel" <hfinkel@anl.gov>
Sent: Wednesday, September 28, 2016 5:39:38 PM
Subject: IR canonicalization: select or bool math?

I have another round of questions about IR select canonicalizations.
For the purity of this quiz, please disregard prior knowledge of how
this is handled by instcombine or how this is lowered by your
favorite target...of course we'll fix it. Some answers in the
links below if you do want to know.

Which, if any, of these is canonical?

I think that we should prefer the single-IR-instruction forms as canonical. zext/sext, when it is all that is needed, and the select otherwise. I suspect that will be better for analysis and follow-on optimizations, at least most of the time. For lowering, OTOH, I'd prefer the forms with zext/sext combined with xor/add/sub.

-Hal

Thanks for sharing this! Quite interesting questions indeed

I regret that we don’t have any documentation (on the same level as LangRef) that would record and describe the canonicalization of the IR and the motivation for each case.

Right. That’s the general (abstract, and probably not uniformly followed) goal: canonicalize towards the conceptually most strength reduced operation. zext is simpler than select, so it should be more canonical when it applies.

-Chris

My gut tells me that Hal is right, and we should prefer zexts as long
as the select boils down to one instruction, but let me go against my
intuition and try to list two reasons why we should prefer selects:

* Folding operations into selects: it is trivial to transform
f(select X, Const0, Const1) to select X, f(Const0), f(Const1),
while doing that can be difficult for zexts.

define i32 @sel_1_or_0(i1 %a) {
%b = select i1 %a, i32 1, i32 0
%k = add i32 %b, 50
ret i32 %k
}

==>

define i32 @sel_1_or_0(i1 %a) {
%b = select i1 %a, i32 51, i32 50
ret i32 %b
}

but

define i32 @sel_1_or_0(i1 %a) {
%b = zext i1 %a to i32
%k = add i32 %b, 50
ret i32 %k
}

is not as natural to transform.

* zexts (resp. sexts) lose information when combined with nuw
(resp. nsw) operations.

That is, if we inline

define i32 @sel_1_or_0(i1 %a) {
%b = zext i1 %a to i32
ret i32 %b
}

into f and get

define i32 @f(i32 %x, i32 %y) {
%x.zext = ...
%y.zext = ...
;; There are some existing use of %x.zext and %y.zext
%a = add nuw i1 %x, %y
%b = zext i1 %a to i32
ret i32 %b
}

then we'll get (following your instructions, I've not verified that
this actually happens )

define i32 @f(i1 %x, i1 %y) {
%x.zext = ...
%y.zext = ...
%a = add nuw i32 %x.zext, %y.zext
ret i32 %a
}

but we've now lost information -- we no longer know that %a is 0 or
1 -- it can also be 2. This would not happen if we always
canonicalized zext i1 C to iN to select C, iN 1, iN 0

-- Sanjoy

IMO, canonicalization should aim at making it easy for the rest of the compiler to reason about the code. While select may not appear the simplest in terms of code complexity, it can lead to a simpler code when combined with other operations (as per Sanjoy's example), while at the same time it can always become a zext later on (if it's not combined with anything).

-Krzysztof

I hadn’t thought about the no-wrap consequences. That does seem like a good argument to always favor select since propagating nsw/nuw correctly is a weak point as I understand it. It would also make the instcombine behavior more consistent despite the fact that, as Chris points out, zext/sext are acknowledged as simpler ops.

Sadly (for me anyway since I want to clean this up!) this means, as Hal noted: in general we want to canonicalize toward select in IR, and canonicalize toward bool math in the DAG. I don’t know of any target that does better lowering the select variants. In some cases today, we even create control flow where there was none in the original code when we transform bool math into select in IR. Prejudiced by that knowledge, I’ve been enhancing IR transforms that effectively assume that 2 cheap IR ops (ext+add/sub) are more canonical than 1 select.

So a proposal to move this forward:

1. Add DAG combines (possibly with TLI hooks) to turn selects into bool math. This will be a translation of existing IR select instcombines + new transforms. We can’t knowingly create IR patterns that the backend is not prepped to handle.

2. Remove existing IR instcombines that remove selects in favor of bool math.

3. Add IR instcombines to transform bool math into selects.

Here are the quiz answers based on current trunk - if you did actually try to compile the code, sorry about all of the typos that made that impossible! :

#1 - 4 are canonicalized to bool math. Ie, we don’t care about the relative cost of sext vs. zext or if the transform needs an extra instruction (the ‘not’) for these cases.

#5 is canonicalized to select.

#6 is not canonicalized either way; 5/6 were introduced together as you might expect, but 6 was backed out due to lousy codegen!

#7 is not canonicalized either way; too rare to care about?

#8 is not canonicalized either way; this one is fun because it exists in the instcombine source code itself - and we do a lousy job with it.

Hi Sanjay,

Hi Hal,

Question about:

… and canonicalize toward bool math in the DAG. I don’t know of any target that does better lowering the select variants.

What about recognition of idioms that may be considered to be naturally composed of ‘select’ operations such as: min/max/saturate/clamp. Will this change impact the target’s ability (or make it more difficult) to recognize these idioms?

Thanks, Zvi

I don’t think that the patterns that I’m currently looking at will overlap with min/max and friends.

So far we’re proposing to increase selects in instcombine, and we go out of our way to avoid breaking icmp+select pairs (which seems questionable…):

// Test if the ICmpInst instruction is used exclusively by a select as
// part of a minimum or maximum operation. If so, refrain from doing
// any other folding. This helps out other analyses which understand
// non-obfuscated minimum and maximum idioms, such as ScalarEvolution
// and CodeGen.

Once you get to the DAG, there’s more support for min/max at least because we have various generic and target-specific nodes for those. But again, if we pessimize some codegen with a new fold, it should be considered a bug.

Note that select lowering in the DAG doesn’t work ideally right now, so I think we need to fix that before producing more selects in IR:

https://llvm.org/bugs/show_bug.cgi?id=28925