instcombine can handle certain orders of "icmp"s that are "or"ed together:

x != 5 OR x > 10 OR x == 8 becomes..

x != 5 OR x == 8 becomes..

x != 5

However, a different ordering prevents the simplification:

x == 8 OR x > 10 OR x != 5 becomes..

%or.eq8.gt10 OR x != 5

and that can't be simplified because we now have an "or" OR "icmp".

What would I need to implement to restore the commutative property?

Perhaps a first stab would be to take (A|B)|C create two binaryOp A|C

and B|C and recursively call visitOr on each of them to see if they

simplify. Using the example above..

Before:

%or.eq8.gt10 = .. ; [uses=1]

%res = or %or.eq8.gt10, %ne5 ; original instruction

After:

%or.eq8.gt10 = .. ; [uses=0]

%or.eq8.ne5 = %ne5 ; instcombine recursively simplified this [uses=1

or 0 see next]

%res = or %or.eq8.ne5, %gt10 ; even better: %res = or %ne5, %gt10

The recursive call of A|C would also let C see further "or"s, so if A

is actually composed of (a1|a2), it could potentially try to simplify

a1|C and a2|C.

I'm not entirely sure right now, but potentially there could be issues

of creating too many additional "or" instructions. E.g., %or.eq.8.gt10

originally had more than 1 use.

Am I on the right track (or does LLVM already support this in another

optimization pass?)

Ed

Here's an initial stab, but I'm not too happy about the temporarily

adding new instructions then removing it because returning it will

have it added back in to replace other uses. I also added a couple

test cases pass with the new InstructionCombining changes (the old

code only passes one of the added tests).

Also, this change exposes some simplification for

test/Transforms/InstCombine/2006-05-06-Infloop.ll. The original code

truncates two i32 values to i8 and eventually ORs them together. The

patch has it optimized to OR the two values first then truncate a

single value.

%tmp264not = xor i8 %tmp264, -1 ; <i8> [#uses=1]

%tmp212 = trunc i32 %iftmp.36.0 to i8 ; <i8> [#uses=2]

%tmp265 = trunc i32 %tmp261 to i8 ; <i8> [#uses=1]

%tmp266 = or i8 %tmp212, %tmp264not ; <i8> [#uses=2]

%tmp268 = or i8 %tmp266, %tmp265 ; <i8> [#uses=1]

%tmp264not = xor i8 %tmp264, -1 ; <i8> [#uses=2]

%a.c45 = or i32 %iftmp.36.0, %tmp261 ; <i32> [#uses=1]

trunc i32 %a.c45 to i8 ; <i8>:0 [#uses=1]

%tmp268 = or i8 %0, %tmp264not ; <i8> [#uses=1]

Ed

or.commute.patch (3.02 KB)

Oh. Seems like you can get something similar by running -instcombine

twice but stick in a -reassociate in between the two.

Hmm.. I wonder how much faster/smaller would instcombine get without

any associative/commutative code compared to running it twice with

-reassociate.

Ed

Edward Lee wrote:

Here's an initial stab, but I'm not too happy about the temporarily

adding new instructions then removing it because returning it will

have it added back in to replace other uses. I also added a couple

test cases pass with the new InstructionCombining changes (the old

code only passes one of the added tests).

Hi Edward. I was wondering whether your optimization could be implemented using the AssociativeOpt framework already in InstCombine? It has a built-in mini-reassociate just for this occasion.

Nick