instcombine can handle certain orders of "icmp"s that are "or"ed together:
x != 5 OR x > 10 OR x == 8 becomes..
x != 5 OR x == 8 becomes..
x != 5
However, a different ordering prevents the simplification:
x == 8 OR x > 10 OR x != 5 becomes..
%or.eq8.gt10 OR x != 5
and that can't be simplified because we now have an "or" OR "icmp".
What would I need to implement to restore the commutative property?
Perhaps a first stab would be to take (A|B)|C create two binaryOp A|C
and B|C and recursively call visitOr on each of them to see if they
simplify. Using the example above..
Before:
%or.eq8.gt10 = .. ; [uses=1]
%res = or %or.eq8.gt10, %ne5 ; original instruction
After:
%or.eq8.gt10 = .. ; [uses=0]
%or.eq8.ne5 = %ne5 ; instcombine recursively simplified this [uses=1
or 0 see next]
%res = or %or.eq8.ne5, %gt10 ; even better: %res = or %ne5, %gt10
The recursive call of A|C would also let C see further "or"s, so if A
is actually composed of (a1|a2), it could potentially try to simplify
a1|C and a2|C.
I'm not entirely sure right now, but potentially there could be issues
of creating too many additional "or" instructions. E.g., %or.eq.8.gt10
originally had more than 1 use.
Am I on the right track (or does LLVM already support this in another
optimization pass?)
Ed
Here's an initial stab, but I'm not too happy about the temporarily
adding new instructions then removing it because returning it will
have it added back in to replace other uses. I also added a couple
test cases pass with the new InstructionCombining changes (the old
code only passes one of the added tests).
Also, this change exposes some simplification for
test/Transforms/InstCombine/2006-05-06-Infloop.ll. The original code
truncates two i32 values to i8 and eventually ORs them together. The
patch has it optimized to OR the two values first then truncate a
single value.
%tmp264not = xor i8 %tmp264, -1 ; <i8> [#uses=1]
%tmp212 = trunc i32 %iftmp.36.0 to i8 ; <i8> [#uses=2]
%tmp265 = trunc i32 %tmp261 to i8 ; <i8> [#uses=1]
%tmp266 = or i8 %tmp212, %tmp264not ; <i8> [#uses=2]
%tmp268 = or i8 %tmp266, %tmp265 ; <i8> [#uses=1]
%tmp264not = xor i8 %tmp264, -1 ; <i8> [#uses=2]
%a.c45 = or i32 %iftmp.36.0, %tmp261 ; <i32> [#uses=1]
trunc i32 %a.c45 to i8 ; <i8>:0 [#uses=1]
%tmp268 = or i8 %0, %tmp264not ; <i8> [#uses=1]
Ed
or.commute.patch (3.02 KB)
Oh. Seems like you can get something similar by running -instcombine
twice but stick in a -reassociate in between the two. 
Hmm.. I wonder how much faster/smaller would instcombine get without
any associative/commutative code compared to running it twice with
-reassociate.
Ed
Edward Lee wrote:
Here's an initial stab, but I'm not too happy about the temporarily
adding new instructions then removing it because returning it will
have it added back in to replace other uses. I also added a couple
test cases pass with the new InstructionCombining changes (the old
code only passes one of the added tests).
Hi Edward. I was wondering whether your optimization could be implemented using the AssociativeOpt framework already in InstCombine? It has a built-in mini-reassociate just for this occasion.
Nick