should we do this time-consuming transform in InstCombine?

Hi,

There is an opportunity in instCombine for following instruction pattern:

%mul = mul nsw i32 %b, %a
%cmp = icmp sgt i32 %mul, -1
%sub = sub i32 0, %a
%mul2 = mul nsw i32 %sub, %b
%cond = select i1 %cmp, i32 %mul, i32 %mul2

Source code for above pattern:
return (ab) >=0 ? (ab) : -a*b;

Currently, llvm(-O3) can not recognize this as abs(a*b).

I initially think we could do this in instCombine phase in opt. Below is what I think:

%res = OP i32 %b, %a
%sub = sub i32 0, %b
%res2 = OP i32 %sub, %a

We could do the transform:
%res2 = OP i32 %sub, %a ==> %res2 = sub i32 0, %res

Then we can get the advantage:
1: if %res2 is the only user of %sub, %sub can be eliminated;
2: if %res2 is not the only user of %sub, we could change some heavy instruction like div to sub;
3: expose more abs opportunity for later pass.

But my concern is finding %res is a little compiling time-consuming.
At least we need MIN(user_count(%b), user_count(%a)) times to check if instruction with same opcode and same operands exists.

Could you guys give some comment? Is there any better idea for this transform?

Thanks.

BRS//
Chen Zheng
Power Compiler Backend Developer

Hi,

Hi.

There is an opportunity in instCombine for following instruction pattern:

%mul = mul nsw i32 %b, %a
%cmp = icmp sgt i32 %mul, -1
%sub = sub i32 0, %a
%mul2 = mul nsw i32 %sub, %b
%cond = select i1 %cmp, i32 %mul, i32 %mul2

Source code for above pattern:
return (a*b) >=0 ? (a*b) : -a*b;

Currently, llvm(-O3) can not recognize this as abs(a*b).

I initially think we could do this in instCombine phase in opt. Below is what I think:

%res = OP i32 %b, %a
%sub = sub i32 0, %b
%res2 = OP i32 %sub, %a

We could do the transform:
%res2 = OP i32 %sub, %a ==> %res2 = sub i32 0, %res

Then we can get the advantage:
1: if %res2 is the only user of %sub, %sub can be eliminated;
2: if %res2 is not the only user of %sub, we could change some heavy instruction like div to sub;
3: expose more abs opportunity for later pass.

But my concern is finding %res is a little compiling time-consuming.
At least we need MIN(user_count(%b), user_count(%a)) times to check if instruction with same opcode and same operands exists.

In instcombine, no user checking is performed/allowed.
This should match that *specific* pattern (other than verifying the
correct equal binop types), although i have not tested it:

  ICmpInst::Predicate Pred;
  Value *A, *B, *Mul, *Sub, *Mul2;
  if (match(&SI,
            m_Select(m_ICmp(Pred,
                            m_CombineAnd(m_BinOp(m_Value(A), m_Value(B)),
                                         m_Value(Mul)),
                            m_AllOnes()),
                     m_Deferred(Mul),
                     m_CombineAnd(
                         m_c_BinOp(m_CombineAnd(m_Sub(m_Zero(), m_Deferred(A)),
                                                m_Value(Sub)),
                                   m_Deferred(B)),
                         m_Value(Mul2)))) &&
      Pred == ICmpInst::Predicate::ICMP_SGT) {
  }

Could you guys give some comment? Is there any better idea for this transform?

Thanks.

BRS//
Chen Zheng
Power Compiler Backend Developer

Roman.

Hi Roman,

Thanks for your good idea. I think it can solve the abs issue very well. I can continue with my work now^-^.

But if it is not abs and there is no select,
%res = OP i32 %b, %a
%sub = sub i32 0, %b
%res2 = OP i32 %sub, %a

theoretically, we can still do the following transform for the above pattern:
%res2 = OP i32 %sub, %a ==> %res2 = sub i32 0, %res

Not sure whether we can do it in instCombine.

Thanks.

BRS//
Chen Zheng
Power Compiler Backend Developer

graycol.gif

Yes, it looks like we are missing an IR canonicalization for ‘mul’ like this:
https://rise4fun.com/Alive/ARs

%neg = sub i8 0, %x
%r = mul i8 %neg, %y
=>
%mul2 = mul i8 %x, %y
%r = sub i8 0, %mul2

What other opcodes do you see where this transform applies?

If adding that transform is enough to solve the original problem, then we don’t need to add the larger pattern match that includes the select?

graycol.gif

Hi Sanjay,

for the original abs issue, I tested following pattern:

return (a+b) >= 0 ? (a+b) : -(a+b); //works now, can be recognized as abs
return (a+b) >= 0 ? (a+b) : -a-b; //works now

return (a-b) >= 0 ? (a-b) : -(a-b); //works now
return (a-b) >= 0 ? (a-b) : -a+b; //works now

return (ab) >= 0 ? (ab) : -(ab); //works now
__return (a-b) >= 0 ? (a
b) : -a*b; //doesn’t work__

return (a/b) >= 0 ? (a/b) : -(a/b); //works now
return (a/b) >= 0 ? (a/b) : -a/b; //doesn’t work

return (a%b) >= 0 ? (a%b) : -a%b; //doesn’t work

I think I can start from MUL.

Thanks.

BRS//
Chen Zheng
Power Compiler Backend Developer

graycol.gifSanjay Patel —2018/12/19 01:16:24 AM—Yes, it looks like we are missing an IR canonicalization for ‘mul’ like this:

Be careful with sdiv/srem - the transform is not valid on the minval edge cases:
https://rise4fun.com/Alive/cWl

graycol.gif