Question about canonicalizing cmp+select

Hi, Sanjay/all,

I noticed in rL331486 that some compare-select optimizations are disabled in favor of providing canonicalized cmp+select to the backend.

I am currently working on a private backend target, and the target has a small code size limit. With this change, some of the apps went over the codesize limit. As an example,

C code:
b = (a > -1) ? 4 : 5;

ll code:
Before rL331486:

%0 = lshr i16 %a.0.a.0., 15
%1 = or i16 %0, 4

After rL331486:

%cmp = icmp sgt i16 %a.0.a.0., -1
%cond = select i1 %cmp, i16 4, i16 5

With the various encoding restrictions of my particular target, the cmp/select generated slightly larger code size. However, because the apps were very close to the code size limit, this slight change pushed them over the limit.

If I still prefer to have this optimization performed, then is my best strategy moving forward to implement this optimization as a peephole opt in my backend?

Thanks,
–Yuan

Hi, Sanjay/all,

  I noticed in rL331486 that some compare-select optimizations are disabled
in favor of providing canonicalized cmp+select to the backend.

  I am currently working on a private backend target, and the target has a
small code size limit. With this change, some of the apps went over the
codesize limit. As an example,

C code:
  b = (a > -1) ? 4 : 5;

ll code:
Before rL331486:
  %0 = lshr i16 %a.0.a.0., 15
  %1 = or i16 %0, 4

After rL331486:
  %cmp = icmp sgt i16 %a.0.a.0., -1
  %cond = select i1 %cmp, i16 4, i16 5

  With the various encoding restrictions of my particular target, the
cmp/select generated slightly larger code size. However, because the apps
were very close to the code size limit, this slight change pushed them over
the limit.

  If I still prefer to have this optimization performed, then is my best
strategy moving forward to implement this optimization as a peephole opt in
my backend?

I personally think it should probably be a DAGCombine transform,
driven by a Target Transform Info hook (e.g. like hasAndNot()).

Thanks,
--Yuan

Roman.

Do you run DAGCombiner? And are you overriding TLI.convertSelectOfConstantsToMath(VT) for your target?

For the stated example (true val and false val constants in the select differ by 1), that should already be converted.

[adding back llvm-dev and cc’ing Craig]

I think you are asking if we are missing a fold (or your target is missing enabling another hook) to transform the sext+add into shift+or? I think the answer is ‘yes’. We probably should add that fold. This seems like a similar case as the recent: https://reviews.llvm.org/D48466

Note that on x86, the sext+add becomes zext+sub:
t20: i8 = setcc t3, Constant:i16<-1>, setgt:ch
t24: i16 = zero_extend t20
t17: i16 = sub Constant:i16<5>, t24

Would that transform help your target?

I linked the wrong patch review. Here’s the patch that was actually committed:
https://reviews.llvm.org/D48508

While its true select convertSelectOfCosntantsToMath() creates add+sext i1, there’s a transform in DAGCombiner::visitADDLike that does “add (sext i1), X → sub X, (zext i1)”. Why didn’t that fire for your target?

Hi,

While its true select convertSelectOfCosntantsToMath() creates add+sext i1, there’s a transform in DAGCombiner::visitADDLike that does “add (sext i1), X → sub X, (zext i1)”. Why didn’t that fire for your target?

That’s a good question! We never explicitly marked sext-on-i1 as a custom op. And I think the default behavior is “legal” (Am I right on that?) So the following transform never gets executed because it thinks sext-on-i1 as legal op.

// add (sext i1), X → sub X, (zext i1)

if (N0.getOpcode() == ISD::SIGN_EXTEND &&
N0.getOperand(0).getValueType() == MVT::i1 &&
!TLI.isOperationLegal(ISD::SIGN_EXTEND, MVT::i1)) {
SDValue ZExt = DAG.getNode(ISD::ZERO_EXTEND, DL, VT, N0.getOperand(0));
return DAG.getNode(ISD::SUB, DL, VT, N1, ZExt);
}

I will try to change the isel behavior sext-on-i1, and see if that fixes the issue.

Yuan -

I’ve added the following DAG transforms that try to improve codegen for cmp+select patterns:

https://reviews.llvm.org/rL337130
https://reviews.llvm.org/rL338132
https://reviews.llvm.org/rL338317

Please let me know if that helped your target and/or if there are other patterns that we need to improve.