[SelectionDAG] Wrong type for shift libcalls when expanding funnelled shifts

Hey all,

I've recently encountered a problem after upgrading the LLVM version
of our downstream,
and it seems to reproduce in other backends as well. It's a bit
esoteric, as it's rare for
shift libcalls to be emitted at all, but I've reproduced it on top of
latest trunk and can
demonstrate the subtle codegen issue on e.g. RISCV32 (using minsize to
induce libcall
instead of expand):

// fshr.ll
declare i64 @llvm.fshr.i64(i64, i64, i64)

define i64 @foo(i64 %__val, i64 %__shift) minsize {
  %cond = tail call i64 @llvm.fshr.i64(i64 %__val, i64 %__val, i64 %__shift)
  ret i64 %cond
}

RUN:
llc --mtriple=riscv32 fshr.ll -o -

OUTPUT:
[snipped]
mv s0, a2
mv s2, a1
mv s1, a0
andi a2, a2, 63
mv a3, zero ; This is redundant
call __lshrdi3@plt
mv s3, a0
mv s4, a1
neg a0, s0
andi a2, a0, 63
mv a0, s1
mv a1, s2
mv a3, zero ; This is redundant
call __ashldi3@plt
or a0, s3, a0
or a1, s4, a1
[snipped]

The problem is that TargetLowering::makeLibCall is called with
Ops[1].getValueType() == MVT::i64
from DAGTypeLegalizer::ExpandIntRes_Shift, which is incorrect because
shift libcalls have MVT::i32
for their second argument (. In this particular case, this leads to a
harmless-but-redundant zeroing
of a register, but in the general case it's an ABI mismatch.

The path that the SelectionDAG node undergoes is:
1. SelectionDAGBuilder::visitIntrinsicCall, Intrinsic::fshr, X == Y =>
create ISD::ROTR.
2. DAGTypeLegalizer::ExpandIntRes_Rotate
3. TargetLowering::expandROT => No support for rotate, create shifts
(but with ShVT = Op1.getValueType(), so MVT::i64)
4. DAGTypeLegalizer::ExpandIntRes_Shift => Libcallize => called with {
N->getOperand(0), N->getOperand(1) } as operands, but the second
operand is MVT::i64.

When I traced the path that regular shifts undergo, they start out
life already with shift amount type according to TLI
getShiftAmountType() + some logic to extend/truncate, see
SelectionDAGBuilder::visitShift.

I think a correct fix would be to change ExpandIntRes_Shift's
libcallization flow to ZExtOrTrunc (or potentially AnyExtOrTrunc) the
second argument to MVT::i32,
but I don't have enough perspective to be confident that that's a
correct and sufficient fix.

Appreciate your input!

Could it perhaps be two bugs here?

After legalization the shift amount type should match TLI.getShiftAmountTy()
for SHL/SRL/ROT etc.

So if DAGTypeLegalizer::ExpandIntRes_Rotate is emitting SHL/SRL nodes that
has a different type for the shift amount that sounds a bit weird to me.
Or is that supposed to be legalized after having expanded the rotate result?

Nevertheless, DAGTypeLegalizer::ExpandIntRes_Shift can't assume that
the shift amount operand already has a been legalized, so it must
handle that the type isn't matching TLI.getShiftAmountTy().
Although, I don't know if TLI.getShiftAmountTy() necessarily need to
match with "si_int" that the LIBC function is expecting. For the libcall
to be correct (ABI wise), then I think the shift amount should be converted
to something that matches with DAG.getLibInfo().getIntSize().

/Björn

At the very least I’ve fixed my current patch to take DAG.getLibInfo().getIntSize() into account:
https://reviews.llvm.org/D110508

I’m not sure regarding the point about ExpandIntRes_Rotate (or rather, TargetLowering::expandROT);
I do see other methods of TargetLowering that use getShiftAmountTy() (e.g. expandCTPOP, expandCTLZ),
whereas expandROT simply uses Op1.getValueType().

Itay