Help with promotion/custom handling of MUL i32 and MUL i64

I’ll try to run through the scenario:

64-bit register type target (all registers have 64 bits).

All 32-bit integers are getting promoted to 64-bit integers.

Problem:

MUL on i32 is getting promoted to MUL on i64

MUL on i64 is getting expanded to a library call in compiler-rt

The problem is that the i32 MUL gets promoted and then converted into a subroutine call because it is now of type i64, even though I want the i32 MUL to remain an operation in the architecture. The i32 MUL would generate a 64-bit result from the lower 32-bit portions of the 64-bit source operands.
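
To make the intended semantics concrete, here is a minimal plain C++ model of the instruction described above. It is not LLVM code. The names `mul32`, `mulI32ViaMul32`, and `checkMul32Model` are hypothetical, and treating the low halves as unsigned is an assumption, since the post does not say whether the hardware multiply is signed or unsigned.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of the target's 32-bit multiply: it reads only the low
// 32 bits of each 64-bit register and produces a full 64-bit product.
// Assumption: the low halves are treated as unsigned values.
inline uint64_t mul32(uint64_t lhs, uint64_t rhs) {
  uint64_t lo_lhs = lhs & 0xFFFFFFFFull;
  uint64_t lo_rhs = rhs & 0xFFFFFFFFull;
  return lo_lhs * lo_rhs;
}

// An i32 multiply only needs the low 32 bits of the product, so it can be
// modeled as a truncation of the MUL32 result. Whatever sits in the upper
// halves of the source registers is ignored.
inline uint32_t mulI32ViaMul32(uint64_t lhs_reg, uint64_t rhs_reg) {
  return static_cast<uint32_t>(mul32(lhs_reg, rhs_reg));
}

// Small usage example: garbage in the upper register halves does not
// change the result.
inline void checkMul32Model() {
  assert(mul32(0xFFFFFFFF00000003ull, 0x1234567800000005ull) == 15ull);
  assert(mulI32ViaMul32(0xFFFFFFFFull, 2ull) == 0xFFFFFFFEu);
}
```

Under these assumptions a promoted i32 MUL never needs the 64-bit library routine, which is why keeping the two cases apart matters.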

In the custom lowering for these operations, I am trying to do something like:

case ISD::MUL:
{
  EVT OpVT = Op.getValueType();
  if (OpVT == MVT::i64) {
    RTLIB::Libcall LC = RTLIB::MUL_I64;
    SDValue Dummy;
    return ExpandLibCall(LC, Op, DAG, false, Dummy, *this);
  }
  else if (OpVT == MVT::i32) {
    // ??? What to do here to not have issues with type i32
  }
}

I’ve gone a few directions on this.

Defining i32 as an architecture type leads to a lot of changes, which I don’t think is the most straightforward approach.

I would think there is a way to promote the i32 MUL but still be able to “see” it as an i32 MUL somewhere later in the lowering process.

Are there suggestions on how to promote the type but still lower the original i64 MUL to a call and the original i32 MUL to an operation?

I'll try to run through the scenario:

64-bit register type target (all registers have 64 bits).

all 32-bits are getting promoted to 64-bit integers

Problem:

MUL on i32 is getting promoted to MUL on i64

MUL on i64 is getting expanded to a library call in compiler-rt

Can you fix this by marking i64 MUL as Legal?

the problem is that MUL32 gets promoted and then converted into a
subroutine call because it is now type i64, even though I want the MUL I32
to remain as an operation in the architecture. MUL i32 would generate a
64-bit results from the lower 32-bit portions of 64-bit source operands.

In customize for the operations, I am trying to do something like:

case ISD::MUL:
        {
         EVT OpVT = Op.getValueType();
          if (OpVT == MVT::i64) {
            RTLIB::Libcall LC = RTLIB::MUL_I64;
            SDValue Dummy;
            return ExpandLibCall(LC, Op, DAG, false, Dummy, *this);
          }
          else if (OpVT == MVT::i32){

            ??? What to do here to not have issues with type i32
          }
        }

I've gone a few directions on this.

Defining the architecture type i32 leads to a lot of changes that I don't
think is the most straightforward change.

When you say 'defining an architecture type' do you mean with
addRegisterClass() in your TargetLowering constructor? If so, then this
would be my recommendation. Can you elaborate more on what is
preventing you from doing this.

Would think there is a way to promote the MUL i32 types but still be able
to "see" that as a MUL i32 somewhere down the lowering process.

The R600 backend does something similar to this. It has 24-bit MUL and
MAD instructions and selects these by looking at an i32 integer and
trying to infer whether or not it is really a 24-bit value.
See the SelectI24 and SelectU24 functions in AMDGPUISelDAGToDAG.cpp.
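
The kind of inference described here can be sketched outside LLVM as a known-bits check. This is a simplified plain C++ model under stated assumptions, not the code in AMDGPUISelDAGToDAG.cpp: `KnownZeroMask`, `knownZeroAfterAnd`, `isU24`, `canUseMul24`, and `checkU24Inference` are hypothetical names, and only the unsigned case is covered.

```cpp
#include <cassert>
#include <cstdint>

// Bits set in the mask are proven at compile time to be zero in the value.
struct KnownZeroMask {
  uint32_t mask;
};

// Hypothetical helper: an AND with a constant proves that every bit cleared
// in the constant is zero in the result, on top of what was already known.
inline KnownZeroMask knownZeroAfterAnd(KnownZeroMask input, uint32_t constant) {
  return KnownZeroMask{input.mask | ~constant};
}

// An i32 value is really an unsigned 24-bit value if its top 8 bits are
// known to be zero.
inline bool isU24(KnownZeroMask known) {
  const uint32_t top8 = 0xFF000000u;
  return (known.mask & top8) == top8;
}

// Select the narrow multiply only when both operands are provably 24-bit.
inline bool canUseMul24(KnownZeroMask lhs, KnownZeroMask rhs) {
  return isU24(lhs) && isU24(rhs);
}

// Small usage example: masking with 0x00FFFFFF makes a value provably 24-bit.
inline void checkU24Inference() {
  KnownZeroMask unknown{0u};
  assert(!isU24(unknown));
  assert(isU24(knownZeroAfterAnd(unknown, 0x00FFFFFFu)));
}
```

The same idea would apply to a 64-bit value whose upper 32 bits are known to be zero, although whether that fits the target in question is not established in this thread.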

-Tom

Thanks for the information. Maybe I can rephrase the question or issue.

Assume 64-bit register types, but a 32-bit integer type. I already have the TableGen descriptions of the 64-bit operations.

How about this modified approach?

Before type legalization, I’d really like to turn every i64 MUL into a subroutine call of my own choice.

This would be a form of customization, but I want it to happen before type legalization. Right now, type legalization promotes every i32 MUL to 64 bits, and I lose the ability to differentiate between what was originally a MUL on 64-bit values and a MUL on 32-bit values.

The only thing that I have seen happen at DAG selection is the lowering of custom intrinsic functions like memcpy:

./Target/X86/X86SelectionDAGInfo.cpp:178:X86SelectionDAGInfo::EmitTargetCodeForMemcpy(SelectionDAG &DAG,

Is there a general SelectionDAG conversion that can be made to happen before all type promotions?

Again, as I understand it, even modifications in ISelDAGToDAG.cpp would take effect after type promotion.

Hi Dan,

If you set the node's action to "Custom", you should be able to
interfere in the type legalisation phase (before it gets promoted to a
64-bit MUL) by overriding the "ReplaceNodeResults" function.

You could either expand it to a different libcall directly there, or
replace it with a target-specific node (say XXXISD::MUL32) which
claims to take i64 types but you really know is the 32-bit multiply.
Then you'd have to take care of that node elsewhere, of course.
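
As a rough illustration of why the decision has to be made while the original type is still visible, here is a toy plain C++ model. It does not use LLVM's API: `Node`, `ValueType`, `Opcode`, `promoteThenLower`, and `customLegalizeMul` are hypothetical names, and `__muldi3` stands in for whichever 64-bit multiply routine the target ends up calling.

```cpp
#include <cassert>
#include <string>

enum class ValueType { i32, i64 };
enum class Opcode { Mul, Mul32, LibCall };

// Toy stand-in for a DAG node: an opcode, a result type, and (for calls)
// the name of the routine being called.
struct Node {
  Opcode op;
  ValueType type;
  std::string callee;
};

// Default path: promotion widens the i32 MUL first, so by the time the
// operation is lowered every MUL is i64 and all of them become libcalls.
inline Node promoteThenLower(Node mul) {
  assert(mul.op == Opcode::Mul);
  mul.type = ValueType::i64;  // the original width is lost here
  return Node{Opcode::LibCall, mul.type, "__muldi3"};
}

// Custom path: decide while the original type is still known. The i32 MUL
// becomes a target node that is typed i64 but known to be a 32-bit multiply.
inline Node customLegalizeMul(const Node &mul) {
  assert(mul.op == Opcode::Mul);
  if (mul.type == ValueType::i64)
    return Node{Opcode::LibCall, ValueType::i64, "__muldi3"};
  return Node{Opcode::Mul32, ValueType::i64, ""};
}
```

In this model the two paths agree for an i64 MUL and differ only for an i32 MUL, which is the distinction being asked about.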

Cheers.

Tim.

Thanks Tim. I really appreciate your insight.

I’m able to use the custom lowering to send the 64-bit case to a subroutine, and for the 32-bit case I am generating XXXISD::MUL32. I’m not sure, then, what you mean by “overriding” ReplaceNodeResults.

For ReplaceNodeResults, I’m doing:

SDValue Res = LowerOperation(SDValue(N, 0), DAG);

for (unsigned I = 0, E = Res->getNumValues(); I != E; ++I)
  Results.push_back(Res.getValue(I));

I did have to put in the following as well:

SDValue LHS = Op.getOperand(0);
SDValue RHS = Op.getOperand(1);
LHS = DAG.getNode(ISD::ZERO_EXTEND, dl, MVT::i64, LHS);
RHS = DAG.getNode(ISD::ZERO_EXTEND, dl, MVT::i64, RHS);
return DAG.getNode(XXXISD::MUL32, Op->getDebugLoc(), MVT::i64, LHS, RHS);

I needed this so that the new operation could go forward and be matched with its input operands (which were still i32 and not yet type-legalized to i64). Does this make sense to you?

Here’s what I am using to generate the XXXISD::MUL32:

if (OpVT != MVT::i64) {
  //Op.getNode()->dumpr();

  SDValue LHS = Op.getOperand(0);
  SDValue RHS = Op.getOperand(1);
  LHS = DAG.getNode(ISD::ZERO_EXTEND, dl, MVT::i64, LHS);
  RHS = DAG.getNode(ISD::ZERO_EXTEND, dl, MVT::i64, RHS);

  return DAG.getNode(XXXISD::MUL32, Op->getDebugLoc(), MVT::i64, LHS, RHS);
}

I’m not sure whether the above is correct.

It then runs into a problem with the next ADD instruction, which fails in PromoteIntRes_SimpleIntBinOp when it tries to check the XXXISD::MUL32:

(gdb) p N->dumpr()
0x23ad0d0: i32 = add [ID=0] 0x23aff60, 0x23acfd0
0x23aff60: i64 = <<Unknown Node #192>> [ID=-3] 0x23b0260, 0x23b0160
0x23b0260: i64 = and [ID=-3] 0x23af660, 0x23b0060: i64 = Constant<4294967295> [ID=-3]
0x23af660: i64,ch = load<LD4[@i], anyext from i32> [ID=-3] 0x238b068: ch = EntryToken [ID=-3], 0x23ac7d0: i64 = GlobalAddress<i32* @i> 0 [ID=-3], 0x23ac9d0: i64 = undef [ID=-3]
0x23b0160: i64 = and [ID=-3] 0x23afc60, 0x23b0060: i64 = Constant<4294967295> [ID=-3]
0x23afc60: i64,ch = load<LD4[@j], anyext from i32> [ID=-3] 0x238b068: ch = EntryToken [ID=-3], 0x23acbd0: i64 = GlobalAddress<i32* @j> 0 [ID=-3], 0x23ac9d0: i64 = undef [ID=-3]
0x23acfd0: i32,ch = load<LD4[@k]> [ID=-3] 0x238b068: ch = EntryToken [ID=-3], 0x23aced0: i64 = GlobalAddress<i32* @k> 0 [ID=-3], 0x23ac9d0: i64 = undef [ID=-3]

When you say that I’ll have to take care of the node elsewhere, does that mean defining a proper way to lower it, like below? I found that if I don’t handle XXXISD::MUL32 in LowerOperation, then after the custom lowering of MUL creates it, compilation just dies, not knowing how to lower the machine op. I would have thought that there was a default path for any XXXISD operation? And I didn’t see other targets generating their machine ops.

SDValue XXXTargetLowering::
LowerOperation(SDValue Op, SelectionDAG &DAG) const {

case XXXISD::MUL32:
return SDValue();

Really appreciate your help and any other pointers.

Dan

Hi Dan,

I'll try to run through the scenario:

64-bit register type target (all registers have 64 bits).

all 32-bits are getting promoted to 64-bit integers

Problem:

MUL on i32 is getting promoted to MUL on i64

MUL on i64 is getting expanded to a library call in compiler-rt

the problem is that MUL32 gets promoted and then converted into a subroutine
call because it is now type i64, even though I want the MUL I32 to remain as an
operation in the architecture. MUL i32 would generate a 64-bit results from the
lower 32-bit portions of 64-bit source operands.

I think you should register custom type promotion logic, see
LegalizeIntegerTypes.cpp, line 40. When this gets passed a 32
bit multiplication, it should promote it to a 64 bit operation
using the target specific node that does your special multiplication.

Ciao, Duncan.

From Duncan:
I think you should register custom type promotion logic, see
LegalizeIntegerTypes.cpp, line 40. When this gets passed a 32
bit multiplication, it should promote it to a 64 bit operation
using the target specific node that does your special multiplication.

I think that's what he's doing.

From Dan:
    SDValue LHS = Op.getOperand(0);
    SDValue RHS = Op.getOperand(1);
    LHS = DAG.getNode(ISD::ZERO_EXTEND, dl, MVT::i64, LHS);
    RHS = DAG.getNode(ISD::ZERO_EXTEND, dl, MVT::i64, RHS);
    return DAG.getNode(XXXISD::MUL32, Op->getDebugLoc(), MVT::i64, LHS, RHS);

I think you should return an ISD::TRUNCATE of that MUL32. The truncate
is only temporary and will be removed when the ADD you refer to later
gets promoted, but it keeps the types correct in the interim (you
don't have an "i32 add" of an i64 and an i32) and allows the legalizer
to register your MUL32 as the promoted value.
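
A small arithmetic check of why dropping the truncate later is harmless, written as plain C++ rather than LLVM code. The names `mul32`, `addBeforePromotion`, `addAfterPromotion`, and `checkTruncateIsTemporary` are hypothetical, and the unsigned treatment of the low halves is an assumption.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of the target node: multiply the low 32 bits of two
// 64-bit registers (assumed unsigned) into a 64-bit result.
inline uint64_t mul32(uint64_t lhs, uint64_t rhs) {
  return (lhs & 0xFFFFFFFFull) * (rhs & 0xFFFFFFFFull);
}

// Interim DAG: an i32 add whose first operand is a truncate of MUL32, so
// both operands of the add have the same type.
inline uint32_t addBeforePromotion(uint64_t i, uint64_t j, uint32_t k) {
  uint32_t truncated = static_cast<uint32_t>(mul32(i, j));  // the truncate
  return truncated + k;  // wraps modulo 2^32
}

// After the add itself is promoted: a 64-bit add that consumes MUL32
// directly. Only the low 32 bits of the result are meaningful.
inline uint64_t addAfterPromotion(uint64_t i, uint64_t j, uint64_t k_promoted) {
  return mul32(i, j) + k_promoted;
}

// Small usage example: the low 32 bits agree even when the promoted operand
// carries unrelated upper bits.
inline void checkTruncateIsTemporary() {
  uint64_t wide = addAfterPromotion(0x10000ull, 0x10000ull, 0xABCD000000000007ull);
  assert(static_cast<uint32_t>(wide) == addBeforePromotion(0x10000ull, 0x10000ull, 7u));
}
```

Because addition modulo 2^32 depends only on the low 32 bits of its inputs, the two forms compute the same i32 value in this model.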

When you say that I'll have to take care of the node elsewhere, does that
mean in defining it as a proper way to lower? Like below?

Either lower it as you're talking about or select it from your
InstrInfo.td if there's an actual instruction that will do the work.

I would have thought that there was a default path for any XXXISD operation?

There's no default path for target-specific nodes. LLVM can't possibly
know what they're supposed to be.

Cheers.

Tim.