How to avoid multiple register definitions in a customInserter

Hi,

I'm lowering some of the logical operators (for example, the | operator) on 32-bit integers.
Sadly, my target only provides native instructions that operate on the high and low parts of 32-bit registers.
So I have to generate a sequence of two native instructions (LOR followed by HOR).
I've introduced a pseudo instruction with a custom inserter.

def OR_A_oo : CLPPseudoInst<(ins FPUaOffsetOperand:$OffsetA,FPUaOffsetOperand:$OffsetB),(outs FPUaROUTADDRegisterClass:$FA_ROUTADD),
, [RFLAGA],
"# OR_A_oo",
[(set FPUaROUTADDRegisterClass:$FA_ROUTADD,(or FPUaOffsetOperand:$OffsetA,FPUaOffsetOperand:$OffsetB))],NoItinerary>
{let usesCustomInserter = 1;}

Instruction selection and register allocation are performed on the pseudo.

%4:fpuaoffsetclass = LOAD_A_r @a; FPUaOffsetClass:%4
%5:fpuaoffsetclass = LOAD_A_r @b; FPUaOffsetClass:%5
%6:fpuaroutaddregisterclass = OR_A_oo killed %5, killed %4, implicit-def dead %rflaga; FPUaROUTADDRegisterClass:%6 FPUaOffsetClass:%5,%4
%7:fpuaoffsetclass = COPY %6; FPUaOffsetClass:%7 FPUaROUTADDRegisterClass:%6
MOV_A_or killed %7, @c1; FPUaOffsetClass:%7
%8:fpuaoffsetclass = LOAD_A_r @c1; FPUaOffsetClass:%8

A virtual register %6 has been allocated for the output of the pseudo. So far, so good.

My customInserter (see below) may be overly simplistic.
After investigating the code produced by my customInserter, I noticed the following problems:

  1. %6 seems to be defined twice
  2. %5 is killed twice.

320B MOV_A_ro @a, def %4; FPUaOffsetClass:%4
336B MOV_A_ro @b, def %5; FPUaOffsetClass:%5
352B %6:fpuaroutaddregisterclass = LOR_A_oo killed %5, implicit-def %rflaga; FPUaROUTADDRegisterClass:%6 FPUaOffsetClass:%5
368B %6:fpuaroutaddregisterclass = HOR_A_oo killed %5, implicit-def %rflaga; FPUaROUTADDRegisterClass:%6 FPUaOffsetClass:%5
384B %7:fpuaoffsetclass = COPY %6; FPUaOffsetClass:%7 FPUaROUTADDRegisterClass:%6
400B MOV_A_or killed %7, @c1; FPUaOffsetClass:%7
416B MOV_A_ro @c1, def %8; FPUaOffsetClass:%8

Result: an assertion fails!

********** PROCESS IMPLICIT DEFS **********
********** Function: _start
llc: /home/dte/eclipse-workspace/llvm/lib/CodeGen/MachineRegisterInfo.cpp:366: llvm::MachineInstr* llvm::MachineRegisterInfo::getVRegDef(unsigned int) const: Assertion `(I.atEnd() || std::next(I) == def_instr_end()) && "getVRegDef assumes a single definition or no definition"' failed.
LLVMSymbolizer: error reading file: No such file or directory

Here is the inner part of my customInserter.
Are there any additional actions to perform in the customInserter?

MachineBasicBlock *
CLPTargetLowering::emitLOpcodeHOpcode(MachineInstr &MI,
                                      MachineBasicBlock *MBB,
                                      unsigned LOpcode,
                                      unsigned HOpcode) const {
  const TargetInstrInfo *TII = Subtarget->getInstrInfo();
  DebugLoc Loc = MI.getDebugLoc();

  const MachineOperand operand0 = MI.getOperand(0);
  const MachineOperand operand1 = MI.getOperand(1);

  BuildMI(*MBB, MI, Loc, TII->get(LOpcode))
      .add(operand0)
      .add(operand1);
  BuildMI(*MBB, MI, Loc, TII->get(HOpcode))
      .add(operand0)
      .add(operand1);

  MI.eraseFromParent();

  return MBB;
}

Dominique Torette

Hi Dominique,

> My customInserter (see below) may be overly simplistic.
> After investigating the code produced by my customInserter, I noticed the following problems:
>         1) %6 seems to be defined twice
>         2) %5 is killed twice.

Both are true and that's what you've asked for :).

> BuildMI(*MBB, MI, Loc, TII->get(LOpcode))
>     .add(operand0)
>     .add(operand1);
> BuildMI(*MBB, MI, Loc, TII->get(HOpcode))
>     .add(operand0)

Here you write to the same result register a second time, which is
forbidden while the representation is in SSA form.
You can fix that by:

1. Having two different virtual registers (createVirtualXXX, e.g.
   MachineRegisterInfo::createVirtualRegister) and then putting them
   together in the actual definition. E.g.,
     newV1 = LOR
     newV2 = HOR
     operand0 = REG_SEQUENCE newV1, low_subreg, newV2, high_subreg

2. Directly using the subreg indices:
     operand0 = IMPLICIT_DEF
     operand0.low_subreg = LOR
     operand0.high_subreg = HOR

3. Expanding your pseudo after reg alloc.

#3 may be the right way to go for you, depending on whether or not you
have subreg indices. I.e., if you don't define subreg indices, then #1
and #2 are not possible.
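
To make the single-definition rule concrete, here is a minimal, self-contained sketch. It is a toy model, not LLVM code: `DefInstr`, `hasSingleDefs`, and the register numbers %9 and %10 are made up for illustration. It only shows why two instructions defining %6 trip the `getVRegDef` assumption, and why routing the halves through fresh registers (as in option 1) does not.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Toy stand-in for a machine instruction: an opcode name plus the virtual
// register it defines (0 means "no def"). This is NOT llvm::MachineInstr.
struct DefInstr {
  std::string Opcode;
  unsigned DefVReg;
};

// Returns true when every virtual register has at most one defining
// instruction, which is the property the failing assertion checks.
bool hasSingleDefs(const std::vector<DefInstr> &Block) {
  std::map<unsigned, unsigned> DefCount;
  for (const DefInstr &I : Block)
    if (I.DefVReg != 0 && ++DefCount[I.DefVReg] > 1)
      return false;
  return true;
}

// The expansion from the question: LOR and HOR both define %6.
std::vector<DefInstr> brokenExpansion() {
  return {{"LOR_A_oo", 6}, {"HOR_A_oo", 6}};
}

// Option 1: each half gets a fresh register (%9 and %10 are hypothetical),
// and a REG_SEQUENCE-like instruction is the only def of %6.
std::vector<DefInstr> regSequenceExpansion() {
  return {{"LOR_A_oo", 9}, {"HOR_A_oo", 10}, {"REG_SEQUENCE", 6}};
}
```

In this model `hasSingleDefs(brokenExpansion())` is false and `hasSingleDefs(regSequenceExpansion())` is true, which mirrors the difference between the original inserter and option 1.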

>     .add(operand1);

Here you extend the live range of operand1, so you need to
clear the kill flags on the first instruction.
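
The kill-flag rule can be sketched the same way. The following is a toy model under stated assumptions, not the LLVM API: `UseOperand`, `UseInstr`, and `expandUses` are hypothetical names. It copies the pseudo's use operands onto each instruction of the expansion and keeps a kill flag only on the last one, since every earlier instruction is followed by another read of the same register.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy stand-in for a register use operand; NOT llvm::MachineOperand.
struct UseOperand {
  unsigned VReg;
  bool IsKill;
};

// Toy instruction: just its list of register uses.
struct UseInstr {
  std::vector<UseOperand> Uses;
};

// Copies the pseudo's use operands onto every instruction of the expansion,
// keeping a kill flag only on the last instruction of the sequence.
std::vector<UseInstr> expandUses(const std::vector<UseOperand> &PseudoUses,
                                 std::size_t NumInstrs) {
  std::vector<UseInstr> Seq(NumInstrs);
  for (std::size_t I = 0; I < NumInstrs; ++I) {
    for (UseOperand U : PseudoUses) {
      if (I + 1 != NumInstrs)
        U.IsKill = false; // the register is still read by a later instruction
      Seq[I].Uses.push_back(U);
    }
  }
  return Seq;
}
```

Called with the two killed operands of the pseudo (%5 and %4) and a sequence length of 2, the first instruction ends up with no kill flags and the second keeps both.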

Cheers,
-Quentin

For instructions that modify a register but preserve some part of it, the typical approach is to add an extra operand that provides the pre-existing value of the register. For example:

def LOR: Instruction {
   let OutOperandList = (outs RC:$Rd);
   // Rx will hold the incoming value; the high half of it will be preserved.
   let InOperandList = (ins RC:$Rx, RC:$Ra, RC:$Rb);
   let Constraints = "$Rd = $Rx";
}

Similarly for HOR. Then have an instruction sequence like:

   %10 = IMPLICIT_DEF ; if there is no preexisting value
   %11 = LOR %10, %..., %...
   %12 = HOR %11, %..., %...
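
As a sanity check on the data flow of that sequence, here is a small arithmetic model. It rests on assumptions that are not stated in the thread: that the "low" and "high" parts are 16-bit halves, that LOR ORs the low halves of its two sources while passing through the high half of the incoming value, and that HOR does the mirror image. The function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Assumed LOR semantics: OR the low 16 bits of A and B, and keep the
// high 16 bits of the incoming value Prev (the tied operand).
std::uint32_t lorModel(std::uint32_t Prev, std::uint32_t A, std::uint32_t B) {
  return (Prev & 0xFFFF0000u) | ((A | B) & 0x0000FFFFu);
}

// Assumed HOR semantics: OR the high 16 bits of A and B, and keep the
// low 16 bits of the incoming value Prev.
std::uint32_t horModel(std::uint32_t Prev, std::uint32_t A, std::uint32_t B) {
  return (Prev & 0x0000FFFFu) | ((A | B) & 0xFFFF0000u);
}

// Mirrors the three-instruction sequence: each step produces a new value
// from the previous one, so no value is ever defined twice.
std::uint32_t or32(std::uint32_t A, std::uint32_t B) {
  std::uint32_t V10 = 0;                   // stands in for IMPLICIT_DEF
  std::uint32_t V11 = lorModel(V10, A, B); // %11 = LOR %10, ...
  std::uint32_t V12 = horModel(V11, A, B); // %12 = HOR %11, ...
  return V12;
}
```

Under these assumptions `or32(A, B)` equals `A | B` for any inputs, because the two steps together overwrite both halves of the initial value.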

-Krzysztof

Hi Quentin,

Thanks _again_ for your advice. But I would like to rephrase my understanding.

I was defining subreg indices, so this work is now half done. #1 and #2 could be options.
But I also have the feeling that #3 is simpler and the way to go...

I still have questions:
How can I expand my pseudo _after_ reg alloc?
EmitInstrWithCustomInserter() is called only once, during the "Expand ISel Pseudo-instructions" pass.
Do I have to reconfigure the pass sequence (only for my specific target)?
In that case, isn't it simpler to try #1 or #2?

I've found the MachineOperand::setIsKill(false) API, which I could use to extend the live range of the operands of the first instruction of the sequence.
Is that what I need?

Regards, Dominique T.

Hi Dominique,

> Hi Quentin,
>
> Thanks _again_ for your advice. But I would like to rephrase my understanding.

You're welcome!

> I was defining subreg indices, so this work is now half done. #1 and #2 could be options.
> But I also have the feeling that #3 is simpler and the way to go...
>
> I still have questions:
> How can I expand my pseudo _after_ reg alloc?

There is a generic pass that you can piggyback on:
ExpandPostRA

It should be in your pipeline already.

The one hook that you need to implement is:
TargetInstrInfo::expandPostRAPseudo

Note: Your instruction needs to be defined with the isPseudo flag in
your .td file for this hook to be called.

Alternatively, you can write your own pass. Look at
AArch64ExpandPseudoPass for an example.
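
The reason post-RA expansion sidesteps the original problem is that physical registers may be written more than once. The sketch below is a toy model only, not an implementation of the real hook: `PhysInstr`, `expandPseudos`, and the register names R0, R1, R2 are hypothetical. It replaces each OR_A_oo pseudo with an LOR_A_oo/HOR_A_oo pair that writes the same destination, and reports whether anything changed.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy post-RA instruction: an opcode plus physical register names.
// This is NOT llvm::MachineInstr.
struct PhysInstr {
  std::string Opcode;
  std::string Dst;
  std::string SrcA;
  std::string SrcB;
};

// Replaces every OR_A_oo pseudo with an LOR_A_oo/HOR_A_oo pair that writes
// the same physical destination. Returns true if anything was expanded.
bool expandPseudos(std::vector<PhysInstr> &Block) {
  std::vector<PhysInstr> Out;
  bool Changed = false;
  for (const PhysInstr &I : Block) {
    if (I.Opcode != "OR_A_oo") {
      Out.push_back(I); // not a pseudo we know about: keep it unchanged
      continue;
    }
    Out.push_back({"LOR_A_oo", I.Dst, I.SrcA, I.SrcB});
    Out.push_back({"HOR_A_oo", I.Dst, I.SrcA, I.SrcB});
    Changed = true;
  }
  Block = Out;
  return Changed;
}
```

In a real target the same replacement would be done with BuildMI inside the hook named above, with the pseudo erased afterwards; the details depend on the target's instruction definitions.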

> EmitInstrWithCustomInserter() is called only once, during the "Expand ISel Pseudo-instructions" pass.
> Do I have to reconfigure the pass sequence (only for my specific target)?
> In that case, isn't it simpler to try #1 or #2?
>
> I've found the MachineOperand::setIsKill(false) API, which I could use to extend the live range of the operands of the first instruction of the sequence.
> Is that what I need?

Yes, that's the one!

Cheers,
-Quentin