Nested instruction patterns rejected by GlobalISel when having registers in Defs

Hi Daniel,

Thanks for replying; I was hoping to get in touch with you on this issue.

I had a look at how SelectionIDAG does it when generating the matcher table, and it does consider the implicit defs as additional output. Here is the match table generated for the pattern:

/* 0*/ OPC_CheckOpcode, TARGET_VAL(ISD::SIGN_EXTEND),

/* 3*/ OPC_MoveChild0,

/* 4*/ OPC_CheckOpcode, TARGET_VAL(ISD::SHL),

/* 7*/ OPC_MoveChild0,

/* 8*/ OPC_CheckOpcode, TARGET_VAL(ISD::ANY_EXTEND),

/* 11*/ OPC_RecordChild0, // #0 = $src

/* 12*/ OPC_CheckChild0Type, MVT::i16,

/* 14*/ OPC_MoveParent,

/* 15*/ OPC_RecordChild1, // #1 = $imm

/* 16*/ OPC_MoveChild1,

/* 17*/ OPC_CheckOpcode, TARGET_VAL(ISD::Constant),

/* 20*/ OPC_CheckPredicate, 0, // Predicate_Imm_17_31_i16

/* 22*/ OPC_CheckType, MVT::i16,

/* 24*/ OPC_MoveParent,

/* 25*/ OPC_CheckType, MVT::i32,

/* 27*/ OPC_MoveParent,

/* 28*/ OPC_CheckType, MVT::i40,

/* 30*/ OPC_EmitNode1, TARGET_VAL(TargetOpcode::IMPLICIT_DEF), 0,

MVT::i40, 0/#Ops/, // Results = #2

/* 36*/ OPC_EmitNode1, TARGET_VAL(TargetOpcode::IMPLICIT_DEF), 0,

MVT::i32, 0/#Ops/, // Results = #3

/* 42*/ OPC_EmitInteger, MVT::i32, OurTarget::hi16, // Results = #4

/* 45*/ OPC_EmitNode1, TARGET_VAL(TargetOpcode::INSERT_SUBREG), 0,

MVT::i32, 3/#Ops/, 3, 0, 4, // Results = #5

/* 54*/ OPC_EmitNode1, TARGET_VAL(OurTarget::clearLo32_pseudo), 0,

MVT::i32, 1/#Ops/, 5, // Results = #6

/* 61*/ OPC_EmitInteger, MVT::i16, 0,

/* 64*/ OPC_EmitRegister, MVT::i16, 0 /zero_reg/,

/* 67*/ OPC_EmitInteger, MVT::i16, 0,

/* 70*/ OPC_EmitNode2, TARGET_VAL(OurTarget::shfts_a32_imm7), 0,

MVT::i32, MVT::i16, 5/#Ops/, 6, 1, 7, 8, 9, // Results = #10 #11

/* 82*/ OPC_EmitInteger, MVT::i32, OurTarget::acc,

/* 85*/ OPC_EmitNode1, TARGET_VAL(TargetOpcode::INSERT_SUBREG), 0,

MVT::i40, 3/#Ops/, 2, 10, 12, // Results = #13

/* 94*/ OPC_EmitInteger, MVT::i16, 0,

/* 97*/ OPC_EmitRegister, MVT::i16, 0 /zero_reg/,

/* 100*/ OPC_EmitInteger, MVT::i16, 0,

/* 103*/ OPC_MorphNodeTo1, TARGET_VAL(OurTarget::sext_a32), 0,

MVT::i40, 4/#Ops/, 13, 14, 15, 16, …

The line of interest here is the one below /* 70 /. There, we can see that the instruction produces two results of type MVT::i32 (the value produced by the instruction) respectively MVT::i16 (the CCReg updated by the instruction). These are labeled as results #10 respectively #11. Looking at operand identifiers after /#Ops*/, we can see that only #10 is used by the rest of the resulting pattern (by INSERT_SUBREG), which is as intended.

However, in the destination pattern declared in the *.td file, there is no information pertaining to which of the results should be used by the parent node. Since only tree-shaped patterns are allowed, SelectionIDAG must somehow decide which of the results are to be used by the parent node. And this decision is taken at lines 869-870 in DAGISelMatcherGen.cpp:

unsigned FinalNumOps = InstOps.size() + NumSubOps;

while (InstOps.size() < FinalNumOps) {

const TreePatternNode *Child = N->getChild(ChildNo);

unsigned BeforeAddingNumOps = InstOps.size();

EmitResultOperand(Child, InstOps);

assert(InstOps.size() > BeforeAddingNumOps && “Didn’t add any operands”);

// If the operand is an instruction and it produced multiple results, just

// take the first one.

if (!Child->isLeaf() && Child->getOperator()->isSubClassOf(“Instruction”))

InstOps.resize(BeforeAddingNumOps+1);

++ChildNo;

}

In other words, if a child produces more than one result SelectionIDAG always takes the first result. Those two lines originate from a patch by Chris Lattner (r99725), so presumably this is the correct way of doing it. If GlobalISel were to do the same, it would solve the issue I’m having with our patterns.

Best regards,

Gabriel

Hi Daniel,

Thanks for replying; I was hoping to get in touch with you on this issue.

I had a look at how SelectionIDAG does it when generating the matcher table, and it does consider the implicit defs as additional output. Here is the match table generated for the pattern:

/* 0*/ OPC_CheckOpcode, TARGET_VAL(ISD::SIGN_EXTEND),
/* 3*/ OPC_MoveChild0,
/* 4*/ OPC_CheckOpcode, TARGET_VAL(ISD::SHL),
/* 7*/ OPC_MoveChild0,
/* 8*/ OPC_CheckOpcode, TARGET_VAL(ISD::ANY_EXTEND),
/* 11*/ OPC_RecordChild0, // #0 = $src
/* 12*/ OPC_CheckChild0Type, MVT::i16,
/* 14*/ OPC_MoveParent,
/* 15*/ OPC_RecordChild1, // #1 = $imm
/* 16*/ OPC_MoveChild1,
/* 17*/ OPC_CheckOpcode, TARGET_VAL(ISD::Constant),
/* 20*/ OPC_CheckPredicate, 0, // Predicate_Imm_17_31_i16
/* 22*/ OPC_CheckType, MVT::i16,
/* 24*/ OPC_MoveParent,
/* 25*/ OPC_CheckType, MVT::i32,
/* 27*/ OPC_MoveParent,
/* 28*/ OPC_CheckType, MVT::i40,
/* 30*/ OPC_EmitNode1, TARGET_VAL(TargetOpcode::IMPLICIT_DEF), 0,
               MVT::i40, 0/*#Ops*/, // Results = #2
/* 36*/ OPC_EmitNode1, TARGET_VAL(TargetOpcode::IMPLICIT_DEF), 0,
               MVT::i32, 0/*#Ops*/, // Results = #3
/* 42*/ OPC_EmitInteger, MVT::i32, OurTarget::hi16, // Results = #4
/* 45*/ OPC_EmitNode1, TARGET_VAL(TargetOpcode::INSERT_SUBREG), 0,
               MVT::i32, 3/*#Ops*/, 3, 0, 4, // Results = #5
/* 54*/ OPC_EmitNode1, TARGET_VAL(OurTarget::clearLo32_pseudo), 0,
               MVT::i32, 1/*#Ops*/, 5, // Results = #6
/* 61*/ OPC_EmitInteger, MVT::i16, 0,
/* 64*/ OPC_EmitRegister, MVT::i16, 0 /*zero_reg*/,
/* 67*/ OPC_EmitInteger, MVT::i16, 0,
/* 70*/ OPC_EmitNode2, TARGET_VAL(OurTarget::shfts_a32_imm7), 0,
               MVT::i32, MVT::i16, 5/*#Ops*/, 6, 1, 7, 8, 9, // Results = #10 #11
/* 82*/ OPC_EmitInteger, MVT::i32, OurTarget::acc,
/* 85*/ OPC_EmitNode1, TARGET_VAL(TargetOpcode::INSERT_SUBREG), 0,
               MVT::i40, 3/*#Ops*/, 2, 10, 12, // Results = #13
/* 94*/ OPC_EmitInteger, MVT::i16, 0,
/* 97*/ OPC_EmitRegister, MVT::i16, 0 /*zero_reg*/,
/* 100*/ OPC_EmitInteger, MVT::i16, 0,
/* 103*/ OPC_MorphNodeTo1, TARGET_VAL(OurTarget::sext_a32), 0,
               MVT::i40, 4/*#Ops*/, 13, 14, 15, 16, ...

The line of interest here is the one below /* 70 */. There, we can see that the instruction produces two results of type MVT::i32 (the value produced by the instruction) respectively MVT::i16 (the CCReg updated by the instruction). These are labeled as results #10 respectively #11. Looking at operand identifiers after /*#Ops*/, we can see that only #10 is used by the rest of the resulting pattern (by INSERT_SUBREG), which is as intended.

However, in the destination pattern declared in the *.td file, there is no information pertaining to which of the results should be used by the parent node. Since only tree-shaped patterns are allowed, SelectionIDAG must somehow decide which of the results are to be used by the parent node. And this decision is taken at lines 869-870 in DAGISelMatcherGen.cpp:

    ...
    unsigned FinalNumOps = InstOps.size() + NumSubOps;
    while (InstOps.size() < FinalNumOps) {
      const TreePatternNode *Child = N->getChild(ChildNo);
      unsigned BeforeAddingNumOps = InstOps.size();
      EmitResultOperand(Child, InstOps);
      assert(InstOps.size() > BeforeAddingNumOps && "Didn't add any operands");

      // If the operand is an instruction and it produced multiple results, just
      // take the first one.
      if (!Child->isLeaf() && Child->getOperator()->isSubClassOf("Instruction"))
        InstOps.resize(BeforeAddingNumOps+1);

      ++ChildNo;
    }
    ...

In other words, if a child produces more than one result SelectionIDAG always takes the first result. Those two lines originate from a patch by Chris Lattner (r99725), so presumably this is the correct way of doing it. If GlobalISel were to do the same, it would solve the issue I'm having with our patterns.

Thanks for looking into that.

There's three questions I feel I have with the SelectionDAG behaviour:
* If Defs=[A] causes SelectionDAG to add a result to the intermediate SDNode, why doesn't Defs=[A,B] add two?
* Given that SDNode's support MVT::Other results, why doesn't MVT::Other behave the same way?
* If nothing can use that result, why model it? I suppose custom C++ instruction selection can still see it so maybe that's it but if so, what code is relying on it?
The inconsistency makes me rather suspicious about it. It feels like it was a hack for a specific purpose but I don't know what that purpose is. Overall, I'm not keen to carry it forwards unless there's a clear reason it's there.

Leaving that aside for the moment. I see a difference in modelling that makes directly importing the SelectionDAG behaviour tricky. SelectionDAG uses SDNode's which keep the result list (+ the first implicit def with known type) and the operand list separate. GlobalISel uses the MachineInstr's which uses a single list in a fixed order. That order is outputs, inputs, implicits so we would need to split the results list into two pieces and defer adding the implicit def to the place it's currently added to handle it the SelectionDAG way. That being the case (and assuming we don't find a good reason for SelectionDAG's behaviour), I think the best approach for this might be to adjust the numbers that originate from GetNumNodeResults() for the `InstInfo.HasOneImplicitDefWithKnownVT(CDP.getTargetInfo()) !=MVT::Other` case to undo the adjustment CodeGenDAGPatterns does. That's an ugly solution but I think the other ones I can think of are worse.

Hi Daniel,

Thanks for replying; I was hoping to get in touch with you on this issue.

I had a look at how SelectionIDAG does it when generating the matcher table, and it does consider the implicit defs as additional output. Here is the match table generated for the pattern:

/* 0*/ OPC_CheckOpcode, TARGET_VAL(ISD::SIGN_EXTEND),

/* 3*/ OPC_MoveChild0,

/* 4*/ OPC_CheckOpcode, TARGET_VAL(ISD::SHL),

/* 7*/ OPC_MoveChild0,

/* 8*/ OPC_CheckOpcode, TARGET_VAL(ISD::ANY_EXTEND),

/* 11*/ OPC_RecordChild0, // #0 = $src

/* 12*/ OPC_CheckChild0Type, MVT::i16,

/* 14*/ OPC_MoveParent,

/* 15*/ OPC_RecordChild1, // #1 = $imm

/* 16*/ OPC_MoveChild1,

/* 17*/ OPC_CheckOpcode, TARGET_VAL(ISD::Constant),

/* 20*/ OPC_CheckPredicate, 0, // Predicate_Imm_17_31_i16

/* 22*/ OPC_CheckType, MVT::i16,

/* 24*/ OPC_MoveParent,

/* 25*/ OPC_CheckType, MVT::i32,

/* 27*/ OPC_MoveParent,

/* 28*/ OPC_CheckType, MVT::i40,

/* 30*/ OPC_EmitNode1, TARGET_VAL(TargetOpcode::IMPLICIT_DEF), 0,

MVT::i40, 0/#Ops/, // Results = #2

/* 36*/ OPC_EmitNode1, TARGET_VAL(TargetOpcode::IMPLICIT_DEF), 0,

MVT::i32, 0/#Ops/, // Results = #3

/* 42*/ OPC_EmitInteger, MVT::i32, OurTarget::hi16, // Results = #4

/* 45*/ OPC_EmitNode1, TARGET_VAL(TargetOpcode::INSERT_SUBREG), 0,

MVT::i32, 3/#Ops/, 3, 0, 4, // Results = #5

/* 54*/ OPC_EmitNode1, TARGET_VAL(OurTarget::clearLo32_pseudo), 0,

MVT::i32, 1/#Ops/, 5, // Results = #6

/* 61*/ OPC_EmitInteger, MVT::i16, 0,

/* 64*/ OPC_EmitRegister, MVT::i16, 0 /zero_reg/,

/* 67*/ OPC_EmitInteger, MVT::i16, 0,

/* 70*/ OPC_EmitNode2, TARGET_VAL(OurTarget::shfts_a32_imm7), 0,

MVT::i32, MVT::i16, 5/#Ops/, 6, 1, 7, 8, 9, // Results = #10 #11

/* 82*/ OPC_EmitInteger, MVT::i32, OurTarget::acc,

/* 85*/ OPC_EmitNode1, TARGET_VAL(TargetOpcode::INSERT_SUBREG), 0,

MVT::i40, 3/#Ops/, 2, 10, 12, // Results = #13

/* 94*/ OPC_EmitInteger, MVT::i16, 0,

/* 97*/ OPC_EmitRegister, MVT::i16, 0 /zero_reg/,

/* 100*/ OPC_EmitInteger, MVT::i16, 0,

/* 103*/ OPC_MorphNodeTo1, TARGET_VAL(OurTarget::sext_a32), 0,

MVT::i40, 4/#Ops/, 13, 14, 15, 16, …

The line of interest here is the one below /* 70 /. There, we can see that the instruction produces two results of type MVT::i32 (the value produced by the instruction) respectively MVT::i16 (the CCReg updated by the instruction). These are labeled as results #10 respectively #11. Looking at operand identifiers after /#Ops*/, we can see that only #10 is used by the rest of the resulting pattern (by INSERT_SUBREG), which is as intended.

However, in the destination pattern declared in the *.td file, there is no information pertaining to which of the results should be used by the parent node. Since only tree-shaped patterns are allowed, SelectionIDAG must somehow decide which of the results are to be used by the parent node. And this decision is taken at lines 869-870 in DAGISelMatcherGen.cpp:

unsigned FinalNumOps = InstOps.size() + NumSubOps;

while (InstOps.size() < FinalNumOps) {

const TreePatternNode *Child = N->getChild(ChildNo);

unsigned BeforeAddingNumOps = InstOps.size();

EmitResultOperand(Child, InstOps);

assert(InstOps.size() > BeforeAddingNumOps && “Didn’t add any operands”);

// If the operand is an instruction and it produced multiple results, just

// take the first one.

if (!Child->isLeaf() && Child->getOperator()->isSubClassOf(“Instruction”))

InstOps.resize(BeforeAddingNumOps+1);

++ChildNo;

}

In other words, if a child produces more than one result SelectionIDAG always takes the first result. Those two lines originate from a patch by Chris Lattner (r99725), so presumably this is the correct way of doing it. If GlobalISel were to do the same, it would solve the issue I’m having with our patterns.

Thanks for looking into that.

There’s three questions I feel I have with the SelectionDAG behaviour:

  • If Defs=[A] causes SelectionDAG to add a result to the intermediate SDNode, why doesn’t Defs=[A,B] add two?

That’s because CodeGenDAGPatterns::GetNumNodeResults() does the following:

if (Operator->isSubClassOf(“Instruction”)) {

CodeGenInstruction &InstInfo = CDP.getTargetInfo().getInstruction(Operator);

unsigned NumDefsToAdd = InstInfo.Operands.NumDefs;

// Subtract any defaulted outputs.

for (unsigned i = 0; i != InstInfo.Operands.NumDefs; ++i) {

Record *OperandNode = InstInfo.Operands[i].Rec;

if (OperandNode->isSubClassOf(“OperandWithDefaultOps”) &&

!CDP.getDefaultOperand(OperandNode).DefaultOps.empty())

–NumDefsToAdd;

}

// Add on one implicit def if it has a resolvable type.

if (InstInfo.HasOneImplicitDefWithKnownVT(CDP.getTargetInfo()) !=MVT::Other)

++NumDefsToAdd;

return NumDefsToAdd;

}

In other words, when building the patterns any additional Defs are simply ignored, and I think this is due to how the functionality evolved. Here are the commits related to this issue:

commit f144725ebcd859009c6ee96ead4a2d09ab3d277c

Author: Chris Lattner sabre@nondot.org

major surgery on tblgen: generalize TreePatternNode

to maintain a list of types (one for each result of

the node) instead of a single type. There are liberal

hacks added to emulate the old behavior in various

situations, but they can start disolving now.

llvm-svn: 98999

commit d44966f26de2e7ca05cb888a1ea5b5bdc22a65c0

Author: Chris Lattner sabre@nondot.org

continue pushing tblgen’s support for nodes with multiple

results forward. We can now handle an instruction that

produces one implicit def and one result instead of one or

the other when not at the root of the pattern.

llvm-svn: 99725

commit 7bc5d9b576827b4c629e7dd6707804ee61ef18c9

Author: Chris Lattner sabre@nondot.org

hoist some funky logic into CodeGenInstruction

from two places in CodeGenDAGPatterns.cpp, and

use it in DAGISelMatcherGen.cpp instead of using

an incorrect predicate that happened to get lucky

on our current targets.

llvm-svn: 99726

commit 3a8eb896c9d6696402bf113653c4c20b237c7009

Author: Craig Topper craig.topper@gmail.com

[Tablegen] Attempt to add support for patterns containing nodes with multiple results.

This is needed for AVX512 masked scatter/gather support.

The R600 change is necessary to remove a hack that was working around the lack of multiple results.

llvm-svn: 232798

My guess is that the support for multiple results simply stopped at one result from outs and one from Defs.

  • Given that SDNode’s support MVT::Other results, why doesn’t MVT::Other behave the same way?

Hm, cannot answer that; I don’t know what SDNode’s support for MVT::Other looks like.

  • If nothing can use that result, why model it? I suppose custom C++ instruction selection can still see it so maybe that’s it but if so, what code is relying on it?

Perhaps the commit above from Crag Topper can answer that. I haven’t looked deeper into that, though.

The inconsistency makes me rather suspicious about it. It feels like it was a hack for a specific purpose but I don’t know what that purpose is. Overall, I’m not keen to carry it forwards unless there’s a clear reason it’s there.

Leaving that aside for the moment. I see a difference in modelling that makes directly importing the SelectionDAG behaviour tricky. SelectionDAG uses SDNode’s which keep the result list (+ the first implicit def with known type) and the operand list separate. GlobalISel uses the MachineInstr’s which uses a single list in a fixed order. That order is outputs, inputs, implicits so we would need to split the results list into two pieces and defer adding the implicit def to the place it’s currently added to handle it the SelectionDAG way. That being the case (and assuming we don’t find a good reason for SelectionDAG’s behaviour), I think the best approach for this might be to adjust the numbers that originate from GetNumNodeResults() for the InstInfo.HasOneImplicitDefWithKnownVT(CDP.getTargetInfo()) !=MVT::Other case to undo the adjustment CodeGenDAGPatterns does. That’s an ugly solution but I think the other ones I can think of are worse.

I am probably missing something here, but why does GlobalISel need to care about implicit defs at all? Since there seems to be no way of referring to such values in the patterns, why not just ignore them from instruction selection’s point of view? As far as I know, only the register allocator needs to know about which registers are implicitly used or defined by an instruction.