Strange ISel Bug

ISel fails in my back-end to perform instruction selection and I think
it is a bug in ISel.

My target is a CISC architecture. For example it can add two values
and store the result indirectly in memory: ADD (A0), R0, R1 add the
values in register R0 and R1 and stores the result in memory
pointed to by register A0. Maybe, I should add here that every
instruction has two PredicateOperands and one OptionalDefOperand
for conditional execution support.

I have defined an instruction for operations like ADD with this
addressing modes described above and it works as expected. I have
verified that.

But ISel fails for the operation ADDC and ADDE in optimization
level O0 for the described addressing mode (STORE). I was worried
that this is a fast-isel problem.

The SelectionDAG for ISel looks like:

Optimized legalized selection DAG: %bb.2 'test_main:for.body'
SelectionDAG has 45 nodes:
  t0: ch = EntryToken
  ...
  t55: i32 = TARGETISD::convertBitAddressToByteAddress t54
        t28: i32,glue = adde t46, t44, t27:1
      t35: ch = store<(store (s32) into %ir.res, align 8)> t14, t28, t34, undef:i32
      t38: ch = store<(store (s32) into %ir.res + 4, basealign 8)> t14, t27, t37, undef:i32
    t31: ch = TokenFactor t35, t38
  t17: ch = br t31, BasicBlock:ch<for.inc 0x7f9f29889780>

This is ok, but in ISel:

ISEL: Starting selection on root node: t35: ch = store<(store (s32) into %ir.res, align 8)> t14, t28, t33, undef:i32
ISEL: Starting pattern match
  Initial Opcode index to 5
  ...
  Morphed node: t35: ch,glue = ADDE_x<Mem:(store (s32) into %ir.res, align 8)> t44, t28, t46, t33, Register:i32 $noreg, Register:i32 $noreg, t14, t27:1
ISEL: Match complete!

This is completely wrong:

  1. operation ADDE operands t44 and t46 are swapped, which is
    basically ok, because ADDE is commutative, but might give a hint
  2. operand t28 is the node with operation ADDE itself which is
    done by the instruction ADDE_x; this is a wrong/illegal operand,
    a second match on operation ADDE is kept alive
  3. operands t33, Register:i32 $noreg, Register:i32 $noreg are
    wrong, the three operands that I add (see comment above) are Register:i32 $noreg, TargetConstant:i32<0>, Register:i32 $noreg
  4. the morphed node has 8 operands but should have 7, this raises an
    abort

In general the morphed node should have the operands t46, t44, Register:i32 $noreg, TargetConstant:i32<0>, Register:i32 $noreg, t14, t27:1.

All other matches seem to be correct, except an additional illegal
match on an ADDC operation, which has no glue edge. But no other
match has a chain and a glue edge.

Any hints how I can track this down? I went through all my uses of
getMachineNode and BuildMI, but I can’t find any interference with
the morphed node.

(Sorry if I’m stating things that are already obvious to you.)

What I would do is:

  • Print the isel debug output (like you do already)
  • Check the sequence of matching by looking at where you’re jumping in the matching table:
    – Follow the to xxx printed at the end of each line, and
    – Look for that number in the comments of <yourBuild>/lib/Target/<yourTarget>/<yourTarget>GenDAGISel.inc

The idea here is to identify which pattern applies in the faulty match, then double check that the pattern looks correct in your related .td file.

If your pattern looks correct, then we can see if this is a bug with the SDISel framework.

This requires LLVM_OMIT_DAGISEL_COMMENTS to be OFF in cmake config.

1 Like

The pattern for all binary operations Op is: (store (Op BRegs:$srca, BRegs:$srcb), ARegs:$dst). This is correct for operation ADDC, too, because the glue edge/operand is not matched. Correct?

The generated matcher contains the following code:

/*  8867*/   /*SwitchOpcode*/ 70, TARGET_VAL(ISD::ADDC),// ->8940
/*  8870*/    OPC_RecordNode, // #1 = 'addc' glue output node
/*  8871*/    OPC_RecordChild0, // #1 = $srca
/*  8872*/    OPC_RecordChild1, // #2 = $srcb
/*  8873*/    OPC_CheckType, MVT::i32,
/*  8875*/    OPC_MoveParent,
/*  8876*/    OPC_RecordChild2, // #3 = $dst
/*  8877*/    OPC_Scope, 27, /*->8906*/ // 2 children in Scope
/*  8879*/     OPC_CheckChild2Type, MVT::i32,
/*  8881*/     OPC_CheckPredicate, 1, // Predicate_unindexedstore
/*  8883*/     OPC_CheckPredicate, 2, // Predicate_store
/*  8885*/     OPC_EmitMergeInputChains1_0,
/*  8886*/     OPC_EmitRegister, MVT::i32, 0 /*zero_reg*/,
/*  8889*/     OPC_EmitRegister, MVT::i32, 0 /*zero_reg*/,
/*  8892*/     OPC_EmitRegister, MVT::i32, 0 /*zero_reg*/,
/*  8895*/     OPC_MorphNodeTo0, TARGET_VAL(MYTGT::ADDC_mrr), 0|OPFL_Chain|OPFL_GlueOutput|OPFL_MemRefs,
                   6/*#Ops*/, 3, 1, 2, 4, 5, 6,
               // Src: (st (addc:{ *:[i32] } BRegs:{ *:[i32] }:$srca, BRegs:{ *:[i32] }:$srcb), ARegs:{ *:[i32] }:$dst)<<P:Predicate_unindexedstore>><<P:Predicate_store>> - Complexity = 7
               // Dst: (ADDC_mrr ARegs:{ *:[i32] }:$dst, BRegs:{ *:[i32] }:$srca, BRegs:{ *:[i32] }:$srcb)

At state 8870 the ADDC node is recorded, because it has an output Glue-edge.

Please note:

  1. the generator matcher table for ADD is identical without RecoredNode@8870, esp the operand indices are the same
  2. for input Glue-edges the matcher supports the opcode CaptureGlueInput
  3. there is nothing special for output Glue-edges.
  4. LOAD nodes are recorded as well, but followed by the opcode CheckFoldableChainNode to perform some action

The ADDC node is: t27: i32,glue = addc t45, t41

After adding some additional debug info in ISel, I got the following log:

ISEL: Starting selection on root node: t38: ch = store<(store (s32) into %ir.res + 4, basealign 8)> t14, t27, t36, undef:i32
  Initial Opcode index to 5
 .|RecordedNodes|=0: record node without parent
 .|RecordedNodes|=1: record child #1
  Match failed at index 12
  Continuing at 279
  OpcodeSwitch from 282 to 475
 .|RecordedNodes|=1: record node with parent
 .|RecordedNodes|=2: record node with parent
  Match failed at index 482
  Continuing at 2545
 .|RecordedNodes|=1: record child #1
  Match failed at index 2551
  Continuing at 2596
 .|RecordedNodes|=2: record child #2
 ....
  Match failed at index 8373
  Continuing at 8407
  Continuing at 8408
  OpcodeSwitch from 8411 to 8870
 .|RecordedNodes|=1: record node with parent
 .|RecordedNodes|=2: record child #0
 .|RecordedNodes|=3: record child #1
 .move parent
 .|RecordedNodes|=4: record child #2
 .|RecordedNodes|=5: record (emit) register:0
 .|RecordedNodes|=6: record (emit) register:0
 .|RecordedNodes|=7: record (emit) register:0
 .num VTs:0
 .RecordedNodes.size():8
 .offset:2 some glue AND edge?
 .use node index#3 for op#0
 .use node index#1 for op#1
 .use node index#2 for op#2
 .use node index#4 for op#3
 .use node index#5 for op#4
 .use node index#6 for op#5
 .morphnode: oldglueresult:-1 oldchainresult:0
  Morphed node: t38: ch,glue = ADDC_mrr<Mem:(store (s32) into %ir.res + 4, basealign 8)> t41, t27, t45, t36, Register:i32 $noreg, Register:i32 $noreg, t14
ISEL: Match complete!

At state 8870 the ADDC node was recorded and it is the second node in the array RecordedNodes after the STORE without a parent.
The morphed node seems to be ok, but if you check its operands, you will notice, that the operands are: (addc.op0, addc, addc.op1, noreg, noreg, storeaddr). This is wrong, and should be (storeaddr, addc.op0, addc.op1, noreg, noreg, noreg)!

I dare to say that this recorded node for the output Glue-edge messes up the operand index computation, and therefore the operands of the morphed node.
(I can verify this by modifying the indices (+1) in the generated matcher table.)

But if the indices are correct, the function SelectionDAGISel::MorphNode() wants to derive the output Glue-edge from the DAG’s root, and that is a STORE node, where ISel can not find the Glue-edge. Therefore ISel can’t replace the use of ADDC output Glue-edge by the new morphed node’s output Glue-edge.

The pattern looks fine.

This part of the generated table looks fishy though:

/*  8870*/    OPC_RecordNode, // #1 = 'addc' glue output node <-- HERE
/*  8871*/    OPC_RecordChild0, // #1 = $srca <-- and HERE we have two #1

It is as if the tablegen backend failed to recognize that it already registered operand #1. I would suggest to look into why is that.

If I’m not mistaken, essentially at this point what is recorded is:

1     2     3     4          5      6      7
addc, srca, srcb, storeaddr, noreg, noreg, noreg

Now when we build the operands list for the new op we take:

3, 1, 2, 4, 5, 6

Which gives this sequence:

srcb, addc, srca, storeaddr, noreg, noreg, [chain == t14]

It doesn’t match the order of operand that you’re describing though, but I think I got it right.

That’s not going to help you get to the bottom of this, but ADDC is technically deprecated and you should use UADDO_CARRY instead.

If you do the transition this is likely going to fix the problem because IIRC UADDO_CARRY doesn’t have a glue operand.

Anyhow, I believe there’s a bug in the table generation here.