How can I get rid of "OPFL_Chain" in myCPUGenInstrInfo.inc

hi Tim,guys,

it was regarding splitting 16-bit ADDC to two 8-bit ADDC+ADDE.

the 8-bit ADDE instruction is defined as:

let Constraints="$dst=$op0",mayStore=1,
hasSideEffects=0,neverHasSideEffects=1 in

def ADDErm: myInstr
<0x0,
(outs Intregs:$dst) (ins Intregs:$op0,MEMi:$op1),
"",
[set IntRegs:$dest (adde IntRegs:$op0, (load ADDRi:$op1))]

very unlucky, this instruction failed.

in the generated match table, there was flag OPFL_Chain.

it caused a token factor node to be created in switch case
OPC_EmitMergeInputChains in SelectCodeCommon.

very bad, all uses of input chain was replaced with ADDErm Node.

so the created token factor node depends on the ADDErm node after the
replacement.

very bad the ADDCrm node depends on above token factor.

because of the glue, a cycle formed. ADDE-->ADDC-->TF-->ADDE.

after I removed the flag OPFL_Chain in the match table, the problem was
gone.

So, my question was obvious:how could I make the table gen tool not add
the said flag?

thanks

hilbert

Hi Hilbert,

let Constraints="$dst=$op0",mayStore=1,

Are you sure you mean "mayStore" here and not "mayLoad"?

very bad, all uses of input chain was replaced with ADDErm Node.

Since your ADDErm is also a load, it is going to need an input chain
of some kind (and hence OPFL_Chain) so that's not surprising on its
own.

so the created token factor node depends on the ADDErm node after the
replacement.

I don't suppose you could post the output of "-view-isel-dags" &
"-view-sched-dags"? I've got some ideas on how to reproduce what
you're seeing, but I need the exact DAGs to have much chance.

It sounds like it *might* be a bug in HandleMergeInputChains, if it
doesn't take glue into account somehow.

Cheers.

Tim.

guys,

1)i made a mistake in my post.
the said TF node was created when Selected() was called in ADDC node.

  1. the source code under test

short a,b;
void test()
{
a+=b;
}

3)the DAG after ADDC was seleced:

Select node:
0x4977a20: ch,glue = <<Unknown Machine Node #65419>> 0x4972bd0, 0x49731d0, 0x4976c20, 0x49730d0Mem:LD1[@a](align=2)
Result node:
0x4977a20: ch,glue = <<Unknown Machine Node #65419>> 0x4972bd0, 0x49731d0, 0x4976c20, 0x49730d0Mem:LD1[@a](align=2)
Result DAG:
SelectionDAG has 21 nodes:
0x49606f0: ch = EntryToken [ORD=1] [ID=0]

0x4972cd0: i8 = undef [ORD=1] [ID=2]

0x49735d0: i8 = Constant<1> [ID=3]

0x4977920: i8 = TargetGlobalAddress<i16* @b> 0 [ID=4]

0x4972fd0: i8 = SPISD::GLOBAL_TRANSFER 0x4977920 [ID=6]

0x49606f0:
0x4972fd0:
0x4972cd0:
0x4976c20: i8,ch = load 0x49606f0, 0x4972fd0, 0x4972cd0<LD1@b> [ID=8]

0x49606f0:
0x4972fd0:
0x49735d0:
0x4976d20: i8 = add 0x4972fd0, 0x49735d0 [ID=9]

0x4972cd0:
0x4976e20: i8,ch = load 0x49606f0, 0x4976d20, 0x4972cd0<LD1[@b+1]> [ID=12]

0x49606f0:
0x4972ed0: i8 = TargetGlobalAddress<i16* @a> 0 [ID=5]

0x4977020: i8 = SPISD::GLOBAL_TRANSFER 0x4972ed0 [ID=7]

0x49735d0:
0x49736d0: i8 = add 0x4977020, 0x49735d0 [ID=11]

0x4972cd0:
0x49737d0: i8,ch = load 0x49606f0, 0x49736d0, 0x4972cd0<LD1[@a+1]> [ID=14]

0x4972bd0:
0x49731d0: i32 = TargetGlobalAddress<i16* @a> 0

0x4976c20:
0x49606f0:
0x49737d0:
0x4976c20:
0x4976e20:
0x49730d0: ch = TokenFactor 0x49606f0, 0x49737d0:1, 0x4976c20:1, 0x4976e20:1

0x4977a20: ch,glue = addcmr 0x4972bd0, 0x49731d0, 0x4976c20, 0x49730d0Mem:LD1[@a](align=2)

0x4972bd0: i8 = Register %noreg

0x4976f20: i32 = TargetGlobalAddress<i16* @a> + 1

0x4977a20:
0x4972bd0:
0x4976f20:
0x49737d0:
0x4976e20:
0x4977a20:
0x4977820: i8,glue = adde 0x49737d0, 0x4976e20, 0x4977a20:1 [ID=16]

0x4972bd0:
0x4976f20:
0x4977a20:
0x49738d0: ch = MOVmr 0x4972bd0, 0x4976f20, 0x4977820, 0x4972bd0, 0x4976f20, 0x4977a20

0x4977c80: ch = TokenFactor 0x4977a20, 0x49738d0

0x49733d0: ch = RET 0x4977c80

Hi Hilbert,

i had not understood the purpose of a TF node.
my understanding was, the operands were not loaded together in one node. they were loaded separately.

The loads happen separately, but the TokenFactor node says that
whichever order they do happen in, they *both* have to be before
whatever comes next (the user of the TokenFactor). It's there to
constrain LLVM's scheduling.

4) replace any operands of the TF Node with ADDE did not make sense.

It does to me. Before we were saying that the load of "a+1" had to
happen before X (which might be, for example, a subsequent write to
the 16-bit variable "a"). If that load has been folded into an ADDE,
then that ADDE has to happen before X too.

3)the DAG after ADDC was seleced:

Now that's one thing I don't understand. The ADDC should be "above"
the ADDE in the DAG since the ADDE depends on the output of the ADDC.
How did it get selected first?

The DAGs you posted aren't really that useful I'm afraid: deciphering
the text layout is a massive pain in the neck; the machine nodes are
given purely by number (I have no idea what they are); your first dump
appears to be from half way through ISel (which *is* interesting), but
I'd mostly like before & after to begin with.

Cheers.

Tim.

Tim and all,

thanks to Tim, now i started to understand the basic theory.

my understanding of the problem was as follows.

  1. 16-bit ADD/SUB, has four loads: ld a,ld a+1, ld b,ld b+1. they were the operands of the said TF node
  2. ADDC happends after ld a,lb b, so the TF node happens before ADDC
  3. ADDE happends after ADDC
  4. ADDE modifies a+1, so it must happen before the subsequent STORE. so the input chain update makes sense.

the problem here was. that the TF happends before ADDC is not wrong, but ld a+1, ld b+1 can happen after ADDC.

if we split the TF node into two:
TF1:ld a,ld b
TF2: ld a_1,ld b+1
then the problem should be gone.

Cheers.

hilbert