Erroneous scheduling of COPY instruction.

Hi,

I have an instruction scheduling problem that I cannot investigate any further by myself. Could someone give me some clues?

After instruction selection, here is part of the generated instruction sequence:

NOP

MOV_AB_ro @s1, %fab_roff0

%6:fpuaoffsetclass = COPY %fab_roff0; FPUaOffsetClass:%6

MOV_A_oo %6, def %5; FPUaOffsetClass:%6,%5

MOVSUTO_A_iSLo 24575, def %7; FPUaOffsetClass:%7

The order of these instructions is very important: the COPY must take place after the MOV_AB_ro!

But during instruction scheduling, these two instructions have been swapped!

SU(18): NOP

SU(20): %6:fpuaoffsetclass = COPY %fab_roff0; FPUaOffsetClass:%6

SU(19): MOV_AB_ro @s1, %fab_roff0

SU(21): MOV_A_oo %6, def %5; FPUaOffsetClass:%6,%5

SU(22): MOVSUTO_A_iSLo 24575, def %7; FPUaOffsetClass:%7

I’m trying to understand why the Instruction scheduler has swapped these two instructions…

From the trace, it seems that the only successor of SU(19) is SU(21), and not SU(20) as I expected…

What kind of information between the MOV_AB_ro and COPY could be missing?

How can I specify that SU(19) should be a predecessor of SU(20)?

SU(19): MOV_AB_ro @s1, %fab_roff0

preds left : 1

succs left : 1

rdefs left : 0

Latency : 0

Depth : 21

Height : 25

Predecessors:

SU(18): Ord Latency=1 Barrier

Successors:

SU(21): Ord Latency=0 Barrier

Pressure Diff :

Single Issue : false;

SU(20): %6:fpuaoffsetclass = COPY %fab_roff0; FPUaOffsetClass:%6

preds left : 0

succs left : 1

rdefs left : 0

Latency : 0

Depth : 0

Height : 25

Successors:

SU(21): Data Latency=0 Reg=%6

Pressure Diff : FPUaOffsetClass -1 FPUabOffsetClass -1

Single Issue : false;
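
The dump above lists only three ordering constraints among these nodes: SU(18) before SU(19), SU(19) before SU(21), and SU(20) before SU(21). A minimal sketch in C++, assuming a toy model in which a schedule is legal whenever every predecessor precedes its successor, shows that the swapped order violates none of them. The names `Edge`, `isLegalOrder` and `edgesFromDump` are hypothetical and are not part of LLVM.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// A dependency edge: the first node must be scheduled before the second.
using Edge = std::pair<int, int>;

// Returns true if every edge's predecessor appears before its successor
// in the proposed order. Nodes are identified by their SU numbers.
bool isLegalOrder(const std::vector<Edge>& edges, const std::vector<int>& order) {
    auto position = [&order](int node) -> std::ptrdiff_t {
        for (std::size_t i = 0; i < order.size(); ++i)
            if (order[i] == node) return static_cast<std::ptrdiff_t>(i);
        return -1;  // node is missing from the proposed order
    };
    for (const Edge& e : edges) {
        const std::ptrdiff_t pred = position(e.first);
        const std::ptrdiff_t succ = position(e.second);
        if (pred < 0 || succ < 0 || pred >= succ) return false;
    }
    return true;
}

// Edges copied from the dump: SU(18)->SU(19), SU(19)->SU(21), SU(20)->SU(21).
const std::vector<Edge> edgesFromDump = {{18, 19}, {19, 21}, {20, 21}};
```

With these edges, both `{18, 19, 20, 21}` and `{18, 20, 19, 21}` are accepted. Only an additional edge from SU(19) to SU(20) would rule out the second order, and that is exactly the edge missing from the trace.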

Hi Dominique,

        MOV_AB_ro @s1, %fab_roff0
        %6:fpuaoffsetclass = COPY %fab_roff0; FPUaOffsetClass:%6

The MOV_AB_ro instruction there is only reading %fab_roff0 so there's
no reason for LLVM to think it can't be swapped with the following
copy. Should it maybe write that register instead?

If not, what's the architectural constraint that *does* prevent
movement? We might be able to help with how you need to describe that
to LLVM.

Cheers.

Tim.

Tim,

You are right. Here is the definition of MOV_AB_ro: no 'outs'!

  def MOV_AB_ro : CLPFPU_AB_ro<0b1000010011,
        (outs ),
        (ins FPUabRegisterOperand:$RegA, FPUabOffsetOperand:$OffsetB),
        ,,
        "mov_ab\t$RegA,$OffsetB","",
        ,NoItinerary> {
      let mayLoad = 1;
      let TSFlags{12} = 1;
      let DecoderMethod = "DecodeFPURO_moveInstruction";
    }

My problem is that my processor has _no_ load/store instructions.
It does _not_ have external memory, only a bank of 512 internal registers...
So data memory, stack frames and temporaries are all allocated from these 512 registers.
It only has some MOV instructions, which support two addressing modes: register-number addressing (r) and offset-register-relative addressing (o).
These MOV instructions are sometimes used to lower LOAD and sometimes to lower STORE.
So the MOV instructions are a little ambiguous. Feel free to comment on this architecture :wink:
To resolve this ambiguity, I've introduced some LOAD pseudo-instructions (with custom inserters that expand them to MOV instructions).
These pseudo-instructions are used during instruction selection and register allocation. That works quite well.
But the custom inserters are called before the final scheduling, so the final scheduling is performed on the MOV instructions.
If I understand correctly, I have to define the (outs ) for the destination of all my MOV definitions.
This will have a large impact on my code (custom inserters, decoder methods, ...).
Could I use implicit outs to define the dependency instead (less impact on my code)?

Thanks so much, Dominique T.
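
The point raised in the reply (two instructions that only read the same register carry no ordering constraint) and the question about implicit outputs can be illustrated with a toy dependency check in C++. This is only a sketch under stated assumptions: `Instr`, `writes` and `dependsOn` are hypothetical names, not LLVM's API, and the example assumes that MOV_AB_ro really does write `fab_roff0`, which the thread suggests but does not confirm. In this model, a register listed as an implicit output creates the same dependency as one listed as an explicit output; to my understanding LLVM's scheduler also takes implicit register operands into account, but that should be checked against the LLVM sources.

```cpp
#include <string>
#include <vector>

// Hypothetical, simplified instruction: only the registers it reads and writes.
struct Instr {
    std::string name;
    std::vector<std::string> uses;          // registers read
    std::vector<std::string> defs;          // explicit outputs ("outs")
    std::vector<std::string> implicitDefs;  // implicit outputs
};

static bool contains(const std::vector<std::string>& regs, const std::string& r) {
    for (const std::string& x : regs)
        if (x == r) return true;
    return false;
}

// An instruction writes a register if it is an explicit or an implicit output.
static bool writes(const Instr& i, const std::string& r) {
    return contains(i.defs, r) || contains(i.implicitDefs, r);
}

// True if `later` must stay after `earlier` because of a register dependency.
// Two reads of the same register never create a dependency.
bool dependsOn(const Instr& earlier, const Instr& later) {
    for (const std::string& r : later.uses)
        if (writes(earlier, r)) return true;  // read-after-write
    std::vector<std::string> laterWrites = later.defs;
    laterWrites.insert(laterWrites.end(),
                       later.implicitDefs.begin(), later.implicitDefs.end());
    for (const std::string& r : laterWrites) {
        if (contains(earlier.uses, r)) return true;  // write-after-read
        if (writes(earlier, r)) return true;         // write-after-write
    }
    return false;
}
```

For example, with a COPY modelled as `{"COPY", {"fab_roff0"}, {"v6"}, {}}`, a MOV that only reads `fab_roff0` yields `dependsOn(mov, copy) == false`, whereas a MOV listing `fab_roff0` in either `defs` or `implicitDefs` yields `true`.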