Live Register Spilling

Hi All,

I faced some problems while using the BuildMI().

Currently, i am trying to replace specific MI with a series of new MI.

I wrote a routine under the processFunctionAfterISel() to detect the targeted MI and replace it accordingly.

After using BuildMI() to perform my replacement, i realize there are unnecessary spilling and reloading of registers in the assembly generated.

By checking the llc debug output, i am suspecting that the virtual register states have been completely messed up.

This is because the spilling and reloading codes are only inserted at the register allocation phase, especially during the state of “Spilling live registers at the end of block”.

These spilling and reloading codes are messing up the assembly generated, and the behavior of the code generated is undetermined.

I would like to know is there anything i can do to fix the virtual register use-def relationship?

Or is there any standard procedure i should follow to handle the MachineOperands while using BuildMI()?

Any opinions or suggestion are welcomed.

Regards,

JC

This would be a lot easier to discuss having a concrete example, llc invocations etc.

It sounds like you are using RegAllocFast, have you tried using the optimized register allocators (= make sure you don’t have the optnone attribute on your function and are not passing -O0 to the tools).

  • Matthias

Hi Matthias,

Sorry for the late reply.

Yes, you are correct, I do have optnone attribute on my function.

I did pass -O0 to the tools.

For your information, my invocations are as below:

clang --target=mips-unknown-linux -mips32 test.c -emit-llvm -S

llc -O0 -march=mips -mcpu=mips32 test.ll -o test.s

Based on the generated .ll file, there is optnone attribute on the function, i am guessing this is due to the default optimization -O0 by clang if not specified.

As for the llc, i tried to invoke it with -O1,-O2,-O3. All of them resulted in failure during the phase “PROCESS IMPLICIT DEFS”

This message showed up:

Sorry about the previous message

This message showed up:

llc: /home/jc/Desktop/Project/For_Testing/llvm/lib/CodeGen/MachineRegisterInfo.cpp:366: llvm::MachineInstr* llvm::MachineRegisterInfo::getVRegDef(unsigned int) const: Assertion `(I.atEnd() || std::next(I) == def_instr_end()) && “getVRegDef assumes a single definition or no definition”’ failed.

I am assuming that i messed up the virtual register allocation when i am using BuildMI().

Also, i cannot invoke the clang as -O1 and so on as it will optimize my code and the generated .ll file will contain nothing other than the prologue and epilogue code.
Is there any information i can provide you so that we can discuss the issue further?

Chuan.

Sorry about the previous message

This message showed up:

llc: /home/jc/Desktop/Project/For_Testing/llvm/lib/CodeGen/MachineRegisterInfo.cpp:366: llvm::MachineInstr* llvm::MachineRegisterInfo::getVRegDef(unsigned int) const: Assertion `(I.atEnd() || std::next(I) == def_instr_end()) && “getVRegDef assumes a single definition or no definition”’ failed.

Earlier backend phases are in “Machine SSA” form (you can check MachineRegisterInfo::isSSA() for that), meaning virtual registers are only allowed to have a single definition.

  • Matthias

Running llc with ‘-verify-machineinstrs’ may tell you which instruction break the SSA form.

Ruiling

Hi All,

Thanks for the reply. I managed to identify and fixed a few errors in my implementation.

However, there are a few errors that i am not sure what is it indicating.
For starters, i think i should explain what i am trying to achieve.

I am actually working on MIPS backend to generate smaller set of MIPS Instructions compared to its existing supported instructions.
Currently, i am working on shifting instructions.

Take an example:
A typical mips sllv syntax goes in this manner:

sllv $reg1,$reg2,$reg3

The $reg3 contains the shifting amount. Only the LSB 5 bit will be used.
The $reg2 contains the data to be shifted.
The $reg1 contains the data after shifting is performed.

What i want to achieve is to expand sllv instruction to the following routine:

andi $reg3,$reg3,0x1f //To mask the 5 bit LSB shifting amount
#BB_1: beq $reg3,$zero,#BB_2 //Branch out from basic block if shifting amount is zero
sub $reg3,$reg3,1 //To subtract 1 from the shifting amount
sll $reg2,$reg2,1 //Shift by 1 bit
j #BB_1 //Branch back to the begining of the routine
#BB_2: addu $reg1,$reg2,$zero //Transfer the completed shift data to the original destination register

Since you guys mentioned that the MI are represented in MachineSSA form, i imagined my routine represented by virtual registers would look something like this:

andi $vreg3,$vreg3,0x1f
#BB_1: beq $vreg3,$zero,#BB_2
sub $vreg3,$vreg3,1
sll $vreg2,$vreg2,1
j #BB_1
#BB_2: addu $vreg1,$vreg2,$zero

With the -verify-machineinstrs invoked together with llc, the errors i encountered are as follow:

  1. *** Bad machine code: Explicit definition marked as use ***
  • function: main
  • basic block: BB#1 entry (0x546e600)
  • instruction: SUB
  • operand 0: %vreg3

2.*** Bad machine code: Explicit definition marked as use ***

  • function: main
  • basic block: BB#1 entry (0x546e600)
  • instruction: SLL
  • operand 0: %vreg2

3.*** Bad machine code: Non-terminator instruction after the first terminator ***

  • function: main
  • basic block: BB#1 entry (0x4911600)
  • instruction: SUB
    First terminator was: BEQ %vreg3, %ZERO, <BB#2>, %AT; GPR32:%vreg3

4.*** Bad machine code: Non-terminator instruction after the first terminator ***

  • function: main
  • basic block: BB#1 entry (0x4632600)
  • instruction: SLL
    First terminator was: BEQ %vreg3, %ZERO, <BB#2>, %AT; GPR32:%vreg3

Here are a few questions i would like to ask:
1.Assuming that $vreg1,$vreg2 and $vreg3 are defined before #BB_1, based on my suggested routine, have i broke the integrity of the MachineSSA form?

2.What is error 1 and 2 trying to tell me? I believe it has something to do with the define and kill of the virtual register states.

3.I am actually not sure about data-flow analysis.I did looked through MachineOperand class. Inside there shows the register states such as IsDef,IsDead,IsKill and etc. What are the definition for those register states? I am not sure what to do with the states, so i didn’t really pass the virtual register states when i invoked BuildMI() to insert my desired MI. Are there any recommended sources or documentions i can look into so that i could learn more about data-flow analysis or understand the definition for each of the register states?

4.For error 3 and 4, does it mean i should group the other instructions after branch into another basic block?

5.For #BB_1, i actually notice another error. If i add more than one successor for it, the llvm actually complains unconditional branch does not have exactly one successor. I had to seperate the jump instruction into another basic block. Is it wrong to have to 2 branching instructions in one basic block?

6.My implementation seems to work only for -O1,-O2,-O3 invocation of llc. The -O0 invocation results in a lot of unecessary spilling and reloading code and causes the behavior of my code to be undefined. Matthias did mentioned that it could be due to the register allocator of O0. What should i look into so that my approach could work for all register allocator?

I might asking for too much, but to be honest, i am actually very new to compiler study. I had to admit i need to study more about compiler techniques.
Can you guys recommend me books to study about compiler techniques?
If you guys find troubled to answer my questions, can you guys at least recommend me some sources or documentations so that i can look into myself?

Thanks in advance.
Chuan

Hi All,

Thanks for the reply. I managed to identify and fixed a few errors in my implementation.

However, there are a few errors that i am not sure what is it indicating.
For starters, i think i should explain what i am trying to achieve.

I am actually working on MIPS backend to generate smaller set of MIPS Instructions compared to its existing supported instructions.
Currently, i am working on shifting instructions.

Take an example:
A typical mips sllv syntax goes in this manner:

sllv $reg1,$reg2,$reg3

The $reg3 contains the shifting amount. Only the LSB 5 bit will be used.
The $reg2 contains the data to be shifted.
The $reg1 contains the data after shifting is performed.

What i want to achieve is to expand sllv instruction to the following routine:

andi $reg3,$reg3,0x1f //To mask the 5 bit LSB shifting amount
#BB_1: beq $reg3,$zero,#BB_2 //Branch out from basic block if shifting amount is zero
sub $reg3,$reg3,1 //To subtract 1 from the shifting amount
sll $reg2,$reg2,1 //Shift by 1 bit
j #BB_1 //Branch back to the begining of the routine
#BB_2: addu $reg1,$reg2,$zero //Transfer the completed shift data to the original destination register

Since you guys mentioned that the MI are represented in MachineSSA form, i imagined my routine represented by virtual registers would look something like this:

andi $vreg3,$vreg3,0x1f
#BB_1: beq $vreg3,$zero,#BB_2
sub $vreg3,$vreg3,1
sll $vreg2,$vreg2,1
j #BB_1
#BB_2: addu $vreg1,$vreg2,$zero

With the -verify-machineinstrs invoked together with llc, the errors i encountered are as follow:

  1. *** Bad machine code: Explicit definition marked as use ***
  • function: main
  • basic block: BB#1 entry (0x546e600)
  • instruction: SUB
  • operand 0: %vreg3

That usually means that when you created your instruction and added the register operand to it you forgot to set the RegisterDefined state.

2.*** Bad machine code: Explicit definition marked as use ***

  • function: main
  • basic block: BB#1 entry (0x546e600)
  • instruction: SLL
  • operand 0: %vreg2

Ditto.

3.*** Bad machine code: Non-terminator instruction after the first terminator ***

  • function: main
  • basic block: BB#1 entry (0x4911600)
  • instruction: SUB
    First terminator was: BEQ %vreg3, %ZERO, <BB#2>, %AT; GPR32:%vreg3

That means you inserted your instructions after the jump of the basic block. Use getFirstTerminator as insertion point.

4.*** Bad machine code: Non-terminator instruction after the first terminator ***

  • function: main
  • basic block: BB#1 entry (0x4632600)
  • instruction: SLL
    First terminator was: BEQ %vreg3, %ZERO, <BB#2>, %AT; GPR32:%vreg3

Ditto.

Hi All,

Thanks for the reply. I managed to identify and fixed a few errors in my implementation.

However, there are a few errors that i am not sure what is it indicating.
For starters, i think i should explain what i am trying to achieve.

I am actually working on MIPS backend to generate smaller set of MIPS Instructions compared to its existing supported instructions.
Currently, i am working on shifting instructions.

Take an example:
A typical mips sllv syntax goes in this manner:

         sllv $reg1,$reg2,$reg3

The $reg3 contains the shifting amount. Only the LSB 5 bit will be used.
The $reg2 contains the data to be shifted.
The $reg1 contains the data after shifting is performed.

What i want to achieve is to expand sllv instruction to the following routine:

                 andi $reg3,$reg3,0x1f //To mask the 5 bit LSB shifting amount
#BB_1: beq $reg3,$zero,#BB_2 //Branch out from basic block if shifting amount is zero
                 sub $reg3,$reg3,1 //To subtract 1 from the shifting amount
                 sll $reg2,$reg2,1 //Shift by 1 bit
                 j #BB_1 //Branch back to the begining of the routine
#BB_2: addu $reg1,$reg2,$zero //Transfer the completed shift data to the original destination register

Since you guys mentioned that the MI are represented in MachineSSA form, i imagined my routine represented by virtual registers would look something like this:

                andi $vreg3,$vreg3,0x1f
#BB_1: beq $vreg3,$zero,#BB_2
                sub $vreg3,$vreg3,1
                sll $vreg2,$vreg2,1
                j #BB_1
#BB_2: addu $vreg1,$vreg2,$zero

Hi Chuan,

In your example, you should not try to define $vreg3 many times. In SSA form, each value should be defined only once.
You need to create some new virtual registers and make sure each virtual register will be defined only once.
In your example you may need PHI node. Which is an important concept in SSA. You may need to google it to teach yourself on this.
And you should not insert normal arithmetic instructions between two branch/jump instructions. In your example, there are sub/sll between beq/j. which is invalid.
A ‘basic block’ should only terminate(br/jump) at the last one or two instructions.
llc has many useful options besides verify instructions. It also support printing machine IR before/after each pass which is also very useful. Try ‘llc --help’

Ruiling

Hi All,

Thank you all for your advices. (Especially Ruiling, your phi nodes hints solves pretty much all of my problems!)
Currently,i am facing issue with the optimization.

One of my redefined basic block is as below:

#BB_2: sub $vreg3,$vreg3,1
sll $vreg2,$vreg2,1
j #BB_1

The implementation works fine for O0 invocation.
But for O1,O2,O3, the instruction is reordered.
The j instruction is reordered to execute before sll:

#BB_2: sub $vreg3,$vreg3,1
j #BB_1
sll $vreg2,$vreg2,1

I am guessing the optimization reordered sll in the jump delay slot, and the instruction in jump delay slot is assumed to be executed everytime?
Is there a way to force the j and sll to be in-order as shown in my previous basic block even with the optimization?
Or llc accepts parameters to turn off mips jump delay slot?

Chuan.

Hi All,

I found few ways that could solve my problem:
1.I found the -disable-mips-delay-filler from the llc hidden options. This option will not reorder my instructions below the branch delay slots but fills it with nop’s
2.If i were to partition the jump instruction into another basic block as below, it will not reorder my instruction into the delay slot too.

#BB_2: sub $vreg3,$vreg3,1
sll $vreg2,$vreg2,1

#BB_3: j #BB_1

The 1st method seems to be viable, but it will generate nop’s which increases the code size. Plus, i have to invoke it manually with llc everytime, else it won’t work.
The 2nd method seems to be more automated compared to 1st method, but its more like a quick hack, rather than a formal way to solve the reordering issue.

Is there any other way to disable the reordering? Maybe through the coding instead of the invocation options of llc?

Chuan.