Question about an unusual jump instruction

Dear all,

I'm working on an exploratory backend on llvm. In the instruction set I'm using I have an instruction (called DECJNZ) that decrements a register and, if the decremented value is not zero, jumps (with a relative jump) to a given offset.

I've described in tablegen this instruction as follow:

def DECJNZ : Instruction {
let Namespace = "MyTarget";
let OutOperandList = (outs GprRegs:$R0);
let InOperandList = (ins GprRegs: $R1, imm16:$dest);
let AsmString = "DECJNZ $R0, $dest";
let isBranch = 1;
let isTerminator = 1;
let Constraints = "$R1 = $R0";
let Defs = [SR];
}

I would like to create an optimization pass to make countable loops faster by using this instruction.

The simplest loop that I would like to optimize is like:

//////////////////////////
int i = a;
do {
   // loop body
   --i;
} while (i != 0);
//////////////////////////

After code selection I've something like:

BB0:
   %vreg0<def> = COPY %R0; // R0 contains 'a'
   J <#BB1>
BB1:
   %vreg1<def> = PHI %vreg0, <#BB0>, %vreg3, <#BB3>
   J <#BB2>
BB2:
   // loop body
BB3:
   %vreg3<def> = ADDI %vreg1<kill>, 1
   CMPNE %vreg3, 0, %SR<implicit,def>
   JNZ <#BB1>
   J <#BB4>
BB4:
   // end

With the optimization pass I replace the decrement, comparison and conditional jump with the DECJNZ. The resulting code will be:

BB0:
   %vreg0<def> = COPY %R0; // R0 contains 'a'
   J <#BB1>
BB1:
   %vreg1<def> = PHI %vreg0, <#BB0>, %vreg3, <#BB3>
   J <#BB2>
BB2:
   // loop body
BB3:
   %vreg3<def> = DECJNZ %vreg1<kill>, <#BB1>, %SR<implicit,def>
   J <#BB4>
BB4:
   // end

A first problem was related to PHIElimination, while eliminating the PHI-node a copy was generated before the DECJNZ, because it's a terminator instruction, but the copy should use the value defined by the DECJNZ.

To solve this problem I wrote a preprocess pass which is run just before PHIElimination and change the opcode of PHIs that have at least one source alue generated by a DECJNZ. In this way it is ignored by the PHIElimination passes and then a pass run just after PHIElimination that 'lowers' in a custom way the marked PHIs and updates the information about live variables.

With some tests I had a second problem generated by the spilling of the register used as loop counter. A store instruction is generated after the definition of the value, so is inserted between the DECJNZ and J.

I think that with another pass I can try to manually move the spill-store instruction at the beginning of the destination basic block, but I think it's not enough to preserve the semantics of the code.

Is my approach correct? Does it exist a cleaner and more elegant way to support this kind of instruction? I tried to look for a similar instruction in other targets, but I've found nothing...

Thanks in advance for future replies.

Regards,

Michele Scandale

See PPCCTRLoops.cpp in the PPC backend.

-Eli

I took a quick look to PPCCTRLoops.cpp, but I don't think it's the same of my case. The instruction I have defines explicitly a GPR register, while the BDNZ defines and uses an implicit dedicated register. I based my optimization pass on what seen in Hexagon target too, but due to the fact that I have an explicit GPR as loop counter, my instruction defines a virtual register and the fact that the instruction is also a terminator creates all the problems I listed in my previous mail.

Regards,
Michele Scandale

Hi Michele,

As you've already figured out on your own, LLVM's register allocator passes don't support terminators that write virtual registers.

You could try forming your DECJNZ instructions after register allocation.

/jakob