Changes in MachineInstruction/Peephole Optimizer?

Hi all,

The register allocator that I implemented is failing in the LLVM cvs version, but not in LLVM 1.1. The generated code fails a check in the x86 peephole optimizer:

llc: PeepholeOptimizer.cpp:128: bool <unnamed>::PH::PeepholeOptimize(llvm::MachineBasicBlock&, llvm::ilist_iterator<llvm::MachineInstr>&): Assertion `MI->getNumOperands() == 2 && "These should all have 2 operands!"' failed.

I've tracked it down to a difference between LLVM cvs and LLVM 1.1 in PeepholeOptimizer.cpp:

In LLVM 1.1, PeepholeOptimizer.cpp: line 70:

  case X86::ADDri16: case X86::ADDri32:
  case X86::SUBri16: case X86::SUBri32:
  case X86::IMULri16: case X86::IMULri32:
  case X86::ANDri16: case X86::ANDri32:
  case X86::ORri16: case X86::ORri32:
  case X86::XORri16: case X86::XORri32:
    assert(MI->getNumOperands() == 3 && "These should all have 3 operands!");

While in the LLVM cvs version, PeepholeOptimizer, line 123:

  case X86::ADDri16: case X86::ADDri32:
  case X86::SUBri16: case X86::SUBri32:
  case X86::ANDri16: case X86::ANDri32:
  case X86::ORri16: case X86::ORri32:
  case X86::XORri16: case X86::XORri32:
    assert(MI->getNumOperands() == 2 && "These should all have 2 operands!");

So, 1.1 and cvs expect a different number of operands for the same machine instructions. Do I have to change something in the register allocator to account for this? Any idea why it's working in 1.1 but not in the cvs version?

Thanks
-Anshu

Anshu Dasgupta wrote:

So, 1.1 and cvs expect a different number of operands for the same machine instructions. Do I have to change something in the register allocator to account for this? Any idea why it's working in 1.1 but not in the cvs version?

This is due to a change in the requirements on the register allocators. In cvs, register allocators need to make sure two-address instructions are correctly set up *and* remove the extra operand. This is very useful when debugging passes that run after the register allocator, as they no longer need to check the validity of two-address instructions.

As an example, in LLVM 1.1 you had:

A = B + C

and for the x86 you had to make sure that A and B are the same register, so you ended up with an instruction like:

D = D + C

with some compensation code added before it.

In the cvs world you simply need to remove the first or the second operand (both name the same register, D) and correctly mark the remaining D as both a def and a use. So you will end up with:

D += C
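
To make the operand bookkeeping concrete, here is a minimal sketch of that rewrite. It uses toy types: ToyOperand, ToyInstr and collapseToTwoAddress are names made up for this example and are not LLVM's MachineInstr/MachineOperand API. It assumes the allocator has already assigned the destination and the first source to the same register.

```cpp
#include <cassert>
#include <vector>

// Toy stand-ins for illustration only; NOT LLVM's MachineOperand/MachineInstr.
struct ToyOperand {
  int value;   // physical register number, or the constant when isImm is true
  bool isImm;  // true for an immediate operand
  bool isDef;  // the instruction writes this operand
  bool isUse;  // the instruction reads this operand
};

struct ToyInstr {
  std::vector<ToyOperand> ops;
};

// Rewrite "D = D op C" (three operands) into "D op= C" (two operands).
// Precondition: operands 0 and 1 are registers and name the same register.
inline void collapseToTwoAddress(ToyInstr &mi) {
  assert(mi.ops.size() == 3 && "expected dst, src1, src2");
  assert(!mi.ops[0].isImm && !mi.ops[1].isImm && "dst and src1 must be registers");
  assert(mi.ops[0].value == mi.ops[1].value && "dst and src1 must share a register");

  // Drop the redundant first source; what was operand 2 becomes operand 1.
  mi.ops.erase(mi.ops.begin() + 1);

  // The surviving register operand is now both written and read.
  mi.ops[0].isDef = true;
  mi.ops[0].isUse = true;
}
```

After the call, the instruction has exactly two operands, which is the shape the assertion in the cvs PeepholeOptimizer.cpp checks for.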

You may want to look at the TwoAddressInstructionPass, which already does this for you. It transforms:

     A = B op C

to:

     A = B
     A op= C

Care must be taken if your allocator expects SSA form, because the TwoAddressInstructionPass breaks it: after the transformation A is defined twice, once by the copy and once by the op= instruction.
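
As a rough illustration of the A = B op C rewrite and of why it breaks SSA, here is a small sketch on a toy instruction list. ThreeAddr, TwoAddr, lowerToTwoAddress and countDefs are hypothetical names for this example; this is not how the real TwoAddressInstructionPass is written, and it deliberately ignores the case where the destination is also the second source.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy IR for illustration only; not LLVM's classes.
struct ThreeAddr {
  std::string dst, lhs, op, rhs;  // dst = lhs op rhs
};

struct TwoAddr {
  enum Kind { Copy, OpAssign } kind;
  std::string dst;  // written by Copy; read and written by OpAssign
  std::string op;   // empty for Copy
  std::string src;
};

// Rewrite each "A = B op C" as "A = B" followed by "A op= C".
inline std::vector<TwoAddr> lowerToTwoAddress(const std::vector<ThreeAddr> &in) {
  std::vector<TwoAddr> out;
  for (const ThreeAddr &i : in) {
    // Simplification: "A = B op A" would be clobbered by the copy below,
    // so this sketch rejects it instead of handling it.
    assert(!(i.dst == i.rhs && i.dst != i.lhs) && "dst == rhs is not handled");
    if (i.dst != i.lhs)  // no copy needed when it is already "A = A op C"
      out.push_back(TwoAddr{TwoAddr::Copy, i.dst, "", i.lhs});
    out.push_back(TwoAddr{TwoAddr::OpAssign, i.dst, i.op, i.rhs});
  }
  return out;
}

// Count how many instructions write the given register.
inline int countDefs(const std::vector<TwoAddr> &code, const std::string &reg) {
  int n = 0;
  for (const TwoAddr &i : code)
    if (i.dst == reg)
      ++n;
  return n;
}
```

Running lowerToTwoAddress on the single instruction A = B + C yields two instructions, and countDefs reports two definitions of A, which is exactly the property an SSA-based allocator cannot assume away.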