deleting or replacing a MachineInst

I’m writing a peephole pass and I’m done with the X86_64 instruction level detail work. But I’m having difficulty with the basic block surgery of replacing the old MachineInst.

The peephole pass gets called per MachineFunction and then iterates over each MachineBasicBlock and in turn over each MachineInst. When it finds an instruction which should be replaced, it builds a new instruction:

NewMI = BuildMI(*MBB, MBBI, MBBI->getDebugLoc(), TII->get(X86::opcode))

.addReg(X86::new_reg, kill)
.addImm(i);

This works and it correctly places the new instruction just before the old instruction in the assembly output. So far so good.

Now I have to remove the old instruction. But everything I try crashes LLVM, either immediately or eventually. Various incantations which haven’t worked.

MBB->remove_instr(OldMI);
OldMI->removeFromParent();
MBB.erase(OldMI);

I should add that there are flags

// %EFLAGS is getting copied automatically
// %RDX<imp-use,kill> is not getting copied (when it appears)

So, any advice to peephole adding/deleting or just replacing MachineInst’s ? I’ve looked over the other peephole optimizers. None seem to direct replacement. You’d think there’d be a MachineInst method.

Don’t you have a problem of iterator invalidation?
How do you loop through the block?
If you can post the full body of the loop (if not too large), or a simpler version that has the bug.

This seems a very natural approach but I probably am having a trouble with
the iterator invalidation. However, looking at other peephole optimizers
passes, I couldn't see how to do this:

#define BUILD_INS(opcode, new_reg, i) \
  BuildMI(*MBB, MBBI, MBBI->getDebugLoc(), TII->get(X86::opcode)) \
    .addReg(X86::new_reg, kill).addImm(i)

  for (MachineFunction::iterator MFI = MF.begin(), MFE = MF.end();
       MFI != MFE; ++MFI) {
    MachineBasicBlock* MBB = MFI;

    for (MachineBasicBlock::iterator MBBI = MBB->begin();
        MBBI != MBB->end(); ++MBBI) {
      MachineInstr *NewMI = NULL;
      OldMI = MBBI;

      // %EFLAGS<imp-def> is getting copied
      // %RDX<imp-use,kill> is not getting copied (when it appears)
      switch (OldMI->getOpcode()) {
      default: continue;
      // ....

      case X86::BT64ri8:
      case X86::BT32ri8:
      case X86::BT16ri8: {
          assert(OldMI->getNumOperands() >= 2);
          MachineOperand &Reg = OldMI->getOperand(0);
          MachineOperand &Imm = OldMI->getOperand(1);
          assert(Reg.isReg());
          assert(Imm.isImm());
          imm = Imm.getImm();

          if (imm >= 32)
            continue;

          kill = getKillRegState(Reg.isKill());

          switch (Reg.getReg()) {
          default: assert(false);
          case X86::RAX:
          case X86::EAX:
          case X86::AX:
            if (imm < 8) NewMI = BUILD_INS(TEST8i8, AL, 1 << imm);
            else if (imm < 16) NewMI = BUILD_INS(TEST8ri, AH, 1 << (imm -
8));
            else NewMI = BUILD_INS(BT32ri8, EAX, imm);
            break;

          // ...
          }
        }
        break;
      }

      // NewMI has been inserted before OldMI
      if (NewMI != NULL) {
// MBB->remove_instr(OldMI); // I've tried these (and others)
// OldMI->removeFromParent();
// MBB.erase(OldMI);
        NumX86Peepholes++;
      }
    }

The loop header needs to be modified, because MBBI will be invalidated when you remove the instruction:

    for (MachineBasicBlock::iterator MBBI = MBB->begin();
        MBBI != MBB->end(); ) {
      MachineInstr *NewMI = NULL;
      OldMI = MBBI;
      ++MBBI

The macro BUILD_INS has to be updated to use OldMI instead of MBBI.

Also I’m unsure why OldMI is not local to this loop body.

There are 11 BuildMI() functions in MachineInstrBuilder.h including four
using the iterator and one using an instruction. But I just don't think
that's it. The creation of the new instruction works fine (works fine with
OldMI as well) and the new instruction is present in the assembly output.

The problem is removing the old instruction correctly.

The loop header needs to be modified, because MBBI will be invalidated

when you remove the instruction:

So if I remove the old instruction with something like:

  MBB->remove_instr(OldMI);

I could just start the loop over or is there a more better way to
invalidate the MBBI iterator?
Basic blocks aren't that long (on average 4-6 instructions) and this
peephole isn't that common either.

I don’t understand anything of what you say, sorry :slight_smile:

I think I sent you a fix in my previous email, did you try it?

Mehdi

I made the change to the BuildMI() call. Again, I don’t think that matters.

#define BUILD_INS(opcode, new_reg, i)
BuildMI(*MBB, OldMI, MBBI->getDebugLoc(), TII->get(X86::opcode))
.addReg(X86::new_reg, kill).addImm(i)

I didn’t completely understand your other proposed change:
​for (MachineBasicBlock::iterator MBBI = MBB->begin();
MBBI != MBB->end(); ) {
MachineInstr *NewMI = NULL;
OldMI = MBBI;
++MBBI;

I think you’re saying with ++MBBI to step past the old instruction.
This seems faster speedwise but more of a hack than just restarting the loop.
But I’ll try it. It implies a certain knowledge of the iterator and MBB.

I accidentally touched MachineInstr.h and the rebuild is extracting its punishment.

I don’t see this is as a hack.
I believe it is a widespread pattern when iterating on a linked-list and having to delete elements.
You’ll see this pattern everywhere in LLVM.

Mehdi

Perhaps a better more neutral term is leaky abstraction.

The MBB iterator provides a documented abstraction. Look it up. Use it. Done. Do not worry about how it works. Then along comes a problem which isn’t within that abstraction’s abilities. So people come up with a solution/workaround/xxxx which then becomes an accepted pattern and they use that. This works.

The trouble is when other people who just read the documentation expect an abstraction and its methods to just work. And they can’t figure out why. We don’t really want to understand the implementation and search for patterns, albeit widespread. It should be in the abstraction. It should be in the documentation.

Now of course, I’m going to use this approach because it is faster than restarting the loop. However, if it’s so widespread and it does indeed work, why not add it to the abstraction? Existing code will continue to work but then this will be easier the next time for newbies to figure out in the future.

Just FYI, this isn’t anything specific to MBBs or any other LLVM construct. This problem, and the patterns for handling it, are common to all iterator access to C++ containers, including the STL ones.

—Owen