llvm-stress crash

Hi,

Using llvm-stress, I got a crash after Post-RA pseudo expansion, with machine verifier.

A 128 bit register

%vreg233:subreg_l32<def,read-undef> = LLCRMux %vreg119; GR128Bit:%vreg233 GRX32Bit:%vreg119

gets spilled:

%vreg265:subreg_l32<def,read-undef> = LLCRMux %vreg119; GR128Bit:%vreg265 GRX32Bit:%vreg119
ST128 %vreg265, <fi#10>, 0, %noreg; mem:ST16[FixedStack10](align=8) GR128Bit:%vreg265

-> regalloc

%R5L<def> = LLCRMux %R6L, %R4Q<imp-def>
ST128 %R4Q<kill>, <fi#10>, 0, %noreg; mem:ST16[FixedStack10](align=8)

-> pseudo expansion

%R5L<def> = LLCR %R6L
STG %R4D<kill>, %R15D, 200, %noreg; mem:ST16[FixedStack7](align=8)
STG %R5D<kill>, %R15D, 208, %noreg; mem:ST16[FixedStack7](align=8)

*** Bad machine code: Using an undefined physical register ***
- function: autogen_SD29355
- basic block: BB#19 CF257 (0x4cb6b00)
- instruction: STG
- operand 0: %R4D<kill>

So, it seems that vreg233 had only the low 64 bits defined to begin with, the upper 64 are undefined.
Then this interval is spilled with the ST128 pseudo. After expansion, both 64 bit parts are spilled,
where naturally the upper half is undefined.

I wonder how this should be resolved. It seems there should be some flag (read-undef?) on the ST128 or something after Virtual Register Rewriter.

Or is the target actually supposed to check liveness of register during post-RA pseudo expansion?

/Jonas

bin/llc -mtriple=s390x-linux-gnu -mcpu=z13 -verify-machineinstrs

BMC_UsingUndefPhysReg.ll (18.3 KB)

Yes, in the sense that you shouldn’t introduce uses of undefined physical registers. This seems like somewhat of a special case (in that you’re taking an input register and breaking it up into subregisters explicitly). Avoiding inserting the store here also seems like a useful optimization. That having been said, you can force the register to be defined. I recall doing something similar in the PowerPC backend in the CR-bit spilling code – in lib/Target/PowerPC/PPCRegisterInfo.cpp there’s this: BuildMI(MBB, II, dl, TII.get(TargetOpcode::KILL), getCRFromCRBit(SrcReg)) .addReg(SrcReg, getKillRegState(MI.getOperand(0).isKill())); -Hal

The annoying problem here is that we have no way to mark a register as partially undef in LLVM, so when you break bigger uses apart you don't know that one of the halves is completely undefined now.
Mindlessly adding read-undef to both register is wrong however (as that semantically means you can do whatever with the use like replacing it with a completely different register). As recomputing liveness at this point to know whether one of the halfes is completely undefined is expensive and complicated I added an exception to the MachineVerifier for this which most other targets should be using:

        // If there is an additional implicit-use of a super register we stop
        // here. By definition we are fine if the super register is not
        // (completely) dead, if the complete super register is dead we will
        // get a report for its operand.

basically you leave a implicit use of the super register around to indicate that your original/higher level operation is about that super register and the verifier shouldn't care if a subregister is undefined.

- Matthias

Hi,

Thanks for explanations!

Yes, in the sense that you shouldn't introduce uses of undefined physical registers. This seems like somewhat of a special case (in that you're taking an input register and breaking it up into subregisters explicitly). Avoiding inserting the store here also seems like a useful optimization. That having been said, you can force the register to be defined. I recall doing something similar in the PowerPC backend in the CR-bit spilling code -- in lib/Target/PowerPC/PPCRegisterInfo.cpp there's this:

  BuildMI(MBB, II, dl, TII.get(TargetOpcode::KILL),
          getCRFromCRBit(SrcReg))
          .addReg(SrcReg, getKillRegState(MI.getOperand(0).isKill()));

-Hal

Interesting - so you are using a KILL instruction to actually mark SrcReg as live via the first def-operand? I might have thought an IMPLICIT_DEF would make more sense at least in terms of the instruction name.

Hi,

Thanks for explanations!

Yes, in the sense that you shouldn't introduce uses of undefined physical registers. This seems like somewhat of a special case (in that you're taking an input register and breaking it up into subregisters explicitly). Avoiding inserting the store here also seems like a useful optimization. That having been said, you can force the register to be defined. I recall doing something similar in the PowerPC backend in the CR-bit spilling code -- in lib/Target/PowerPC/PPCRegisterInfo.cpp there's this:

  BuildMI(MBB, II, dl, TII.get(TargetOpcode::KILL),
          getCRFromCRBit(SrcReg))
          .addReg(SrcReg, getKillRegState(MI.getOperand(0).isKill()));

-Hal

Interesting - so you are using a KILL instruction to actually mark SrcReg as live via the first def-operand? I might have thought an IMPLICIT_DEF would make more sense at least in terms of the instruction name.

I don't think that IMPLICIT_DEF works that late in the pipeline (ProcessImplicitDefs.cpp gets rid of them?). I recall using KILL because this is how ExpandPostRAPseudos.cpp lowers subreg-to-reg.

  -Hal