How to prevent instructions from being deleted by DeadMachineInstructionElim

As part of a calling convention I’m implementing, I need to install a pointer to a stack object as the return address. To do this I call getCopyToReg(..., RISCV::X1, Ptr).
After instruction scheduling, the result looks like this (I included the register-clearing instruction PseudoCClear and the call instruction PseudoUCCALL for context):

%22:gpcr = CIncOffsetImm %stack.0, 0
$c1 = COPY %22:gpcr
PseudoCClear <regmask $c0 $c1 $c2 $c8 $x0 $x1 $x2 $x8>, implicit-def $c2
PseudoUCCALL target-flags(riscv-ccall) @b, <regmask $c0 $c1 $x0 $x1>, implicit-def $c2

The two instructions that install the pointer are then deleted:

DeadMachineInstructionElim: DELETING: $c1 = COPY %22:gpcr
DeadMachineInstructionElim: DELETING: %22:gpcr = CIncOffsetImm %stack.0, 0

How can I prevent the deletion of these instructions?

You need to add the register as an implicit use of the call instruction. For RISC-V, this is the code that does it for normal calls: RISCVISelLowering.cpp at commit 3605ebca32fc42d01a54eea00bb4bf8049dca214 in llvm/llvm-project on GitHub.

Thanks for your answer!
I tried adding the register as an operand of the call, but this also adds an implicit-def of $c1 to PseudoCClear:

PseudoCClear <regmask $c0 $c1 $c2 $c8 $x0 $x1 $x2 $x8>, implicit-def $c1, implicit-def $c2
PseudoUCCALL target-flags(riscv-ccall) @b, <regmask $c0 $c1 $x0 $x1>, implicit $c1, implicit-def $c2

Adding $c1 to or removing it from the PseudoCClear register mask does not change this.
The end result is the same: my instructions get deleted.

For reference, this is the definition of PseudoCClear:

let Predicates = [HasCheri, IsCapMode], hasSideEffects = 1 in
def PseudoCClear : Pseudo<(outs), (ins), [(riscv_clear_regs)]>;

I don’t think there’s enough to go on here. It all depends on how you tried to add the register and how both of those instructions get generated.

The register is added like you suggested:
Ops.push_back(DAG.getRegister(RISCV::C1, PtrVT));

PseudoCClear corresponds to the RISCVISD::CLEAR_REGS SDNode type. It gets generated in ISelLowering as follows:

SmallVector<SDValue, 8> Ops;
const uint32_t *ClearMask = getCallClearMask(ArgLocs);
Ops.push_back(Chain);
Ops.push_back(DAG.getRegisterMask(ClearMask));
if (Glue.getNode()) Ops.push_back(Glue);
SDVTList NodeTys = DAG.getVTList(MVT::Other, MVT::Glue);
Chain = DAG.getNode(RISCVISD::CLEAR_REGS, DL, NodeTys, Ops);
Glue = Chain.getValue(1);

PseudoUCCALL corresponds to the RISCVISD::UNINIT_CALL node type, which is a drop-in replacement for RISCVISD::CALL:
Chain = DAG.getNode(RISCVISD::UNINIT_CALL, DL, NodeTys, Ops);

PseudoCClear and PseudoUCCALL are pattern-matched from their respective target-specific SDNodes:

def SDT_RISCVCapCall        : SDTypeProfile<0, -1, [SDTCisVT<0, CLenVT>]>;
def riscv_uninit_call     : SDNode<"RISCVISD::UNINIT_CALL", SDT_RISCVCapCall,
                                   [SDNPHasChain, SDNPOptInGlue, SDNPOutGlue,
                                    SDNPVariadic]>;
let Predicates = [HasCheri, IsCapMode, IsPureCapABI] in {
let isCall = 1 in
def PseudoUCCALL : Pseudo<(outs), (ins cap_call_symbol:$func), []>;
def : Pat<(riscv_uninit_call tglobaladdr:$func),
          (PseudoUCCALL tglobaladdr:$func)>;
def : Pat<(riscv_uninit_call texternalsym:$func),
          (PseudoUCCALL texternalsym:$func)>;
} // Predicates = [HasCheri, IsCapMode, IsPureCapABI]

def SDT_RISCVClearRegs : SDTypeProfile<0, 0, []>;
def riscv_clear_regs : SDNode<"RISCVISD::CLEAR_REGS", SDT_RISCVClearRegs,
[SDNPHasChain, SDNPOptInGlue, SDNPOutGlue, SDNPVariadic]>;
let Predicates = [HasCheri, IsCapMode], hasSideEffects = 1 in
def PseudoCClear : Pseudo<(outs), (ins), [(riscv_clear_regs)]>;

This leaves the DAG looking like this after ISel:

      t85: iFATPTR128 = CIncOffsetImm TargetFrameIndex:iFATPTR128<0>, TargetConstant:i64<0>
    t35: ch = CopyToReg t31, Register:iFATPTR128 $c1, t85
  t38: ch,glue = PseudoCClear RegisterMask:Untyped, t35
  t40: ch,glue = PseudoUCCALL TargetGlobalAddress:iFATPTR128<void () addrspace(200)* @b> 0 [TF=17], Register:iFATPTR128 $c1, RegisterMask:Untyped, t38, t38:1

Please let me know if there is any other useful information I can provide.

This looks weird. The rough equivalent for a normal call is:

  t6: ch,glue = CopyToReg t4, Register:i32 $x10, t14
  t9: ch,glue = PseudoCALL TargetGlobalAddress:i32<ptr @bar> 0 [TF=2], RegisterMask:Untyped, Register:i32 $x10, t6, t6:1

And there are some noticeable differences:

  • No Register operand, and code to add it is missing from the C++ code you pasted.
  • There’s no glue between the call and the CopyToReg. I think you just pass an empty SDValue() to getCopyToReg and it’ll produce one with glue that the call can then use.

I’m not sure if either of these is actually causing your problem (I really can’t explain how the register became a def, I’ve never seen that before), but it’s probably worth trying to be as similar to a normal call as possible.

What ended up solving the problem was:

  • Adding $c1 as an argument to PseudoCClear;
  • Annotating PseudoUCCALL with Uses = [C1].

This, however, was not what I wanted: PseudoCClear should not touch $c1 and does not use it, so passing it as an argument would be confusing.

I ended up reworking the code so the copy for $c1 is emitted after PseudoCClear, which also solved the problem.
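
For anyone who finds this thread later, the liveness reasoning behind the three situations above can be reproduced with a small standalone sketch. It is a toy model, not LLVM code: ToyInstr, findDead and makeBlock are hypothetical names made up for illustration, register-mask clobbers are simplified to plain defs, and nothing is assumed to be live out of the block. The real DeadMachineInstructionElim pass is more involved, but the bottom-up walk roughly captures why a def of $c1 between the COPY and the call makes the COPY look dead.

```cpp
#include <set>
#include <string>
#include <vector>

// Toy stand-in for a machine instruction (hypothetical, not an LLVM type).
struct ToyInstr {
  std::string Name;
  std::vector<std::string> Defs;  // explicit and implicit defs
  std::vector<std::string> Uses;  // explicit and implicit uses
  bool MustKeep;                  // models isCall / hasSideEffects
};

// Walk the block bottom-up while tracking live registers. Returns the names
// of the instructions a simple dead-code pass would delete, in deletion order.
std::vector<std::string> findDead(const std::vector<ToyInstr> &Block) {
  std::set<std::string> Live;  // assumption: nothing is live out of the block
  std::vector<std::string> Deleted;
  for (auto It = Block.rbegin(); It != Block.rend(); ++It) {
    bool AnyDefLive = false;
    for (const std::string &D : It->Defs)
      AnyDefLive |= Live.count(D) != 0;
    if (!It->MustKeep && !AnyDefLive) {
      Deleted.push_back(It->Name);
      continue;  // a deleted instruction keeps none of its operands alive
    }
    for (const std::string &D : It->Defs) Live.erase(D);
    for (const std::string &U : It->Uses) Live.insert(U);
  }
  return Deleted;
}

// Builds the four-instruction sequence discussed in this thread.
std::vector<ToyInstr> makeBlock(bool CallUsesC1, bool ClearDefsC1,
                                bool CopyAfterClear) {
  ToyInstr Inc{"CIncOffsetImm", {"%22"}, {}, false};
  ToyInstr Copy{"COPY", {"$c1"}, {"%22"}, false};
  ToyInstr Clear{"PseudoCClear", {"$c2"}, {}, true};
  ToyInstr Call{"PseudoUCCALL", {"$c2"}, {}, true};
  if (ClearDefsC1) Clear.Defs.push_back("$c1");  // the unwanted implicit-def
  if (CallUsesC1) Call.Uses.push_back("$c1");    // implicit use on the call
  if (CopyAfterClear) return {Clear, Inc, Copy, Call};
  return {Inc, Copy, Clear, Call};
}
```

Under these assumptions, findDead(makeBlock(false, false, false)) and findDead(makeBlock(true, true, false)) both report COPY and then CIncOffsetImm as dead, matching the first two dumps in this thread, while findDead(makeBlock(true, false, true)), the reordered version, reports nothing.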

Many thanks for your help TNorthover!