Questions about live-ins, live-outs

Hi all,

I’m trying to start using the LiveOuts-exporting feature of the new stackmap/patchpoint intrinsics, and I’m running into some issues where I seem to be getting the wrong set of registers. This is my first foray into CodeGen, so my understanding of everything I’m about to say is pretty hazy, but I believe that the stackmap/patchpoint functionality simply tacks on the existing LiveOuts that are already being calculated, which makes me think the issues I’m about to describe are about the LiveOuts calculation rather than being stackmap/patchpoint-specific.

I’m running into an issue where I get some registers in the LiveOuts set that are clearly not live after the call; specifically, I’m sometimes get that %rdi or %rsi are live after a function call which uses a normal C-calling convention. Or maybe I’m misunderstanding the meaning of LiveOuts; I’m assuming that they refer to registers that contain values that will be used later.

Here’s a boiled-down version of the IR:

define void @f(i64) {
%2 = load %0** @n, align 8
%3 = getelementptr inbounds %0* %2, i64 0, i32 0
%4 = call i1 @foo()
br i1 %4, label %9, label %5

; :5 ; preds = %5, %1
%6 = phi i64* [ %3, %5 ], [ %3, %1 ]
%7 = tail call i64 (i64, i32, i8*, i32, …)* @llvm.experimental.patchpoint.i64(i64 116, i32 13, i8* inttoptr (i64 12346 to i8*), i32 3, i64 %0, i64 1, i64* %6)
%8 = call i1 @foo()
br i1 %8, label %9, label %5

; :9 ; preds = %5, %1
ret void
}

And here’s the generated assembly listing:

f:
pushq %r14
pushq %rbx

pushq %rax

movq %rdi, %r14

movq n(%rip), %rbx
jmp .LBB0_2
.align 16, 0x90
.LBB0_1:
movl $1, %esi
movq %r14, %rdi
movq %rbx, %rdx
movabsq $12346, %r11

callq *%r11
.LBB0_2:
callq foo
testb $1, %al
je .LBB0_1
.LBB0_3:
addq $8, %rsp
popq %rbx
popq %r14
retq

It’s claiming that %rdi is live after the “callq %r11” at the end of LBB0_1, which based on my understanding of what the live outs are supposed to represent shouldn’t be true. Looking into it, it seems to be because it considers the live outs of LBB0_1 to include RDI; it seems to determine that because RDI is included in the live-ins of LBB0_2. Again, assuming my understanding of what live-ins are supposed to be is correct, that seems wrong.

Tracing through when RDI gets added to that set, it’s during block tail-merging. Specifically, it’s in the BranchFolder::MaintainLiveIns(), which does a forward propagation of the BB that I haven’t completely dug into; it’s working on this basic block:

BB#0: derived from LLVM BB %1
Live Ins: %RDI %R14 %RBX
PUSH64r %R14, %RSP, %RSP; flags: FrameSetup
PROLOG_LABEL <MCSym=.Ltmp0>
PUSH64r %RBX, %RSP, %RSP; flags: FrameSetup
PROLOG_LABEL <MCSym=.Ltmp1>
PUSH64r %RAX, %RSP, %RSP; flags: FrameSetup
PROLOG_LABEL <MCSym=.Ltmp2>
%R14 = COPY %RDI
%RBX = MOV64rm %RIP, 1, %noreg, ga:n, %noreg; mem:LD8[@n]
CALL64pcrel32 ga:foo, , %RSP, %RSP, %AL
TEST8ri %AL, 1, %EFLAGS
JE_4 <BB#2>, %EFLAGS
Successors according to CFG: BB#3(16) BB#2(16)

The register scavanger determines that rdi is a live out of this block [that doesn’t seem to be the case?] so the tail-merger adds it as a live-in of what will become LBB0_2 (seems weird as well, since they’re at exit of BB#0 doesn’t correspond to the entry of LBB0_2). At this point I’m pretty confused – am I misunderstanding what LiveOuts are supposed to be? Or does BranchFolder::MaintainLiveIns do the wrong thing? Or is the register scavanger calculating the liveness differently? I’m not that clear on how the liveness information is used at different points in the pipeline, and clearly something isn’t adding up for me, so if anyway could shed some light on this I’d much appreciate it…

Kevin

Did some more digging, and it seems like the issue isn’t with the tail merging or the register scavenger; it seems to be because the “%R14 = COPY %RDI” instruction is missing a kill flag on rdi. Tracing through, it gets removed in LiveRangeCalc::extendToUses, which has a comment “Clear all kill flags. They will be reinserted after register allocation by LiveIntervalAnalysis::addKillFlags().”

It looks like, though, that since this is a physical register, it doesn’t get added back during addKillFlags because that only looks at virtual registers. I tried changing the logic so that extendToUses() only removes the kill flags from virtual registers, which fixes my particular case, but it causes some test failures so I guess is probably not the right way to go. Does anyone have advice on how to proceed? Otherwise I guess I’ll just file a bug and move on.

Kevin