How to find real heap references using LLVM Stack Map

Hi all,

Let’s say I have an example like this:

    fn foo() {
        heapRef2 = getHeapRef();
        readStackMap();
    }
    fn main() {
        heapRef1 = getHeapRef();
        foo();
    }

When we start reading the stack map in foo(), the stack holds two heap references (heapRef1 and heapRef2). From the LLVM stack map we can get the location of heapRef2 as, let’s say, rsp + offset.

However, the offset of heapRef1 is given relative to the rsp or rbp of its own function (main), and at this moment those registers point into foo()'s stack frame. How, then, do we get the location of heapRef1?

I would highly appreciate your input on this.

Thank you,
Kavindu

This is out of scope for LLVM, but let me give you a high level summary.

Your GC needs to be able to walk all of the frames on the stack at the point of suspension. This can be done with libunwind, by writing your own stack crawler, etc.
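To make the "write your own stack crawler" option concrete, here is a minimal sketch in Python over a simulated stack. It assumes x86-64 code compiled with frame pointers, so that [rbp] holds the caller's saved rbp and [rbp + 8] holds the return address. The `memory` dictionary, all addresses, and the name `walk_frames` are made up for illustration; a real crawler would read the suspended thread's actual stack.

```python
# Toy model: "memory" maps an 8-byte-aligned address to the 64-bit word stored there.
# Assumption: every function keeps a frame pointer, so [rbp] is the caller's
# saved rbp and [rbp + 8] is the return address.

def walk_frames(memory, rbp):
    """Yield (return_pc, caller_rbp) for each frame, youngest first."""
    while rbp != 0:
        return_pc = memory[rbp + 8]
        caller_rbp = memory[rbp]
        yield return_pc, caller_rbp
        rbp = caller_rbp

# Hypothetical snapshot taken inside readStackMap(), called from foo(),
# which was called from main(). All values are invented.
memory = {
    0x7000: 0x7020, 0x7008: 0x401234,  # readStackMap frame: foo's rbp, return PC into foo
    0x7020: 0x7040, 0x7028: 0x401100,  # foo frame: main's rbp, return PC into main
    0x7040: 0x0,    0x7048: 0x400F00,  # main frame: end of chain, return PC into startup code
}

frames = list(walk_frames(memory, 0x7000))
```

Each yielded return PC is a candidate key into the stack map section; the saved rbp that comes with it identifies the frame that key describes.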

Once you have the ability to walk the stack, each return PC on the stack will correspond to a stack map entry in the stack map section. In this case, the return PC for the foo call will correspond to an entry in that section. All of the offsets in that entry refer to the frame corresponding to main. Your stack walker must be able to turn those into actual addresses.
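As a rough illustration of that lookup, the sketch below walks a simulated stack, uses each return PC to find a stack map entry, and turns the entry's register-relative offsets into addresses in the caller's frame. It assumes frame pointers are enabled, so the caller's rsp at the call site is the callee's rbp + 16 and the caller's rbp is the word stored at the callee's rbp. The `stack_map` dictionary is a simplified stand-in for the parsed stack map section (the real format has more location kinds), and every address and value is hypothetical.

```python
RSP, RBP = 7, 6  # DWARF register numbers on x86-64

def find_roots(memory, rbp, stack_map):
    """Return addresses of stack slots holding heap references, youngest frame first."""
    roots = []
    while rbp != 0:
        return_pc = memory[rbp + 8]
        # Register values the caller had at its call site (assumes frame pointers).
        caller_regs = {RSP: rbp + 16, RBP: memory[rbp]}
        # Offsets in the entry are relative to the caller's registers, not the current ones.
        for reg, offset in stack_map.get(return_pc, []):
            roots.append(caller_regs[reg] + offset)
        rbp = memory[rbp]
    return roots

# Hypothetical snapshot taken inside readStackMap(), called from foo(), called from main().
memory = {
    0x7000: 0x7020, 0x7008: 0x401234,  # readStackMap frame: foo's rbp, return PC into foo
    0x7018: 0xBEEF0002,                # heapRef2, in foo's frame
    0x7020: 0x7040, 0x7028: 0x401100,  # foo frame: main's rbp, return PC into main
    0x7038: 0xBEEF0001,                # heapRef1, in main's frame
    0x7040: 0x0,    0x7048: 0x400F00,  # main frame: end of chain
}

# Simplified stack map: return PC -> list of (base register, offset).
stack_map = {
    0x401234: [(RSP, 8)],  # call site in foo: heapRef2 at rsp + 8
    0x401100: [(RSP, 8)],  # call site in main: heapRef1 at rsp + 8
}

roots = find_roots(memory, 0x7000, stack_map)
refs = [memory[addr] for addr in roots]
```

The same "rsp + 8" offset resolves to two different addresses because it is applied to each caller's own reconstructed rsp, which is how heapRef1 gets located even though the live registers belong to a younger frame.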

For callee-saved registers (which I think is the case you’re actually asking about), you can either a) disable their use by tweaking your copy of LLVM, or b) use the callee-saved register information generated for eh_frame to translate CSRs into their spill locations. If you want to avoid dealing with this, look at the “no_callee_saved_registers” function attribute. It’s a very blunt hammer, but you can come back later and refine.
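Option b) can be sketched as follows, under the assumption that the unwind information has already been parsed into one dictionary per frame mapping a DWARF register number to the address where that frame spilled it. If a stack map entry says a reference lives in a callee-saved register at some call site, the caller's value sits in the spill slot of the nearest callee that saved that register, or is still in the machine register if no younger frame saved it. The function name, the data layout, and the addresses are hypothetical.

```python
RBX, R12 = 3, 12  # DWARF register numbers on x86-64

def locate_csr(reg, callee_spills):
    """Find where a frame's value of a callee-saved register currently lives.

    callee_spills: one {dwarf_reg: spill_address} dict per younger frame,
    ordered from the frame's direct callee down to the youngest frame.
    """
    for spills in callee_spills:
        if reg in spills:
            # The first callee to save the register preserved the caller's value.
            return ("memory", spills[reg])
    # Nobody saved it, so it is untouched in the suspended thread's registers.
    return ("register", reg)

# main keeps heapRef1 in rbx across the call to foo.
# Case 1: foo does not touch rbx, but readStackMap spills it at 0x6FF0.
case1 = locate_csr(RBX, [{}, {RBX: 0x6FF0}])
# Case 2: foo spills rbx at 0x7018; readStackMap's later spill holds foo's value, not main's.
case2 = locate_csr(RBX, [{RBX: 0x7018}, {RBX: 0x6FF0}])
# Case 3: no younger frame saves r12, so the value is still in the register.
case3 = locate_csr(R12, [{}, {}])
```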

This is a pretty standard GC implementation technique. Any off the shelf GC you use should already be able to do this.

Philip

Hi Philip,

Could you please explain what you mean by the return PC on the stack? Is it the value in rip (the instruction pointer) after I call foo() from main()?