Figuring out return address of a call

Hi folks,

I'm trying to figure out the return address of a function in an LLVM
pass, i.e., the byte address right after the end of the call instruction
(so that I can initialize a global variable with the return address of a
function for a sanity check). Due to some other constraints, I have to
run this pass somewhere in the midend.

At a high level, I want to find the address right after a call instruction
(my main target is x86_64 for now) at runtime; see the two examples below:

100: e8 ff ff ff ff callq func
105: .marker

100: ff d0 callq *%rax
102: .marker
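
The two listings above boil down to simple arithmetic: the return address is the address of the call plus the length of its encoding. A minimal C++ sketch of that, assuming only the two encodings shown; `CallSite` and `returnAddress` are hypothetical names for illustration and not LLVM APIs:

```cpp
#include <cstdint>

// Hypothetical model of an encoded call instruction: where it starts and
// how many bytes its encoding occupies.
struct CallSite {
  std::uint64_t address;  // address of the first byte of the call
  std::uint8_t length;    // encoded length in bytes
};

// The return address is the address of the byte right after the call.
constexpr std::uint64_t returnAddress(CallSite cs) {
  return cs.address + cs.length;
}

// The two examples from the listing: a direct call (e8 + 32-bit
// displacement, 5 bytes) and an indirect call (ff d0, 2 bytes).
constexpr CallSite directCall{0x100, 5};
constexpr CallSite indirectCall{0x100, 2};

static_assert(returnAddress(directCall) == 0x105, "direct call");
static_assert(returnAddress(indirectCall) == 0x102, "indirect call");
```

The hard part in a midend pass is that neither the call's address nor its encoded length is known yet, which is why the value has to come from a symbol or label that the backend resolves.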

My approach is to find call addresses through a function pass, split the
basic block *after* the call instruction, then generate a BlockAddress
as follows:

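if (auto *CL = dyn_cast<CallInst>(&*I)) {
  // Block that currently contains the call.
  BasicBlock *callblock = CL->getParent();
  // Split right after the call; postblock begins at the next instruction.
  BasicBlock *postblock =
    callblock->splitBasicBlock(CL->getNextNode());
  // Address of postblock, intended to stand in for the return address.
  BlockAddress *retaddr = BlockAddress::get(postblock);
  ...
}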

This works well except that the BlockAddress is slightly off. I run into
the problem that during code generation, my BlockAddress is moved past
the instructions that store the return value. E.g., if the function
returns a value, %rax is first spilled somewhere and my BlockAddress
points to the end of, e.g., the movq instruction.
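
To make the mismatch concrete, here is a small C++ model of the emitted code. It is only a sketch under assumed instruction lengths; `Emitted`, `Addresses`, `layout`, and `exampleLayout` are hypothetical names, and the specific spill instruction is an illustrative guess rather than actual compiler output:

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical model of one emitted machine instruction.
struct Emitted {
  std::string text;
  std::uint8_t length;   // encoded length in bytes
  bool startsPostBlock;  // true if the split-off block's label lands here
};

struct Addresses {
  std::uint64_t returnAddress;  // byte right after the call
  std::uint64_t blockAddress;   // where the post-call block's label ends up
};

// Walk the layout starting at `base`. The first instruction is assumed to
// be the call.
Addresses layout(std::uint64_t base, const std::vector<Emitted>& insts) {
  Addresses out{0, 0};
  std::uint64_t pc = base;
  for (std::size_t i = 0; i < insts.size(); ++i) {
    if (insts[i].startsPostBlock) out.blockAddress = pc;
    pc += insts[i].length;
    if (i == 0) out.returnAddress = pc;
  }
  return out;
}

// Illustrative layout: the spill of %rax sits between the call and the
// first instruction of the post-call block.
std::vector<Emitted> exampleLayout() {
  return {
      {"callq func", 5, false},
      {"movq %rax, -8(%rbp)", 4, false},  // spill of the return value
      {"nop", 1, true},                   // first instruction of postblock
  };
}
```

With a base address of 0x100, this model puts the return address at 0x105 and the block address at 0x109, i.e. the block address is off by the length of the spill.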

Is there a better way to retrieve the address right after the call
instruction (i.e., before the return value is stored)?

Thanks,
Mathias

Hi Mathias,

I don't think you can do this using block addresses (since, as you
said, the BlockAddress will be slightly off in most cases).

I'd suggest (ab)using the patchpoint[1] intrinsic for this purpose.
Instead of calling

@foo(i32 1, float 2.0)

you'll have to instead do something like

@llvm.experimental.patchpoint(i64 <id>, i32 5, @foo, i32 2, i32 1, float 2.0)

which will get lowered to a normal 5 byte call, and to an entry in the
__llvm_stackmaps section (which will state the return PC). It may be
difficult to tie back a given __llvm_stackmaps entry to a specific
call in the IR (the ID is not sufficient since duplicated patchpoint
calls will share the same ID), but you should be able to reify
whatever information you need to associate with a given return site as
extra "live values" to patchpoint.

However, using patchpoint in mid level IR will inhibit inlining.

What are you actually trying to do with this return PC information?

> This works well except that the BlockAddress is slightly off. I run into
> the problem that during code generation, my BlockAddress is moved past
> the instructions that store arguments. E.g., if the function returns an
> argument, %rax is first spilled somewhere and my BlockAddress points to
> the end of, e.g., the movq instruction.

Btw, there is no guarantee that the store of %RAX will be the only
instruction between callblock and postblock -- I know the mid level
optimizer is conservative around blocks whose address has been taken,
but at the very least the register allocator can emit arbitrary spills
/ fills there.

[1]: http://llvm.org/docs/StackMaps.html#llvm-experimental-patchpoint-intrinsic

-- Sanjoy

Hi Sanjoy,

Thanks for the quick reply, that's very helpful.

> What are you actually trying to do with this return PC information?

I'm working on an optimized/fast shadow stack to protect against ROP
attacks. Most of the instrumentation could be done in the backend but
some of the analysis needs to be done at the midend.
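
For readers unfamiliar with the idea, a shadow stack keeps a second copy of each return address and compares it against the address actually used on return. The following C++ sketch shows only that check, as a software model; `ShadowStack`, `onCall`, and `onReturn` are hypothetical names and do not reflect the pass discussed in this thread:

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical software model of a shadow stack.
class ShadowStack {
public:
  // Record the expected return address when a call is made.
  void onCall(std::uint64_t returnAddress) { saved_.push_back(returnAddress); }

  // Check the address a function is about to return to against the most
  // recently recorded one. The entry is popped whether or not it matches.
  // Returns false on a mismatch or if nothing was recorded.
  bool onReturn(std::uint64_t actualReturnAddress) {
    if (saved_.empty()) return false;
    const std::uint64_t expected = saved_.back();
    saved_.pop_back();
    return expected == actualReturnAddress;
  }

  // Number of calls currently recorded.
  std::size_t depth() const { return saved_.size(); }

private:
  std::vector<std::uint64_t> saved_;
};
```

A mismatch reported by `onReturn` would indicate that the return address on the regular stack was overwritten, which is the situation a ROP attack relies on. This is also why the recorded value must be the exact return address and not a nearby block address.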

I feared someone would point me towards intrinsics. I'll try to either
abuse the patchpoints as you suggested (from a first glance it looks
feasible) or split my pass into two stages where I store some
information in the midend and then inject the code directly in the
backend to get around this "moving addresses" problem (which is likely
the cleaner approach). I'll have to explore what works better.

> Btw, there is no guarantee that the store of %RAX will be the only
> instruction between callblock and postblock -- I know the mid level
> optimizer is conservative around blocks whose address has been taken,
> but at the very least the register allocator can emit arbitrary spills
> / fills there.

Yes, I tried to come up with a simple example. If the function returns a
struct there's a whole bunch of spilling going on :)

Thanks again,
Mathias