Issue with inline assembly, function inlining, and position independent code


I’m running some performance experiments on an x86-64 Linux system, where I’ve modified LLVM to reserve a register, and I’d like to use that register in my code. Currently, I’m using %r12d, which is callee-saved, so I don’t need to worry about compatibility with existing libraries or system calls. For security reasons, the generated binaries need to be position-independent.

To access the register, I need to use inline assembly, either for all dependent computations or just to move the register value into a C++ variable. Since the latter results in a second register allocation, I’m doing everything in assembly. So, I have a function foo that contains some inline assembly.

The problem is that its input arguments can sometimes be pointers to constant data, and after function inlining, the result may no longer link because R_X86_64_32S relocations are not permitted in position-independent code. I need to get LLVM to introduce a load-effective-address (lea) instruction to move the pointer into a register where applicable, but I haven’t found a method that both works and is performant. I could modify LLVM locally to make this happen, but this seems to sit pretty deep in the X86 backend and doesn’t look straightforward. Here is what I’ve tried so far:

  1. Make the input argument volatile or introduce a volatile variable. Works, but adds two moves to and from the stack.
  2. Introduce a register variable equal to the input argument. Results in an error with C++17, and doesn’t seem to work otherwise.
  3. Only use inline assembly to move %r12d into a C variable. Mostly works, but the variable isn’t incremented afterwards, requires an extra register, and adds multiple extra moves.
  4. Use the local register variable extension to force a C variable to %r12d. Not supported by Clang/LLVM.
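To make the cost of option 3 concrete, here is a minimal sketch, not the code from the original post. The helper name `bump_via_r12d` is hypothetical, and the `"r12"` clobber plus the initial move are only there so the snippet is safe on a stock compiler that does not reserve the register; with a patched LLVM the register would already hold the value. The last `movl` is the extra copy (and extra register) that option 3 implies.

```cpp
#include <cstdint>

// Hypothetical helper, for illustration only.
// Loads a value into r12d, works on it in the register, then copies it
// out into an ordinary C++ variable (the "option 3" style extra move).
static inline std::uint32_t bump_via_r12d(std::uint32_t seed) {
    std::uint32_t copy;
#if defined(__x86_64__)
    asm volatile(
        "movl %1, %%r12d\n\t"  // stand-in: a reserved r12d would already hold this
        "incl %%r12d\n\t"      // computation done directly on the register
        "movl %%r12d, %0"      // extra move into a compiler-allocated register
        : "=r"(copy)
        : "r"(seed)
        : "r12", "cc");        // assumption: needed because r12 is NOT reserved here
#else
    copy = seed + 1;           // portable fallback so the sketch builds elsewhere
#endif
    return copy;
}
```

Any further arithmetic on `copy` happens in a different register, so r12d itself is not updated unless the value is moved back.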



I’m looking at your example. Why is the constraint on the pointer variable passed to the inline assembly “ri” instead of just “r”? Changing it to “r” seems to compile.

Thanks, I completely forgot about just changing the constraint.
Originally, I thought there were probably cases of integer-to-pointer
conversion, so it'd be ideal to allow encoding an integer directly as an
immediate. But this might be pretty rare in practice, so the trade-off
is probably worth it. I'll retest on the SPEC2006/2017 benchmark suites
and get back to you.
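
For readers following along, the constraint change discussed above can be sketched roughly as follows. This is an assumed, simplified stand-in for the function in the original post: `kTable` and `pass_pointer` are hypothetical names. With `"r"` the compiler has to materialize the pointer in a register before the assembly runs (in position-independent code this is typically a RIP-relative lea), whereas `"ri"` also lets it fold a constant address into the instruction as an immediate after inlining.

```cpp
#include <cstdint>

// Hypothetical constant data, standing in for the "pointer to constant data" case.
static const char kTable[] = "constant data";

// Hypothetical helper that hands a pointer to inline assembly.
// The "r" input constraint forces the operand into a register; an "ri"
// constraint would additionally allow an immediate operand.
static inline std::uintptr_t pass_pointer(const void* p) {
    std::uintptr_t out;
#if defined(__x86_64__)
    asm("movq %1, %0" : "=r"(out) : "r"(p));
#else
    out = reinterpret_cast<std::uintptr_t>(p);  // portable fallback
#endif
    return out;
}

// Call site where inlining makes the argument a known constant address.
std::uintptr_t table_address() { return pass_pointer(kTable); }
```

The trade-off is the one mentioned above: genuine integer constants converted to pointers also go through a register instead of being encoded as immediates.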