Generating completely position agnostic code

I'm on a mission to generate code that can be loaded from disk without
any modifications. This means no relocations can occur.

I'm trying to see whether this can be done for C++ code that uses the
STL but has no global variables and a single function; of course,
Clang will generate more functions for the STL code.

I want to provide an array of function pointers so that every
interaction the STL needs to have with libc goes through indirect
calls that I supply.
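
A minimal sketch of the arrangement described above, assuming a hypothetical table type `HostApi` and a hypothetical entry function `sum_squares` (neither name comes from this thread). The loaded code reaches the allocator only through pointers it receives as an argument, so it names no external symbols itself. Whether the compiler still emits relocations depends on the target and flags.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical table of host services handed to the loaded code.
struct HostApi {
    void* (*alloc)(std::size_t bytes);
    void  (*release)(void* ptr);
};

// Hypothetical entry point: no globals and no direct libc calls.
// Everything external is reached through `api`.
extern "C" int sum_squares(const HostApi* api, int n) {
    if (n <= 0) return 0;
    int* buf = static_cast<int*>(
        api->alloc(static_cast<std::size_t>(n) * sizeof(int)));
    if (buf == nullptr) return -1;
    for (int i = 0; i < n; ++i) buf[i] = i * i;
    int total = 0;
    for (int i = 0; i < n; ++i) total += buf[i];
    api->release(buf);
    return total;
}

// Host side: fill the table with the real libc functions and call in.
int call_with_libc(int n) {
    const HostApi api = { std::malloc, std::free };
    assert(api.alloc != nullptr && api.release != nullptr);
    return sum_squares(&api, n);
}
```

The host decides what goes into the table, so the same loaded function could just as well be given a custom allocator.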

Has anyone had success with such a thing in LLVM?

Qs for you:

The code that is being loaded from disk… is it wholly self-contained, or is your executable potentially made up of several pieces that each need to be loaded from disk?

What does it mean to use the STL but not have global variables? std::cout is a global variable, so you can’t even do Hello World without globals.

= = =

Architectures such as 68K and PowerPC and RISC-V have a dedicated register for accessing global variables, rather than the PC-relative globals used in other architectures. This makes them inherently more amenable to what you describe, since you can put the “array of function pointers” into global space, as part of setting up global space in general, and then load the code from disk, and go. There is no relocation needed since all access to globals is done via the global register, not relative to wherever the program was loaded. Of course, access to something like libc might normally need post-loading relocation, but if you do what you’re talking about and use an “array of function pointers” to get to libc, no relocation would be needed.

For what it’s worth, the original 68K-based Macintosh used a scheme quite similar to this. The big difference for the Mac was that to get to the OS (the equivalent of libc), it didn’t use an array of function pointers, per se; it used a certain range of illegal instructions, which generated exceptions when used, and the (highly optimized) exception handlers would recover from the exception by dispatching to an OS routine determined by the specific bits in the illegal instruction.

It is wholly self-contained. It's code that has no references to
anything beyond a set of pointers passed in as arguments to the
function. This piece of code doesn't do any OS work at all. It is
purely calling function pointers, doing math and allocating memory.

I'm not sure if you are wanting to modify LLVM to achieve your goal,
or just use the functionality that already exists. If you are willing
to make changes, there are a couple of options in the ARM backend,
-fropi and -frwpi, that are close but unfortunately don't support C++.
My understanding is that there are constant data such as vtables
containing pointers that you would need quite a bit of work to turn
into something that wouldn't require some kind of relocation. The
initial RFC has an explanation. There is a mention of a
-fallow-unsupported option to allow C++ use, but I expect that this
would only work for a subset of C++.
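
To illustrate why constant data containing pointers is the sticking point: an array of string pointers stores absolute addresses, which a loader normally has to fix up, while a table of integer offsets into one character blob stores none. This is a minimal sketch with hypothetical names (`kBlob`, `kOffsets`, `name_at`). Whether the access to the blob itself is relocation-free depends on the target and flags; on x86-64 it is typically PC-relative.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Pointer table: each element is an absolute address, so position-independent
// builds typically need one load-time relocation per entry.
//   static const char* const kNames[] = { "add", "sub", "mul" };

// Offset table: only integers are stored, so the data itself needs no fixups.
constexpr char kBlob[] = "add\0sub\0mul";          // three NUL-terminated names
constexpr unsigned short kOffsets[] = { 0, 4, 8 }; // start of each name in kBlob
constexpr std::size_t kCount = sizeof(kOffsets) / sizeof(kOffsets[0]);

// The address is computed at run time from the blob's own location.
inline const char* name_at(std::size_t i) {
    assert(i < kCount);
    return kBlob + kOffsets[i];
}
```

The same idea does not carry over to vtables directly, since the compiler generates those itself.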

I don't think that this is the same problem that you are trying to
solve here though. I'm guessing that you are providing a fixed address
libc external to the position independent code that you interface with
via a table of pointers? I have seen that done. One way of doing it
is to provide the linker with the addresses of the libc functions via
absolute symbols; the table of function pointers then uses something
like the linker's --wrap=symbol option to do the indirection.


I'd definitely like to throw my vote behind this. I've managed to
write what are effectively bare-metal BIOS bootloaders for dedicated
tasks on x86 using LLVM, but I did have to modify the x86 backend to
achieve it, and it isn't complete. Most of the work came down to
modifying the functions that handle reference classification.
Combining this with building via lld and code extraction via objcopy
allows me to do most of what I need. That said, this is from C code,
not C++; I'd be really interested to see it working with C++, but
complex static initialization seems to be a difficult problem in that

What architecture do you need this for?

Does the code in question ever use more than one thread?

Why? Implementing self relocation is not that difficult really. It's
also quite difficult to avoid unless you can make sure that no pointer
variables are statically initialized. That means no arrays of strings,
no vtables etc.
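
As a rough illustration of what self-relocation amounts to: the image carries a list of offsets at which pointer-sized slots hold image-relative addresses, and startup code adds the actual load address to each slot. This is a minimal sketch with a hypothetical layout and hypothetical names (`apply_relative_relocs`, `relocate_demo`); it is not the format of ELF or of any real loader.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Add `load_base` to every pointer-sized slot listed in `slot_offsets`.
void apply_relative_relocs(unsigned char* image, std::size_t image_size,
                           std::uintptr_t load_base,
                           const std::uint32_t* slot_offsets,
                           std::size_t count) {
    for (std::size_t i = 0; i < count; ++i) {
        const std::size_t off = slot_offsets[i];
        assert(off + sizeof(std::uintptr_t) <= image_size);
        std::uintptr_t value;
        std::memcpy(&value, image + off, sizeof value);  // slot may be unaligned
        value += load_base;
        std::memcpy(image + off, &value, sizeof value);
    }
}

// Demonstration: one slot at offset 8 holding the image-relative address 0x10.
std::uintptr_t relocate_demo(std::uintptr_t load_base) {
    unsigned char image[32] = {};
    const std::uintptr_t relative = 0x10;
    std::memcpy(image + 8, &relative, sizeof relative);
    const std::uint32_t slots[] = { 8 };
    apply_relative_relocs(image, sizeof image, load_base, slots, 1);
    std::uintptr_t fixed;
    std::memcpy(&fixed, image + 8, sizeof fixed);
    return fixed;
}
```

In a real self-relocating image, the fixup routine itself must run correctly before any fixups have been applied, which is an additional constraint this sketch does not address.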



I'd really like to not modify LLVM. What I want is conceptually using
LLVM to generate machine code with no data references outside a single
stack frame, and everything else is using pointers. This is merely a
snippet of assembly code that will be invoked by other full C++

Because I don't think it is required. All I want to do is generate
machine code for a single function that has all its references coming
in as pointers on the stack and it's just calling functions and doing
math. I'm literally asking LLVM to convert text in the form of C to x86

Maybe I'll paste LLVM IR to make it clear what I'm asking for.

Ah, good. Well, maybe.

The reason I asked, originally, was that if you can use thread-local globals rather than normal globals, then all references to globals will be through a dedicated register, and it’s generally easy to set up that register in your entry point, since you say you only have the one entry point.

On x86 it’s a bit more difficult, since it uses a dedicated segment register for thread-local variables… not just a regular register. And it tends to want to check to see if the thread-local variables are set up in each routine, before using them. And it still generates relocations.

So on x86 I’d consider using a different approach: a dedicated register of your choice, e.g.:

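struct globs {
    char flag;
};

// GNU extension: a global register variable needs the `register` keyword.
register globs *gp asm("rbx"); // storing global variable in register explicitly

void f(char *cp) {
    gp->flag = *cp; // every access to globals goes through the register
}

int entry_point(globs *globs_ptr, char *cp) {
    gp = globs_ptr; // set up access to our globals
    f(cp);
    return gp->flag;
}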

See for generated code.

Of course, this still means you have to modify the STL (and whatever other libraries you might want to call) to use this global register to access all its global variables… but it beats having to modify every routine to add an extra parameter. But I haven’t been able to think of any approach that doesn’t have that problem, and with macros (#define cout gp->cout) it might not be that difficult. And at least this way, no relocations are emitted.