I am trying to use the LLVM libraries to do register allocation on LLVM IR code – and output IR as the result.
There are two problems that arise when we try this :
a. The LLVM backend requires that one goes through all the steps sequentially namely
– Instruction selection
– Scheduling and Formation
– SSA-based machine code optimizations
– Register allocations
– Code emission
Is it possible to emit IR from the 1st 3 stages and then do register allocation on it ?
Normally, we would emit assembly based on the machine/ISA specifications during
b. I have llvm IR in the form of a DAG already. This was obtained by using the llvm;;parseIRFile
function. I am not sure how to provide this ISA as an input to the backend phases as they
seem to accept all kinds of other objects.
Are there any llvm functions that accept a DAG as input so that we can do register allocation on it
Note : This might look like a weird thing to do, but I want to do simulations on IR and getting a
register-allocated IR is useful for that purpose.
llvm uses three different representations until machine code is emitted:
- the llvm language as specified in the llvm manuals, we usually call that IR
- the selection DAG
- machine code, which is often called MI
llvm currently only has infrastructure to serialize the first one. Register allocation only makes sense on the MI representation (before that we have no knowledge about register class or constraints, let alone which machine instructions will be used). Unfortunately MI cannot be serialized at the moment, although I think Alex Lorenz is currently working on adding this.
Thanks. I will also work on doing an SSA register allocation that returns SSA form (IR), since it is not yet implemented.
It’s not implemented because it doesn’t really make sense as a concept. Register allocation is all about making use of a finite set of registers, spilling values to memory if they don’t fit. In SSA form, you have an infinite number of registers and (more importantly) you can only assign to each register once, so there is no way of spilling from a register and then using that register for something else.
To implement register allocation in LLVM IR, you would need IR not to be SSA, and then it wouldn’t be LLVM IR.
Having worked on SSA register allocators in the past I have to say that SSA is actually a good fit for register allocation. However LLVM IR is indeed not. You don't have any target instructions or register classed/constraints. It wouldn't make much sense to designate registers to llvm IR values nor is there a way to express that in IR. llvm has the machine instruction (MI) representation for that.
If I got it right, the intention here is to do simulations in IR. I'm
assuming this is some form of cost function that instead of executing,
would try to predict what would happen if it did run, without running.
For that, doing some form of register allocation in IR can actually be
beneficial from the register pressure point of view.
You could even do that in a generic way by just passing the ABI
register allocation first (what we do during lowering/legalization)
and then just letting the simulator know how many GPRs there are, and
the simulator can then estimate register pressure in basic blocks or
functions by using a poor-man's version of liveness analysis on the
remaining SSA registers in the sub-graph.
While this would be an interesting project, I can't see this being an
integral part of the LLVM codebase, as it would mess up the IR beyond
recognition by current back-ends.
Matthias and David,
Thanks for the insight into the problem at hand
Yes, generally, we would do the reg alloc on MI because it has meaning on an actual architecture we specified. For the project I’m working on, I’m intentionally not using MI, as I need to find out whether it is possible to evaluate code on different architectures, using IR simulations.
Since it is not possible to return SSA form after doing reg-alloc on the IR, I’ll be satisfied with a non-SSA IR, or if you will, NS-IR. This won’t be LLVM IR but I can modify my IR simulator and use it nonetheless.
In my simulations, I am actually running the IR, rather than only evaluating cost function. I don’t want to go too much into the details currently because I am hoping to get some results and publish them
I don’t know if it will be useful for the LLVM codebase yet. If the weird IR proves to have useful application, then maybe that is possible.
In my simulations, I am actually running the IR, rather than only evaluating
cost function. I don't want to go too much into the details currently
because I am hoping to get some results and publish them
Right, then I fail to see the use... I'll wait for your publication.
I don't know if it will be useful for the LLVM codebase yet. If the weird IR
proves to have useful application, then maybe that is possible.
We tend not to proliferate the number of internal representations in
tree, but having it as a separate project (ex. in github) is entirely
In my simulations, I am actually running the IR, rather than only
evaluating cost function. I don't want to go too much into the details
currently because I am hoping to get some results and publish them
Are you familiar with lli? It is able to interpret, or "run" IR without doing register allocation (in addition to being able to JIT things). I'm not sure how close that is to the problem you're actually trying to solve, but there might be some infrastructure there you could leverage.