Thanks a lot for such an elaborative answer!
I'm implementing some variations of graph-coloring register allocators for
LLVM. Many of them perform their phases (e.g. coalescing, graph
simplification, spilling, color selection) in an iterative way. Since
LLVM provides an implementation of the coalescing in the
SimpleRegisterCoalescing class already, I would like to reuse it (even
though I could of course create my own coalescing implementation). But
this class is a MachineFunctionPass.
I'm wondering, if it is possible to use this pass from another pass
(i.e. regalloc) in an iterative way:
I need to invoke it multiple times for the same MachineFunction at
certain places in my regalloc pass implementation.
May be some sort of analysis update can be performed? Or is it
designed to be used just once per MachineFunction?
I did this a while back when implementing iterative register coalescing. My
guess is you're trying to do the same with the Generaslized Graph Coloring
You are absolutely right.
It's not easy to accomplish within the LLVM framework. I've made a number of
structural changes to our code here and it's in a bit of a state of bitrot due
to upstream changes we haven't merged in yet.
I got some of this work merged upstream in the RegisterCoalescer and
RegallocQuery interfaces. RegallocQuery is supposed to be an opaque
communication conduit between register allocators and coalescers such that
the allocator can update the coalescer when it makes changes and vice versa.
Yes. Initially I didn't notice RegallocQuery in the header files. But
after Even pointed out that it can be used, I had a closer look at it.
When I did the full implementation I found I had to make more changes to
Yes. This was exactly my conclusion. Namely, it would require quite
some work to implement a coalescer this way using this interface.
Too many pieces are missing yet.
I have not pushed those upstream for various reasons,
including that at the time the register coalescing code was in a great state
of flux and it was difficult to track. It will be a bit of work to get it all
up-to-date for merging.
Evan is right that coalesceFunction is the primary interface to invoke the
coalescer. The iterative coalescer I wrote implements that function. The
iterative coalescer is also a FunctionPass which does nothing. It's simply
there to fulfill the requirement that the register allocator depends on a
coalescer. This way I could use the SimpleRegisterCoalescer or the new
iterative coalescer in a pluggable manner.
Yes. Using coalescers in a pluggable manner would be the ideal solution.
The generalized graph coloring
allocator holds a reference to the coalescer. SimpleRegisterCoalescing does
nothing in its implementation of coalesceFunction.
The idea with the empty pass is really nice, but looks a bit like a hack
This is all very far from ideal.
What we really need is a PassManager that
knows about multually dependent, iterative algorithms. I thought about this a
bit back in the day and came up with the idea of a hierarchy of PassManagers.
A set of iterative passes could live in a PassManager under the top-level
PassManager and the child PassManager could specify dependencies for all of
the passes it manages. I didn't get very far with this so I don't know it
it's a viable solution.
I was also thinking about something of this kind. But I was too lazy
to come up with a practical solution for it.
The iterative coalescer does its own correctness and profitability analysis
but reuses the code from SimpleRegisterCoalescing to do the actual LLVM IR and
dataflow updates. That's non-trivial work.
The code to do this in SimpleRegisterCoalescing is a bit...well...convoluted.
I'd say that the whole SimpleRegisterCoalescing is a
It does a very, very good job doing aggressive coalescing (and taking
into account subtle subregs related things and updating the
LiveIntervals information), but IMHO it became so complex that it is
almost unmaintainable. I've seen different coalescers for graph
coloring (Chitin, Chaitin-Briggs, iterated register coalescing,
chordal graph based ones) and linear scan (Wimmer-based) register
allocators. But SimpleRegisterCoalescing in LLVM (I particulary like
"Simple" in its name is by far the most complex one I've seen so
far. I think, only Evan can completely understand it by now. And
there are still subtle bugs in it being found every week or two and
fixed in the mainline. It is not a very good sign...
Not that I have a constructive solution for it, but may at some point
it should be split into independent parts that are more
understandable? Or may be it should be even completely redesigned to
become more understandable?
I did a significant amount of work to separate that code out from the dataflow and
profitability analysis that determines when to coalesce copies.
This I think is very valuable work and I'd like to push it back. It's always
good to separate analyses and updates as much as possible so that things can
But I can't promise that I'll be able to merge all of the work I've done. We
have quite a bit going on here right now and I've been moved to other bits of
work. I'll see if I can get some things accomplished out-of-band.
It would be very good, if you could push this code back to LLVM and
make it available for others.
I hope very much that you'll find some time for it!
On my side, after looking at alternatives, I decided to use my own
custom coalescers for now. They are small, clean and do what I need.
And I have even a limited reuse among the three GC-based register
allocators I'm working on (Chitin, Chitin-Briggs, iterated register
coalescing). It is not an ideal solution; but for now I'm more
concerned about the quality and speed of my register allocators. Once
I'm happy with that, I'll probably have a closer look and refactor the
code to make it more reusable as a part of a framework.