Support for out-of-tree backend passes?

Hi all,

I've been doing some LLVM development recently, and was curious about
the status/feasibility of allowing developers to write out-of-tree
back-end passes (e.g. `MachineFunctionPass`es) in a manner similar
to middle-end passes.

From the limited resources I can find online[1][2][3], LLVM currently
doesn't support building back-end passes outside of the source tree.
Could anybody more familiar with the back-end explain the technical
hurdles/challenges involved in supporting out-of-tree back-end
pass builds? Similarly, would work in the direction of supporting those
builds be welcome? If so, it's something I'd like to look at.

Best,
William Woodruff
william@yossarian.net

[1]: https://github.com/samolisov/llvm-experiments/tree/master/llvm-backend-x86-machine-passes
[2]: http://aviral.lab.asu.edu/llvm-writing-a-backend-pass/
[3]: (Thread) https://lists.llvm.org/pipermail/llvm-dev/2015-November/092030.html

> From the limited resources I can find online[1][2][3], LLVM currently
> doesn't support building back-end passes outside of the source tree.

I think that's right.

> Could anybody more familiar with the back-end explain the technical
> hurdles/challenges involved in supporting out-of-tree back-end
> pass builds?

One issue is that backend passes are usually pretty order-critical.
In-tree ones have a series of hooks where the target can add its
passes (e.g. addPreSched2, ...). As long as you solve that (maybe via
a new method on MachineFunctionPass?), I imagine the C++ effort in
LLVM would mostly be copy/pasting from the IR-level infrastructure
into the relevant parts of lib/CodeGen/TargetPassConfig.cpp and llc
(and Clang?).
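
To make the ordering point concrete, here is a toy sketch of a
pipeline with fixed stages and named hook points. It is plain C++, not
LLVM code: `ToyPipeline`, `Hook`, `registerAt` and the stage names are
all made up for illustration, and the hook names only loosely mirror
the in-tree ones (addPreSched2 and friends). The real pipeline has many
more stages.

```cpp
#include <map>
#include <string>
#include <utility>
#include <vector>

// Toy model only: none of these types exist in LLVM.
enum class Hook { PreRegAlloc, PostRegAlloc, PreSched2, PreEmit };

struct ToyPipeline {
  // Extra passes, keyed by the hook point they asked to run at.
  std::map<Hook, std::vector<std::string>> Extra;

  void registerAt(Hook H, std::string Name) {
    Extra[H].push_back(std::move(Name));
  }

  // Builds the final pass order: fixed stages interleaved with hooks.
  std::vector<std::string> build() const {
    std::vector<std::string> Order;
    auto runHook = [&](Hook H) {
      auto It = Extra.find(H);
      if (It != Extra.end())
        Order.insert(Order.end(), It->second.begin(), It->second.end());
    };
    Order.push_back("isel");
    runHook(Hook::PreRegAlloc);
    Order.push_back("regalloc");
    runHook(Hook::PostRegAlloc);
    runHook(Hook::PreSched2);
    Order.push_back("post-ra-sched");
    runHook(Hook::PreEmit);
    Order.push_back("emit");
    return Order;
  }
};
```

A pass registered with `registerAt(Hook::PreSched2, "my-pass")` ends up
between "regalloc" and "post-ra-sched". An out-of-tree pass would need
some equivalent way to name its slot.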

The biggest difference and problem I see would be building the thing,
since the target's headers are going to be needed, but they're
private. That means they're not shipped with LLVM so you'd need the
source (and an active build directory for the TableGenerated files,
lib/Target/XYZ/XYZGen*.inc), which might make the whole project moot
(depending on your reasons for doing it).

It would also probably add a bunch of CMake magic to the project,
which is getting uglier the more I think about it (mixing build &
source content? "make super-install"?).

> Similarly, would work in the direction of supporting those
> builds be welcome? If so, it's something I'd like to look at.

I'm afraid I'm not sure there, and it'd be good to get more input.

My initial thought is that it sounds pretty invasive for a feature
with limited applicability, but I'm just one cranky individual
imagining proprietary obfuscation passes and other things I have no
business kibitzing about.

Cheers.

Tim.

Yeah, that's a problem. My particular use case is mostly
target-independent, so I hadn't thought about that.

Well, I'll do some fiddling and see if anything sticks.
Thanks for the response!

William

I think another source of difficulty is specifying which representation
the pass runs on. Don't forget that the back end lowers LLVM IR to
SelectionDAG, then the DAG is instruction-selected and converted into
SSA MIR, then register allocation and rewriting take it out of SSA, and
finally it lowers to the MC layer. It would be ambiguous to invoke

    llc -my-pass

unless we restrict the passes to IR->IR ones that run before
CodeGenPrepare and generally impose the same requirements on them as
exist in the middle end.
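
As a rough illustration of the ambiguity, here is a toy sketch in which
a pass has to declare the representation it runs on before a driver can
place it. It is plain C++, not LLVM code: `Repr`, `ToyPass` and
`placePass` are hypothetical names, and the five stages are a
simplification of the lowering sequence described above.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Toy model only: these are not LLVM types.
enum class Repr { Unspecified, IR, SelectionDAG, SSAMIR, PostRAMIR, MC };

struct ToyPass {
  std::string Name;
  Repr RunsOn; // the representation the pass claims to operate on
};

// The lowering order, front to back.
inline const std::vector<Repr> &loweringOrder() {
  static const std::vector<Repr> Order = {
      Repr::IR, Repr::SelectionDAG, Repr::SSAMIR, Repr::PostRAMIR, Repr::MC};
  return Order;
}

// Returns the position in the lowering order where the pass could be
// scheduled, or -1 when the pass does not say what it runs on.
int placePass(const ToyPass &P) {
  const std::vector<Repr> &Order = loweringOrder();
  for (std::size_t I = 0; I < Order.size(); ++I)
    if (Order[I] == P.RunsOn)
      return static_cast<int>(I);
  return -1; // ambiguous: a bare "llc -my-pass" lands here
}
```

A pass that declares `Repr::IR` gets slot 0, while one left as
`Repr::Unspecified` cannot be placed at all, which is the situation a
bare command-line flag would create.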

Excellent points. My use case does indeed involve just IR->IR,
but there's not much point in exposing the back-end if more
diverse cases don't benefit from it.

In the meantime, I've been successfully implementing what I want
with a combination of a middle-end pass for metadata and an in-tree
back-end pass for analysis. Thanks to all for answering my questions!

William