[RFC] Generalize out-of-tree pass support

Hi folks,

I've been working for a few months on a proposal to make it easier to develop
out-of-tree passes, and have them linked either statically or dynamically,
within LLVM. This includes automatic integration within clang, opt and bugpoint.

The goal is to lower the bar for people who develop out-of-tree passes: they can
maintain their code base in a third-party repo, pick it up at config time and
build it statically or dynamically.

As a side effect, this provides a generalization of the Polly-specific code
that is spread across several locations of the code base (and removes most of
the explicit mentions of Polly itself, without removing the functionality,
obviously).

Both legacy and new pass managers are supported.

From the review's documentation:

    LLVM provides a mechanism to automatically register pass plugins within
    ``clang``, ``opt`` and ``bugpoint``. One first needs to create an independent
    project and add it to either ``tools/`` or, using the MonoRepo layout, at the
    root of the repo alongside other projects. This project must contain the
    following minimal ``CMakeLists.txt``:

    .. code-block:: cmake

        add_llvm_pass_plugin(Name source0.cpp)

    The pass must provide two entry points for the new pass manager, one for static
    registration and one for dynamically loaded plugins:

    - ``llvm::PassPluginLibraryInfo get##Name##PluginInfo();``
    - ``extern "C" ::llvm::PassPluginLibraryInfo llvmGetPassPluginInfo() LLVM_ATTRIBUTE_WEAK;``

    Pass plugins are compiled and linked dynamically by default, but it's
    possible to set the following variables to change this behavior:

    - ``LLVM_${NAME}_LINK_INTO_TOOLS``, when set to ``ON``, turns the project into
      a statically linked extension

The review also contains an example pass plugin, `llvm/examples/Bye/Bye.cpp`.

The associated review is D61446, "Generalize the pass registration mechanism
used by Polly to any third-party tool". Michael Kruse has already done a lot of
review, and I think it's time to gather more feedback, so if you're
interested... jump in!

Hello Serge,

Thank you for doing this - that's a lot of great work that makes LLVM
plugin registration very straightforward to use. The implementation and
the interface make a lot of sense! I've noticed other people on the
mailing list asking for similar functionality, so there's definitely
need for this sort of infrastructure. I do hope that it lands in LLVM
sooner rather than later.

I have a few questions/remarks:
1. As far as I can tell, this mechanism relies on the plugin being
located somewhere within the LLVM tree so that everything happens within
one CMake run. Otherwise the pass registration won't work - at least in
the case of static linking. IMHO this is fine - AFAIK it's not something
that could easily be worked around.

2. Does your patch make `PLUGIN_TOOL` obsolete (e.g.
https://github.com/llvm-mirror/llvm/blob/master/lib/Transforms/Hello/CMakeLists.txt#L18)?
AFAIK `PLUGIN_TOOL` is only needed to enable dynamically loaded plugins
on Windows, but from what I can tell you achieved that without using it?
(sadly I've not been able to test your patch on Windows)

3. At the coming LLVM Dev Meeting I'll be presenting a tutorial on
writing LLVM passes for beginners and I'd like to advertise this patch.
Is that OK with you? I think that this is really worthwhile sharing with
the wider community.

Thanks again for working on this,
-Andrzej

  1. As far as I can tell, this mechanism relies on the plugin being located somewhere within the LLVM tree so that everything happens within one CMake run. Otherwise the pass registration won’t work - at least in the case of static linking. IMHO this is fine - AFAIK it’s not something that could easily be worked around.

One can use LLVM_ENABLE_PROJECTS and LLVM_EXTERNAL_${project}_SOURCE_DIR, so this is fully parametric.
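
As a rough illustration of that parametrization, here is a minimal sketch of a CMake initial-cache script (loaded with `cmake -C`). The project name `MyPlugin`, the file name and the paths are hypothetical. The sketch uses `LLVM_EXTERNAL_PROJECTS`, which LLVM's CMake documentation pairs with `LLVM_EXTERNAL_<NAME>_SOURCE_DIR`. Whether this is the exact combination meant above, and that the project name is upper-cased in the variable name, are assumptions.

```cmake
# my-plugin-cache.cmake (hypothetical file name)
# Load with: cmake -C my-plugin-cache.cmake /path/to/llvm-project/llvm

# Assumption: register the plugin as an additional external project...
set(LLVM_EXTERNAL_PROJECTS "MyPlugin" CACHE STRING "")

# ...and tell the LLVM build where its sources live (placeholder path).
set(LLVM_EXTERNAL_MYPLUGIN_SOURCE_DIR "/path/to/MyPlugin" CACHE PATH "")
```

With something like this, the plugin sources can stay in a separate repository while still being configured in the same CMake run as LLVM.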

  2. Does your patch make PLUGIN_TOOL obsolete

I’ll double-check and update the patch accordingly if need be. At first glance I’d say I’m generalizing PLUGIN_TOOL, but I missed the error message for unsupported platforms.

  3. At the coming LLVM Dev Meeting I’ll be presenting a tutorial on writing LLVM passes for beginners and I’d like to advertise this patch

Good :)

I have a few questions/remarks:
1. As far as I can tell, this mechanism relies on the plugin being
located somewhere within the LLVM tree so that everything happens within
one CMake run. Otherwise the pass registration won't work - at least in
the case of static linking. IMHO this is fine - AFAIK it's not something
that could easily be worked around.

The plugin has a dual use: when LLVM_<PLUGIN>_LINK_INTO_TOOLS is not
set, it can still be loaded using the -load mechanism.

2. Does your patch make `PLUGIN_TOOL` obsolete (e.g.
https://github.com/llvm-mirror/llvm/blob/master/lib/Transforms/Hello/CMakeLists.txt#L18)?
AFAIK `PLUGIN_TOOL` is only needed to enable dynamically loaded plugins
on Windows, but from what I can tell you achieved that without using it?
(sadly I've not been able to test your patch on Windows)

PLUGIN_TOOL must be used to allow plugins such as "Hello" to link
against symbols of an executable. On Windows, each imported symbol needs
to know which library it is imported from. "Hello" names the executable
opt as the source of its symbols; consequently it cannot be imported into
other executables such as bugpoint, llvm-reduce or clang. In addition
to using it, one must also pass -DLLVM_EXPORT_SYMBOLS_FOR_PLUGINS to cmake
to make the executable export those symbols. However, Windows DLLs
have a limit of 2^16 exported symbols and LLVM exceeds that by far.
Therefore there is a utility script, extract_symbols.py, to limit the
exports to those that are most probably used. However, I could not
(yet) make this work with Polly; there are always symbols missing.
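
For reference, a declaration in the style of the in-tree "Hello" example looks roughly like the sketch below. It is written from memory of the LLVM tree of that time, so treat the exact arguments as an assumption rather than a copy of the file.

```cmake
# Sketch of a dynamically loaded plugin declared in the style of "Hello".
add_llvm_library(LLVMHello MODULE BUILDTREE_ONLY
  Hello.cpp

  DEPENDS
  intrinsics_gen

  # Resolve the plugin's LLVM symbols against the `opt` executable only.
  PLUGIN_TOOL
  opt
  )
```

The `PLUGIN_TOOL opt` part is what ties the plugin to a single executable, which is why the same module cannot be loaded into bugpoint or clang.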

But it doesn't even work on Linux, since the linker only adds symbols
(from static libraries) that are required within the executable. So if
a plugin needs one that was not added to the executable, the plugin is
out of luck. IMHO the concept of dynamically linking to symbols in
executables is pretty broken. LTO will worsen the problem, since it
will inline public symbols and remove the ones unused inside the
executable.

tl;dr: They are different: D61446 for statically linking a plugin,
PLUGIN_TOOL is for dynamically linking a plugin.

Michael

PLUGIN_TOOL must be used to allow plugins such as "Hello" to link
against symbols of an executable. On Windows, each imported symbol needs
to know which library it is imported from. "Hello" names the executable
opt as the source of its symbols; consequently it cannot be imported into
other executables such as bugpoint, llvm-reduce or clang. In addition
to using it, one must also pass -DLLVM_EXPORT_SYMBOLS_FOR_PLUGINS to cmake
to make the executable export those symbols. However, Windows DLLs
have a limit of 2^16 exported symbols and LLVM exceeds that by far.
Therefore there is a utility script, extract_symbols.py, to limit the
exports to those that are most probably used. However, I could not
(yet) make this work with Polly; there are always symbols missing.

Cheers for clarifying - makes sense. I've tried mixing PLUGIN_TOOL and
the new pass manager and I was also getting errors due to missing
symbols. That mechanism is a bit fragile and ingenious at the same time.

tl;dr: They are different: D61446 for statically linking a plugin,
PLUGIN_TOOL is for dynamically linking a plugin.

Right, thank you, all clear!

-Andrzej

Hi Folks,

Here is an update on the "compiler extension" topic.
The main patch landed last week and has received a number of updates since. It’s almost complete, so let me describe it in more detail,
based on a thread [0] on cfe-commits@lists.llvm.org.

What’s this patch all about?

It proposes a common infrastructure to write LLVM pass plugins that are linked dynamically (a.k.a. MODULE in CMake terminology) or statically
(just a STATIC library) and have them work gracefully within opt/clang/bugpoint.
Polly already had several hooks to provide this behavior. As such, this patch is a generalization of that approach.

What problem does it solve?

Dynamically loaded plugins are an easy way to jump into LLVM, but they are not natural to use for end users: no one wants to type

clang -Xclang -load -Xclang /clang/install/path/…/MyPlugin.so my_source.c

on a regular basis.

The next step is generally to write a regular LLVM pass and register it in the clang / opt pipelines, which involves a few modifications to the original tree.
With this patch, one can write a top-level project, say StuffyDoll, that is separate from the llvm-project sources, and use the CMake option

-DLLVM_ENABLE_PROJECTS="clang;StuffyDoll"

to have it compiled alongside LLVM, then use the

-DLLVM_STUFFYDOLL_LINK_INTO_TOOLS=[ON|OFF]

option to switch between the module or static build. When using the static build, the appropriate clang/opt/bugpoint hooks are automatically installed,
which means no modification outside of StuffyDoll.
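
To make the two options concrete, here is a sketch of a CMake initial-cache script for the hypothetical StuffyDoll project, assuming its directory sits at the root of the llvm-project checkout next to clang. The file name is made up. The script uses `LLVM_ENABLE_PROJECTS`, the LLVM option for selecting projects, and the `LLVM_STUFFYDOLL_LINK_INTO_TOOLS` variable mentioned above.

```cmake
# stuffydoll-cache.cmake (hypothetical file name)
# Load with: cmake -C stuffydoll-cache.cmake /path/to/llvm-project/llvm

# Build clang and the StuffyDoll project in the same CMake run.
set(LLVM_ENABLE_PROJECTS "clang;StuffyDoll" CACHE STRING "")

# ON:  link StuffyDoll statically into clang/opt/bugpoint.
# OFF: build it as a dynamically loadable module instead.
set(LLVM_STUFFYDOLL_LINK_INTO_TOOLS ON CACHE BOOL "")
```

Switching between the static and module builds should then only require flipping the second variable.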

How should I use it?

Probably by reading the in-tree example [1], after reading the documentation [2] :)

Is it working for all platforms?

It’s fully functional on Linux. It needs https://reviews.llvm.org/D72493 to be accepted [3] to fully work on OSX, and on Windows it only works for static builds.

[0] http://lists.llvm.org/pipermail/cfe-commits/Week-of-Mon-20200106/300726.html
[1] https://github.com/llvm/llvm-project/tree/master/llvm/examples/Bye
[2] http://llvm.org/docs/WritingAnLLVMPass.html#building-pass-plugins
[3] If you’ve read this far, you might be interested enough to participate in the review ;)