Summary
I would like to clean up how TableGen tools are linked by:
- Adding the LLVMTableGen library to the shared library (libLLVM-*.so) build if enabled
- Linking tablegen tools other than llvm-tblgen dynamically against libLLVM-*.so in builds where this is enabled
I have a stack of changes towards this goal, ending in D138278 (TableGen: honor LLVM_LINK_LLVM_DYLIB by default). Please review or give feedback: in my experience, this is a dark but important corner for which it is typically difficult to get reviews.
Rationale
Most tablegen tools link against both LLVMSupport and some project-specific support library, like clangSupport or the MLIR PDLL implementation. Those project-specific libraries also tend to link against LLVMSupport directly or indirectly for obvious reasons.
In an LLVM_LINK_LLVM_DYLIB=ON build, the result is that tablegen tools link against LLVMSupport both statically and dynamically, via two different paths:
- tablegen tool → LLVMSupport (direct path, static)
- tablegen tool → project support library → LLVMSupport (indirect path, dynamic)
On some toolchains, the tablegen tool process ends up with duplicated global variables from LLVMSupport, which unsurprisingly leads to bugs. (It’s actually more surprising that it hasn’t led to bugs earlier.) So ultimately, the goal here is to make our linker story more robust.
Now, Chesterton’s fence: Why have tablegen tools been linked statically so far? I believe the answer is that llvm-tblgen must be linked statically (to avoid a circular build dependency) and then other tools just copied whatever llvm-tblgen did without revisiting that part of it. There simply was no need to revisit this until quite recently.
Then there is the argument, documented in one of the cmake files, that LLVMTableGen is an internal library, so there is no need to include it in libLLVM-*.so. The extensive and good use of tablegen in MLIR proves that this stance has become outdated. There are certainly heavy users of tablegen outside of core LLVM itself, which suggests that downstream use of tablegen is very much in the cards as well. (I’m somewhat biased because I happen to have such a project.)
The same cmake file also documents that LLVMTableGen wasn’t included to avoid polluting the command-line option namespace, but that’s easily addressed with an established pattern for registering options explicitly, which one of my patches does.
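To illustrate what registering options explicitly means in general, here is a minimal, self-contained sketch. It is not the code from the patch and does not use LLVM's real command-line library: `OptionRegistry`, `registerExampleTableGenOptions`, and the option names are all hypothetical stand-ins. The point is only that options get added when a tool calls a function, not as a side effect of a library being loaded.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Hypothetical stand-in for a process-wide option table. This is NOT the
// LLVM cl:: API; all names in this sketch are invented for illustration.
class OptionRegistry {
public:
  static OptionRegistry &instance() {
    static OptionRegistry registry;
    return registry;
  }

  // Returns false if an option with this name already exists.
  bool add(const std::string &name, const std::string &help) {
    return options.emplace(name, help).second;
  }

  bool has(const std::string &name) const { return options.count(name) != 0; }
  std::size_t size() const { return options.size(); }

private:
  std::map<std::string, std::string> options;
};

// Explicit registration: loading the library adds nothing to the registry.
// Only a tool that calls this function gets the (hypothetical) options.
// Repeated calls are harmless because of the guard.
void registerExampleTableGenOptions() {
  static bool registered = false;
  if (registered)
    return;
  registered = true;

  bool ok = OptionRegistry::instance().add("example-output", "Output file");
  ok = OptionRegistry::instance().add("example-include-dir",
                                      "Include directory") && ok;
  assert(ok && "option name already taken");
  (void)ok; // Silence unused-variable warnings when NDEBUG is defined.
}
```

In this sketch, a tablegen-style tool would call `registerExampleTableGenOptions()` at the start of `main`, while any other program that happens to load the same shared library never sees these options.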
Finally, you may ask about native builds. Tablegen tools are also built as part of a recursive cmake call for cross compilation and for non-Release builds with LLVM_OPTIMIZED_TABLEGEN=ON. Those native builds should be kept small: they shouldn’t build the full libLLVM-*.so. However, that is not a problem because the recursive native builds always use the default Release build, which has LLVM_LINK_LLVM_DYLIB=OFF.
Alternatives
The alternative option that I started with was to create duplicate versions of support libraries: one meant for dynamic links and one meant for static links. This actually landed for clangSupport in D134637 (clang-tblgen build: avoid duplicate inclusion of libLLVMSupport), commit dce78646f07f. However, the same type of issue also exists in MLIR land and affects many more libraries (mostly around PDLL and LSP). The duplication of libraries quickly got out of hand: I spent more time trying and failing to make this alternative work cleanly than I have spent so far on the proposed solution.
In general, I think the option that I am advocating for is clearly better because it fixes the underlying problem by making our build system simpler. Fixing an issue by making code simpler always makes me happy.