TL;DR: I am trying to combine two projects that each have their own out-of-tree dialects and corresponding Python bindings, currently with Bazel. I am running into the problem that the symbols of some static variables are defined multiple times, such as the TypeIDs of various dialects, ops, etc., which then exist several times at runtime and thus cause problems. I know a solution based on shared libraries but struggle to implement it in Bazel.
I have posted a more detailed question on StackOverflow but the essence is that (1) the symbols of Python extensions aren’t visible from different extensions (even if they have public linker visibility) and (2) symbols from common object files that are linked statically into multiple extensions exist in all of them. The two combined lead to multiple instances of static variables at runtime.
A solution that seems to work is to have the symbols of all static variables live in separate shared libraries that the extensions share. In that case, all Python extensions that link to the same set of shared libraries see the same set of instances of static variables.
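As a minimal sketch of that layout in Bazel terms: all target and file names below are hypothetical, and pybind_extension from pybind11_bazel is assumed for building the extensions. It only illustrates the intended shape (one shared library that owns the static variables, several extensions linking against it) and does not by itself deal with statics that live in transitive dependencies.

```starlark
# BUILD.bazel (sketch; every target and file name here is hypothetical)
load("@pybind11_bazel//:build_defs.bzl", "pybind_extension")

# The single shared library that owns all static variables (TypeIDs etc.).
cc_binary(
    name = "libSharedDialects.so",
    linkshared = True,
    deps = [
        # Assumed to be alwayslink "objects" targets of both projects.
        ":DialectACAPIObjects",
        ":DialectBCAPIObjects",
    ],
)

# Wrapper so that other C++ targets can link against the shared library.
cc_library(
    name = "SharedDialects",
    srcs = [":libSharedDialects.so"],
)

# Each extension sees only headers plus the shared library, so it carries
# no copy of the static variables itself.
pybind_extension(
    name = "_dialect_a",
    srcs = ["DialectAModule.cpp"],
    deps = [
        ":DialectACAPIHeaders",
        ":SharedDialects",
    ],
)

pybind_extension(
    name = "_dialect_b",
    srcs = ["DialectBModule.cpp"],
    deps = [
        ":DialectBCAPIHeaders",
        ":SharedDialects",
    ],
)
```

In such a setup, the shared library also has to be findable by the dynamic loader when Python imports the extensions (for example through an rpath or by shipping it next to them), which the sketch leaves out.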
The question is how to organize the build system to produce that. My question on StackOverflow contains details about why this isn’t trivial with Bazel.
For the MLIR C API, there is a solution: there is a custom rule called mlir_c_api_cc_library that defines (1) a target for “normal” consumers, (2) a *Header target for only the headers that the Python extensions would use, and (3) a *Objects target that only the (single instance of the) shared libraries would use. For example, the CAPIInterfaces target looks like this:

    mlir_c_api_cc_library(
        name = "CAPIInterfaces",
        srcs = [
            "lib/CAPI/Interfaces/Interfaces.cpp",
        ],
        capi_deps = [
            ":CAPIIR",
        ],
        includes = ["include"],
        deps = [
            ":IR",
            # ...
        ],
    )

With this rule, the source files from the srcs argument and the files from the (transitive) capi_deps argument exist in the *Objects target but not in the *Header target. However, the source files of any target that is pulled in via deps either exist in all targets (if it has been defined with alwayslink, see my SO post for details) or aren’t exported by the shared libraries that depend on that target. This affects the symbols in the :IR target: by default, they aren’t exported, and if I set the target to alwayslink, they are exported in all extensions and exist several times at runtime.
The only solution that I am currently aware of that solves this problem technically is to apply the mlir_c_api_cc_library rule to all transitive dependencies of MLIR, including LLVM (among potentially other things, command line options are defined in static variables, and I ran into runtime issues with those being defined twice). This doesn’t sound very realistic, or is at least highly non-trivial. Among other things I might not have thought about, it would require changing hundreds or thousands of targets more or less manually and convincing a lot of the people involved.
Before I consider embarking on that mission: is this really the only solution? Is there potentially a workaround that unblocks me on my current project while we look for a more sustainable solution?