Help needed: Installing and Releasing Python-based MLIR Projects

I’d like to identify stakeholders, and hopefully volunteers, who are the right people to involve for comment/review/work on topics around installing and releasing MLIR-based projects (including MLIR itself). I’ve run into quite a few things that I pattern match as not done or not considered yet due to the early state of the project, and it would be good to get a plan together. More practically, I could use connections to some seasoned reviewers on the build-infra and releasing side. So far, my work in this area has been pretty ad hoc, and on my last install-impacting patch, we agreed that more discussion/planning needs to happen.

The reason I think that this is timely is that I suspect that before the year is up, both the MLIR python bindings and NPCOMP will be at a level where we are going to need some more structure to their packaging and distribution. It is important to put some advance thought into this because, by the nature of the problem (Python extensions) and the upstream/downstream deps (PyTorch, IREE, MHLO), we are firmly in a fairly advanced shared-library distribution scenario, and avoiding a mess requires a design for what each layer is doing. And at this point, I don’t believe we have even considered the distribution channels (OS package management, Anaconda, etc).

Here is a short list of areas that I have noted need attention:

  • Install location and layout (i.e. make “ninja install-mlir” do the right thing): Currently, MLIR installs under LLVM tools and lacks umbrella install targets. During development this is merely annoying in that installing it also installs a large swath of (large and unrelated) LLVM tools. For deployment, it becomes more problematic.
  • Deciding on installation layout for shared libraries and Python modules/sources.
  • Decoupling deployable python extension building from the main CMake build: The way we have structured the Python extension, we actually can just distribute some small source files and a setup.py that builds for any Python version (assuming an overall MLIR install). Something in this vein is how the other cool kids in our category do things.
  • Getting the MLIR Python bindings enabled by default: Needs some build-bot, dep and CMake work (and consensus building that we want to do this).
  • Getting a handle on symbol visibility and making it correspond to API boundaries: Especially relevant since all of our leaf deps only need to link against the C-API, and we should be able to leverage this to create a nice shared library boundary (also relevant for DLL building on Windows).
  • More design work for libMLIR.so: Currently, what makes its way into libMLIR.so is static and uncustomizable. The story is more nuanced when considering how to produce an appropriate shared library to back a set of extension modules and peer projects.
  • "Loader" work in the Python mlir module: I currently have the pure-python import mlir level and the native extension _mlir decoupled with an eye towards being able to customize the way the deps are found and activated, but no actual customization has been done (i.e. it is suitable for dev builds but not installed).
  • Conda recipes, wheel building, etc: Need to build the scaffolding to package artifacts for installation.
  • Interactions with the LLVM release cycle, overall numbered releases, etc: I literally don’t know what I don’t know here and need help figuring out how much any of this relates to official LLVM releases.
  • Porting to OSX and Windows: Porting all of this to OSX shouldn’t be very hard but does require someone with adequate hardware/experience (I know roughly how to do it but don’t develop on a Mac). Windows compatibility will be a heavier lift but I know roughly what is entailed.

Pretty much all of the above is executable right now, and I’d like to invite collaborators. I think that we are close to having some exciting capabilities, and while we can get away with a poor deployment story for a little while, it is going to start really costing us in the coming months if we don’t address some of these things. While there is some ambiguity in how we do some of this, it is fairly straightforward and could be a good way to get involved.

Thanks and please reach out if interested! If we get a person or two, it’d be great to set up a sync and figure out how to approach.
Stella

I am willing to volunteer. I have been following the Python-related work and am generally familiar with the structure. I think I can help with some of the CMake parts. I also have macOS hardware; although I am not very experienced, I hope to help as much as I can.

I don’t quite understand the symbol visibility part:

What is the shared library boundary here?

It is a relatively involved topic that boils down to best practices for designing shared library solutions that will work across platforms. Essentially the tension is that the historic “Unix way” is to export everything, while the “Windows way” is to manage exports/imports explicitly. There are also other minutiae, but that is the core of it. For C-based APIs, the answer is pretty simple, since there is usually a discrete, readily knowable list of symbols that should be subject to dynamic linking. For C++, the story is much more complicated, and taking the default approach of exporting everything brings quite a few problems (even on Unix where it is more common). Generally, the answer is to design your shared libraries to follow some defensible API surface and separate them based on whether they are serving internal or external needs.

Visibility is related to that. Here is the canonical page on it: https://gcc.gnu.org/wiki/Visibility

MLIR currently takes the export-everything approach with libMLIR, which is, at best, suitable for “internal linkage” with intimately tied components. Probably a good next step would be to create a libMLIRAPI.so which can depend on that (or statically link it) and only exports the C-API. Then “external” things link only to that library. That doesn’t solve everything but is a necessary step to most of the good solutions. Slicing it in that way would also readily work on Windows, and would be sufficient for a naive Windows port of something like NPCOMP (but would still require some massaging, and there are better things to be done after this step).
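
To make the "only exports the C-API" idea concrete, here is a minimal sketch of one way to derive an export list: filter symbol names by prefix and emit a GNU ld version script that hides everything else. This is an illustration only and not necessarily how any MLIR patch does it (visibility attributes plus hidden-by-default compilation are another route). The function name `make_version_script` and the assumption that every C-API symbol starts with `mlir` are mine.

```python
def make_version_script(symbols, prefix="mlir"):
    """Return GNU ld version-script text that exports only prefix-matching symbols.

    Assumption: the public C API can be recognized by a plain name prefix,
    while mangled C++ symbols (which start with "_Z") never match it.
    """
    exported = sorted({name for name in symbols if name.startswith(prefix)})
    lines = ["{"]
    if exported:
        lines.append("  global:")
        lines.extend(f"    {name};" for name in exported)
    # Everything not listed above becomes local, i.e. not dynamically exported.
    lines.append("  local:")
    lines.append("    *;")
    lines.append("};")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    demo_symbols = [
        "mlirContextCreate",
        "mlirContextDestroy",
        "_ZN4mlir11MLIRContextC1Ev",  # a mangled C++ symbol, filtered out
    ]
    print(make_version_script(demo_symbols))
```

On Windows the same filtered list could feed a `.def` file instead, which is why a small, knowable C symbol list is so much easier to manage than a C++ surface.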

With https://reviews.llvm.org/D90824, the C-API builds correctly as a Windows DLL and exports only the minimal set of symbols from Linux shared libraries. The Python extensions are also functional (although it will take another couple of patches to work out some path kinks in the test setup).

Haven’t tested on OSX. There are a couple of Xcode-specific things in there that I don’t fully understand.

I tried on my OSX version 10.15.7, and found that all the Python binding and C API tests failed:

Failed Tests (13):
  MLIR :: Bindings/Python/context_lifecycle.py
  MLIR :: Bindings/Python/context_managers.py
  MLIR :: Bindings/Python/dialects.py
  MLIR :: Bindings/Python/dialects/std.py
  MLIR :: Bindings/Python/insertion_point.py
  MLIR :: Bindings/Python/ir_array_attributes.py
  MLIR :: Bindings/Python/ir_attributes.py
  MLIR :: Bindings/Python/ir_location.py
  MLIR :: Bindings/Python/ir_module.py
  MLIR :: Bindings/Python/ir_operation.py
  MLIR :: Bindings/Python/ir_types.py
  MLIR :: Bindings/Python/pass_manager.py
  MLIR :: CAPI/ir.c

As for the output files:

  • libMLIR.dylib - 45MB
  • libMLIRPublicAPI.dylib - 296KB
  • _mlir.cpython-38-darwin.so - 2.3MB
  • _mlirTransforms.cpython-38-darwin.so - 140KB
  • mlir-capi-ir-test - 86KB
  • mlir-capi-pass-test - 50KB

What error do you see? It just worked out-of-the-box on my Mac.

The error message:

Fatal Python error: PyMUTEX_LOCK(gil->mutex) failed
Python runtime state: unknown

That is a pretty unexpected low-level error. If you try to run one of the tests from the command line, can you tell where it fails (i.e. on an import or a specific call)?

I have python from homebrew detected:

-- Found Python3: /Users/aminim/homebrew/Frameworks/Python.framework/Versions/3.9/bin/python3.9 (found version "3.9.0") found components: Interpreter Development Development.Module Development.Embed 
-- Found python include dirs: /Users/aminim/homebrew/Cellar/python@3.9/3.9.0_1/Frameworks/Python.framework/Versions/3.9/include/python3.9
-- Found python libraries: /Users/aminim/homebrew/Cellar/python@3.9/3.9.0_1/Frameworks/Python.framework/Versions/3.9/lib/libpython3.9.dylib

What does your CMake invocation say about this?

-- Found Python3: /usr/local/Frameworks/Python.framework/Versions/3.8/bin/python3.8 (found version "3.8.5") found components: Interpreter Development Development.Module Development.Embed 
-- Found python include dirs: /usr/local/Cellar/python@3.8/3.8.5/Frameworks/Python.framework/Versions/3.8/include/python3.8
-- Found python libraries: /usr/local/Cellar/python@3.8/3.8.5/Frameworks/Python.framework/Versions/3.8/lib/libpython3.8.dylib

I found that the testing process was using a different python3 than the one CMake detected: CMake found /usr/local/bin/python3, but I had set -DPYTHON_EXECUTABLE=/usr/bin/python3. I think this is where I went wrong, so I’m trying again with the right version. Should I pass -DPYTHON_EXECUTABLE, or will it be detected automatically?

There was a change in the Python detection logic recently, so you should try wiping your build directory and reconfiguring. No specific parameter is expected for Python detection.
I just pushed a fix regarding the Python detection change, so try again at HEAD.
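
For anyone hitting a similar mismatch, a quick check is to ask the interpreter that actually runs the tests where it lives and compare that with the "Found Python3" lines from the CMake configure output. A minimal sketch using only the standard library (the helper name `describe_interpreter` is made up for this example):

```python
import sys
import sysconfig


def describe_interpreter():
    """Collect the interpreter details that are worth comparing with CMake's output."""
    return {
        "executable": sys.executable,
        "version": "%d.%d.%d" % sys.version_info[:3],
        "include": sysconfig.get_paths()["include"],
        # LIBDIR may be None on some platforms (e.g. Windows).
        "libdir": sysconfig.get_config_var("LIBDIR"),
    }


if __name__ == "__main__":
    for key, value in describe_interpreter().items():
        print(f"{key}: {value}")
```

If the version or include directory printed here differs from what CMake reported, the extension was most likely built against one Python and loaded by another.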

Now it works, thanks!

Hi Stella, I’d be happy to help. Is there a list of issues or a roadmap/plan besides this thread? Baden

Not yet, but Stella and I are going to discuss this. See Stella’s message on discord:

FYI - @marbre and I will be having a meeting to discuss/divvy up tasks for MLIR build system work relating to shared linking and python distribution. We are scheduled for 11am PT on Thursday 2020-Nov-19 (right after the ODM). Please PM me your email if you would like to join and I will send an invite.

I did expand on the thread a bit in a PM between Marius and me, which is what led us to set up a meeting. PM me your email address if you would like to be added to the invite tomorrow.

Here is the primary excerpt from the thread between us:

I’ve copied my original points below and annotated with more details and status. Those that imply design points have basically not been discussed beyond myself and are definitely open for thought and other options! I think what I outline below will work on Windows/Linux/OSX but there may be simpler/more incremental ways to get there.

Copy of the list from the original post with updates

  • Install location and layout (i.e. make “ninja install-mlir” do the right thing): Currently, MLIR installs under LLVM tools and lacks umbrella install targets. During development this is merely annoying in that installing it also installs a large swath of (large and unrelated) LLVM tools. For deployment, it becomes more problematic.

    • No work done on this: The closest thing we have is ninja install, which installs everything.
    • Stephen Neuendorffer has thoughts about how this should layer but hasn’t had time to realize them.
    • Changing this will likely involve reworking the LLVM RPath machinery.
  • Deciding on installation layout for shared libraries and Python modules/sources.

    • This is from a note on the patch that made the Python bindings installable. It was noted that we shouldn’t just be squatting on the python path at the top level of the install. Note that I had originally attempted to put this under lib/python as that made more sense to me, but chose python because libraries are currently built/installed with an RPath of ../lib, which conveniently works if the .so is one level off of the root install path.
  • Decoupling deployable python extension building from the main CMake build: The way we have structured the Python extension, we actually can just distribute some small source files and a setup.py that builds for any Python version (assuming an overall MLIR install). Something in this vein is how the other cool kids in our category do things.

    • I’m really worried about how all of this composes when dynamically linking multiple projects together (i.e. say libMLIR.so + libNPCOMP.so + libIREECompiler.so) as they all need to be built at the same LLVM revision/flags/etc and have possibly different release cycles. I’ve found that the way that PyTorch solves this is moderately rational: They just follow the usual setup.py approach for Python to install into the packages directory, and then they also install the headers and shared library sufficient for building derived (version-locked) projects.
    • If we were to do something similar, we would provide a setup.py that built the Python bindings against a provided LLVM/MLIR install and then also copied the headers into the Python installation directory.
    • I’ve carefully written the Python bindings to be self-contained: it should be possible to just build their sources with appropriate include and lib dirs set, so they do not have to build as part of LLVM. This would let you build/install LLVM as intended (the expensive part) and then do a light-weight compile/install of just the Python packages, complete with enough headers to build MLIR-derived Python projects.
    • We could still build the extension as part of the main CMake build, which would be useful for tests/interactive use but not really for packaging/deployment (ie. to conda-forge or as wheels).
    • Downstream things like NPCOMP and IREE would have a setup.py that would find the installed mlir python package and compile against its headers/libraries.
    • In this way, when a new MLIR package drops, downstream projects could always just install it and then build/publish their own wheels or conda recipes and be guaranteed version compatibility because everything is bundled together in the Python package.
    • Note that PyTorch also has a C++ only install/tools that has nothing to do with Python. We retain that option too – not suggesting we only deploy MLIR via Python channels :)
  • Getting the MLIR Python bindings enabled by default: Needs some build-bot, dep and CMake work (and consensus building that we want to do this).

    • Mehdi thinks we can just update the Linux buildbot with flags to build the Python bindings.
    • We should probably modernize the CMake config and make it more bulletproof first though (i.e. check the version of Python and ensure that Numpy is installed).
    • I’ve got an email out about the Windows buildbot: I think the image is set up incorrectly to link to Python (i.e. building 32bit binaries but has a 64bit Python).
    • When ready, we should send an RFC to the list and enable the python bindings by default (if dependencies are met).
  • Getting a handle on symbol visibility and making it correspond to API boundaries: Especially relevant since all of our leaf deps only need to link against the C-API, and we should be able to leverage this to create a nice shared library boundary (also relevant for DLL building on Windows).

    • Mostly done under https://reviews.llvm.org/D90824
    • There is still some cruft around vague-linkage TypeIDs that I need to study and that will need explicit addressing on Windows. Also, some binaries that also link LLVM Support had surprising failures (which we thought should have been fine given the hidden visibility). Needs more attention.
  • More design work for libMLIR.so: Currently, what makes its way into libMLIR.so is static and uncustomizable. The story is more nuanced when considering how to produce an appropriate shared library to back a set of extension modules and peer projects.

    • Currently libMLIR.so is a dumping ground and way too big (e.g. ~21MB stripped vs 5.5MB actually used by libMLIRPublicAPI.so).
    • I’d be less concerned about the size if there was a way to customize it or if we generated separate shared libraries for dialects and other bulky bits (i.e. execution engine and such).
    • It is disabled on Windows entirely, which makes it impossible to have projects like NPCOMP and IREE share the MLIR python API.
    • I feel that we may want to move away from this in some way and instead have libMLIR{Component}Impl.so libraries that layer nicely.
    • Things that need to interface at the C++ level take deps on the libMLIR*Impl.so libraries. Those that just need the public, stable C-API link against libMLIR{Component}PublicAPI.so.
    • The primary case downstream that needs to cooperate with the Impl libraries is dialect and pass registration: downstream projects need to be able to add their own and interop with the core shared libraries. If we model this in the core repo, we would have some guarantee that downstream would work naturally.
  • "Loader" work in the Python mlir module: I currently have the pure-python import mlir level and the native extension _mlir decoupled with an eye towards being able to customize the way the deps are found and activated, but no actual customization has been done (i.e. it is suitable for dev builds but not installed).

  • Conda recipes, wheel building, etc: Need to build the scaffolding to package artifacts for installation.

    • Not started
    • Should be somewhat easy if addressing the setup.py points above.
  • Interactions with the LLVM release cycle, overall numbered releases, etc: I literally don’t know what I don’t know here and need help figuring out how much any of this relates to official LLVM releases.

    • No idea: I just assume that someone at some point will want to stick an “LLVM15” moniker on this stuff vs just having nightly releases to the Python forges.
  • Porting to OSX and Windows: Porting all of this to OSX shouldn’t be very hard but does require someone with adequate hardware/experience (I know roughly how to do it but don’t develop on a Mac). Windows compatibility will be a heavier lift but I know roughly what is entailed.

    • Mehdi was able to confirm that the Python extension basically works on OSX today.
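
As a starting point for the "Loader" item in the list above, here is a rough sketch of what customizable dependency discovery in the pure-Python layer could look like. Everything named here is hypothetical: the `MLIR_PYTHON_LIB_DIR` environment variable, the helper names, and the search order are assumptions for illustration, not something the bindings implement today. The only detail taken from the thread is the `../lib` convention relative to the directory holding the native extension.

```python
import ctypes
import os
from pathlib import Path


def candidate_lib_dirs(extension_dir, env=None):
    """Return directories to search for MLIR shared libraries, best first."""
    env = os.environ if env is None else env
    dirs = []
    # Hypothetical override variable; not defined by the bindings today.
    override = env.get("MLIR_PYTHON_LIB_DIR")
    if override:
        dirs.append(Path(override))
    # Mirrors the ../lib RPath convention relative to the native extension.
    dirs.append(Path(extension_dir).parent / "lib")
    return dirs


def find_library(filename, dirs):
    """Return the first existing path for filename, or None if absent."""
    for directory in dirs:
        path = Path(directory) / filename
        if path.is_file():
            return path
    return None


def preload(filename, dirs):
    """Load a shared library with global symbol visibility before a native import."""
    path = find_library(filename, dirs)
    if path is None:
        searched = ", ".join(str(d) for d in dirs)
        raise ImportError(f"could not locate {filename} in: {searched}")
    return ctypes.CDLL(str(path), mode=ctypes.RTLD_GLOBAL)
```

The pure-Python `mlir` package would call something like `preload(...)` before importing `_mlir`. Whether preloading alone is enough depends on the platform's dynamic loader, so treat this as a discussion aid rather than a design.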

I’m at very early stages, but I’ve thought about some of these issues for the Swift bindings I’m working on. I do not necessarily advocate any of the approaches I’m currently taking, nor do I want to derail too much from focusing on Python, but if there is something specific you would like me to dig into please let me know!

I currently link only a subset of binaries (at the time of writing: MLIRSupport, MLIRParser, MLIRIR, MLIRCAPIIR, LLVMDemangle, LLVMSupport). These are declared in a module.modulemap file which is used by Swift Package Manager. I have an auxiliary script which regex-parses this file to create the correct ninja arguments (essentially install-<LIBRARY> and install-mlir-headers) and I build with ninja $(ninja-install-args). Overall, it would be great if we had an install-mlir-c target which would only install that subset (and make the definition of this subset not a purview of the specific language binding). Installing a pkg-config file would also be super helpful, as it is a good manifest of what specific pieces are needed. For Swift in particular, pkg-config is the main approach Swift Package Manager uses to link system libraries.
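
For readers who want to picture the auxiliary script, here is a minimal sketch of the idea written in Python. It is not the actual script: the function name `ninja_install_args`, the module name in the sample, and the choice to handle only plain `link "…"` declarations are assumptions for illustration.

```python
import re

# Matches module map link declarations of the form:  link "LibraryName"
LINK_RE = re.compile(r'^\s*link\s+"([^"]+)"', re.MULTILINE)


def ninja_install_args(modulemap_text):
    """Turn the linked libraries of a module map into ninja install targets."""
    targets = []
    for library in LINK_RE.findall(modulemap_text):
        target = f"install-{library}"
        if target not in targets:  # keep order, drop repeats
            targets.append(target)
    # The headers are installed through a single umbrella target.
    targets.append("install-mlir-headers")
    return targets


if __name__ == "__main__":
    sample = '''
    module CMLIR [system] {
      header "mlir-c/IR.h"
      link "MLIRCAPIIR"
      link "MLIRIR"
      export *
    }
    '''
    print(" ".join(ninja_install_args(sample)))
```

An `install-mlir-c` target in the core build would make a script like this unnecessary, which is the point being argued above.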

I played around with doing a similar thing in Swift, but ultimately think that adding this type of layer on top is an unnecessary complication. It would be unfortunate if each language binding ended up doing its own thing here. My current hope is that we can leverage the existing LLVM build infrastructure to make this work. I recently updated circt and a closed-source project I work on to build using the LLVM_EXTERNAL_PROJECTS setting. The nice thing about this is that it could obviate the need for similar downstream projects to have their own setup.py, and instead rely on CMake operations like add_mlir_dialect doing the right thing.

The way I look at it (and this may be incorrect) is that libMLIR.so is kind of “The C++ bindings” of MLIR and we could probably have a libCMLIR.so which gets rid of some of that and retains only the core of what is needed for bindings. I’m not sure if further pruning will be helpful.

Yeah, neither do I :) I think the shared interest (pun intended) is to align on what the shared-library situation is and then make sure it is done well. That isn’t necessarily python specific – it is just that these use cases require reasonable shared linkage setups so end up at the front of the queue in terms of needs.

I’ve gone back and forth on this, and I think that the core requirement is that the build setup is separable: there are deployment scenarios where you want to install the LLVM+MLIR C++ packages and then build/install the bindings depending on that. What I’m suggesting here is not necessarily to go “all in” on setup.py as the build system but more to do this how PyTorch does it, having setup.py drive the underlying CMake infra to do the right thing for Python install. Such a split also opens up another Python requirement, which is to build the bindings packages for up to a few supported python versions.
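
To illustrate the "setup.py drives CMake" split, here is a minimal sketch of the piece that matters: composing a configure command that points a bindings-only build at an existing MLIR install and at the interpreter running the packaging step. The helper name `cmake_configure_command` and the directory arguments are hypothetical, and the sketch assumes the bindings' CMake project locates MLIR through `MLIR_DIR` and Python through CMake's FindPython3 hint `Python3_EXECUTABLE`.

```python
import sys
from pathlib import Path


def cmake_configure_command(source_dir, build_dir, mlir_install_dir):
    """Compose a CMake configure command for a bindings-only build.

    Assumptions: the MLIR install exposes its package config under
    lib/cmake/mlir, and the project uses find_package(Python3 ...).
    """
    mlir_cmake_dir = Path(mlir_install_dir) / "lib" / "cmake" / "mlir"
    return [
        "cmake",
        "-S", str(source_dir),
        "-B", str(build_dir),
        f"-DMLIR_DIR={mlir_cmake_dir}",
        # Build against whichever interpreter is running the packaging step.
        f"-DPython3_EXECUTABLE={sys.executable}",
        "-DCMAKE_BUILD_TYPE=Release",
    ]


if __name__ == "__main__":
    print(" ".join(cmake_configure_command("bindings", "build", "/opt/llvm")))
```

A setup.py would typically run this from a custom build step via `subprocess.check_call`. Because the interpreter is taken from `sys.executable`, running the same packaging step under each supported Python yields one build per version without touching the expensive LLVM/MLIR install.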

We currently have libMLIRPublicAPI.so which is basically your libCMLIR.so (can totally be renamed, and if we were going for consistency with LLVM, it would be libLLVM-C.so I think). I do have some interest in libMLIR.so (the C++ core) either a) being configurable for the subset that is desired (important for code-size and various deployment scenarios that want something less than the kitchen sink), or b) being split into a couple of shared libraries across some coarse boundaries that get us closer to pay-for-what-you-use. For (b), there are currently some fairly expensive (binary size and depends) components of MLIR that we do ultimately want a Python binding for but don’t want to fully pay the price for everything (ExecutionEngine, JITers, etc come to mind). Dialect libraries are also interesting here because they contribute a large amount to size and deps for things that not everyone needs. I wouldn’t advocate for splitting at a very fine-grained level, but I do feel like a couple of principled slices of shared libraries might be the right thing to do. I’d like to preserve the ability for libMLIR.so to be relatively slim and have a shared linkage story for the other useful parts (that are not slim).

I’m admittedly unfamiliar with PyTorch but this sounds reasonable to me. The main design question is how much to shove into the core CMake infra, and while I won’t argue that that infrastructure is great, it seems to me that relying on it more rather than less will simplify the overall system. So supporting things like LLVM_EXTERNAL_PROJECTS which add things like dialects to a single shared library (or at least to the set of shared libraries specified in a pkg-config file) sounds like a good thing.

One question for you that betrays my ignorance of what you are actually doing: I assume that Swift, as a “real language”, is compiling against MLIR as a dep, and that therefore you have some more flexibility here (i.e. a user of your API can pass some arbitrary flags to include/elide various projects and components, resulting in various levels of fatness of the resulting libraries).

On the Python side, we’ve got more of a runtime vs compile time problem: external projects that depend on MLIR likely need to contribute some additional dialects/transforms that layer on top of a binary-deployment of libMLIR.so: they can’t just twiddle some cmake flags in the base project to tune it. Or to be more precise: only the first project to do so can do that and the rest need to live with the choices made (or fork the package deployments at the root). As the first project doing that, I’m trying to keep an eye on making choices that can also yield solutions for the second project on the list (which I also lead but different hat). It’s a sticky problem, because in the limit, you can only have one import mlir for an installation and that is resolved entirely at runtime/install time, not compile time.

I fully support the mechanic you are describing for a configurable libMLIR.so. But when considering Python deployment, we unfortunately also need to have some sane configuration for that which composes for runtime deps, which does put some more pressure on the design. I’m looking for ways to get some of what we need in that world without adding undue systemic complexity.

Also, glad you’re here/in the conversation. It’s good to have someone else to talk to about these things :)

Likewise!

Good point. I think in Swift we have a similar problem despite being ostensibly “real”. I can imagine a package which depends on two or more “dialect” packages, all of which depend on the MLIR Swift bindings. In this situation, I would imagine the bindings would bring in libMLIR.so, and the various “dialect packages” would bring in their dialect-specific libraries and all of this would work seamlessly (assuming no MLIR version shenanigans).
I still think it’s valuable for the LLVM infrastructure to handle all dialects in a consistent way, which makes me think that the best course of action would be to utilize pkg-config, since we can generate one for MLIR-C (the minimal set of interesting stuff we need) and a separate pkg-config for each external project (MLIR_EXTERNAL_PROJECTS?) which would link MLIR-C.so along with the project-specific dialects. This way, if you end up bringing in two external projects (assuming they depend on the same version of MLIR) you could combine the pkg-config linker flags into something like -lMLIR-C -lMLIR-C -lDialectA -lMLIR-C -lDialectB, which should be fine.
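
To show how combining flags from several pkg-config files could work mechanically, here is a small sketch that merges per-project flag lists and collapses the repeated entries. The library names are the placeholder ones from this post, and the helper `merge_link_flags` is made up for the example. It assumes each list places a project's own libraries before the libraries it depends on, and it keeps the last occurrence of each flag so that a shared dependency ends up after all of its users.

```python
def merge_link_flags(*flag_lists):
    """Concatenate linker flag lists, keeping only the last occurrence of each flag."""
    combined = [flag for flags in flag_lists for flag in flags]
    seen = set()
    kept_reversed = []
    # Walk backwards so the surviving copy of a duplicate is the last one.
    for flag in reversed(combined):
        if flag not in seen:
            seen.add(flag)
            kept_reversed.append(flag)
    return list(reversed(kept_reversed))


if __name__ == "__main__":
    # Stand-ins for the output of `pkg-config --libs <project>` per project.
    project_a = ["-lDialectA", "-lMLIR-C"]
    project_b = ["-lDialectB", "-lMLIR-C"]
    print(" ".join(merge_link_flags(project_a, project_b)))
```

With shared libraries the duplicates are harmless either way, so this mostly matters for readability and for any statically linked pieces.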