Python bindings for out-of-tree projects

I would like to define Python bindings for an out-of-tree MLIR client that uses custom dialects. It needs quite deep IR inspection capabilities – traversing the IR with a mix of standard+custom operations, inspecting and setting builtin+custom attributes.

Currently, it doesn’t look like the bindings are set up for such a use case. I started some refactoring to achieve the following state:

  • most C++ bindings code is available as a library so that out-of-tree projects can #include headers related to bindings and use PyMlir* classes in their bindings code;
  • the bindings entrypoint, i.e. the _mlir module definition, is a separate library that depends on the former and only defines the extension module;
  • same for the out-of-tree project, which has its own entrypoint library defining the _external_project module, and other extension libraries for its dialects.

My project has a custom build system, so I would appreciate it if somebody could reflect this library layout in CMake. Currently it still builds a combined library.

An extra design point is making external dialects available in the context. Currently, my approach resorts to something like

with mlir.ir.Context():
  external_project.load_dialects()
  # do stuff

but this looks obnoxious. I am considering providing a dialect registry inside the bindings that can be populated when the dialect package is imported and appended to all live contexts (Python keeps track of those). This could look like

import external_project.dialects.external_dialect
with mlir.ir.Context():
  # do stuff directly

or, if we want a more opt-in registration behavior, like

import external_project.dialects.unregistered_dialect
import external_project.dialects.external_dialect.registration
with mlir.ir.Context():
  # unregistered_dialect is not registered 
  # external_dialect IS registered because its registration module was imported
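
For concreteness, here is a minimal sketch of what such a registration submodule might do at import time. The names append_dialect_registry and _register_external_dialect are hypothetical, used only to illustrate the mechanism, and are not existing APIs:

# external_project/dialects/external_dialect/registration.py (hypothetical sketch)
import mlir.ir

# Assumed native helper that knows how to add the dialect to a registry.
from external_project._capi import _register_external_dialect

# Assumed bindings hook: populate a registry owned by the bindings; the bindings
# would apply it to every live Context and to all Contexts created later.
mlir.ir.append_dialect_registry(_register_external_dialect)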

Thoughts or suggestions?

@mikeurbach @jdd @stellaraccident


Thanks for opening this discussion @ftynse. I recently experimented with something very similar in CIRCT, and I think we have the same end state in mind. A couple specific points:

I started prototyping this, and I can try to help out on that front. As a concrete use case, CIRCT uses CMake and will make use of the refactored library.

I would add builtin+custom types as well, if that wasn’t implied. I don’t think it will be too different, and PyConcreteType was the first refactoring I had in mind as a follow-up to D101063 [mlir] Move PyConcreteAttribute to header. NFC.

This is the same pattern we have been using: circt/rtl.py at 1ae59c60d52176d8ad65d386900fc94857104c08 · llvm/circt · GitHub
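
For readers without the link handy, the shape of that pattern is roughly the following; the names below are illustrative, not the actual CIRCT code:

# Illustrative sketch of a load_dialects()-style helper (not the real circt/rtl.py).
import mlir.ir
from _circt import register_rtl_dialect  # assumed native registration entry point

def load_dialects(context=None):
  # Register/load the project's dialect into the given context, defaulting
  # to the context that is currently active.
  if context is None:
    context = mlir.ir.Context.current
  register_rtl_dialect(context)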

This came up recently in code review. Having a mechanism to register dialects when the package is imported sounds good to me. I don’t have strong opinions about exactly how it works, but your examples both seem like an improvement I would be happy to use.

One last point, which @jdd mentioned in D101063, but is probably worth discussing here in general. In the end state, would the refactored headers be moved out from lib/ and somewhere into include/? This seems more natural to me, but I think it’s worth discussing exactly where to put them.

I’m quite concerned about the “pollution” associated with implicitly registering/loading a dialect in every available context. This leads to surprising behavior when trying to compose libraries / components in the same process, which is why we eliminated it entirely from MLIR. It isn’t clear to me why this should be reintroduced here?

Note also that registration and loading are different: in your last option, it isn’t clear whether external_dialect is loaded in the context or not.

It also does not really address the “global pollution” aspect as it seems that every possible context created in the process in the future (even in unrelated libraries) will register (/load?) this dialect.

Thanks! I suppose we can split this by having one patch that rearranges the files but keeps the overall cmake structure and another patch that just changes the cmake structure.

Certainly. Attributes and types use the same mechanism.

I support the idea of putting them in include/mlir/Bindings/Python, which we already use for .td files anyway. The headers were originally “private implementation” so they were put in lib/ for the same reason as PassDetail.h and the likes.

The rationale I heard before was that loading all dialects is too expensive, not that it creates composability problems. Do you mind giving some examples of the problems you’ve seen?

I’m concerned about boilerplate and uncommon patterns in a productivity-oriented language. import something is the common way for Python to get something available without having to manually deal with package initialization.

Whatever lets me do external_dialect.Attribute.get() without any extra action. Well, I can live with with mlir.ir.Context(loadAll=True) if necessary.

Only those contexts visible to Python. Again, this sounds like an extension of the common behavior in Python: importing something at the top level makes it globally available.

No: it is first a composability problem; performance is a side effect that got added to the rationale later. The performance aspect is really a symptom of the lack of composability.

Performance still matters: loading more dialects will affect it. But behavior can also change; for example, loading a dialect adds new canonicalization patterns. There is also the potential for name collisions.

I’m concerned about boilerplate and uncommon patterns in a productivity-oriented language. import something is the common way for Python to get something available without having to manually deal with package initialization.

I don’t see this as “package initialization”: that would imply global state, and I’m against global registration/initialization. This is about instance initialization instead: every single new context has to be initialized and set up correctly and explicitly.

Users can “hide” this in their own coherent framework (like npcomp, etc.), but I’d really like MLIR (including MLIR Python) to avoid locking itself into this.

Joining a couple of parallel threads:

Cross-posting a comment from the former:

Generic concerns from me:

  • This does not scale beyond a single out-of-tree project controlling the entire Python environment by providing the mlir native libraries and its project libraries (e.g. circt). If another MLIR project wanted to co-exist, it would need to be built against the same installed MLIR libraries.
  • The compatibility rules for pybind11 to interop at the C++ level between multiple shared-library extensions are arcane and hard to satisfy for things not intimately built together (basically, the tuple of (compiler, standard library, compile options, etc.) has to match exactly and is checked specifically). The penalty for violation is that the types at the Python level are actually forked (the binding is stored in a dict in the Python builtins module with a key hard-coded to the above parameters). This usually causes chaos and is not at all obvious when it happens.
  • We still have the lingering TypeID linkage issue (I’ve been hunting it down for 6 months and haven’t run it to ground completely yet). This will work one day and not the next, based on past experience. This bug unfortunately can happen when you obey all of the other rules. I have to assume we’ll track it down eventually…
  • Other projects that have gone down this path have eventually had to grow a platform-specific loader stub which can adapt things at runtime and work around issues that come from distributing binaries.
  • Those other projects have typically also had to embrace a deployment strategy that lets them pip install headers/dev libraries and rework end projects to build against that (which provides a bit of a source of truth to make sure everything is at the same revision/config options/etc). LLVM’s source instability will hurt here since we have very limited and unpredictable compatibility windows between projects.

An answer that neatly side-steps this for projects like circt and npcomp, which merely want to use MLIR for their own ends and have limited benefit from sharing a single installation, is to move the MLIR Python bindings in a direction where they can be privately embedded in such a project (i.e. as a private circt.mlir module vs. a shared dependency on mlir). I’ve tried to keep the source tree in a rough shape to make such an evolution practical at some point, but it would be definite work.

Forks of the conversation:

This thread is dealing with a) decoupling dialect registration for out-of-tree projects, and b) externalizing some of the helpers used within the _mlir.so module for use outside of it (PyConcrete{Type|Attribute}). I don’t have recent state on the former. On the latter, I would like to do an experiment where I rewrite the useful helpers in a way that doesn’t leak internals – and then use that.

FYI - I finally got annoyed enough to write out a better system for custom attributes and types, and this one is compatible with out-of-tree use from the get-go:

See ESIModule.cpp for how it is used. Strict improvement, I think.

Picking up on the with mlir.ir.Context() as ctxt: pattern, this has actually come up as a problem for what I want to do.

Say I want to have class members which contain MLIR objects, in particular types:

class Foo:
    intType = mlir.ir.IntegerType.get_signless(32)

I have to wrap that class definition in a with block, which makes for some extremely awkward code. Further, I want to be able to import those classes from a different file but in the same context, which could be solved with:

with mlir.ir.Context():
    import foopackage

That is mildly less awkward but completely unintuitive to my end users (I’m exposing wrappers around CIRCT as a Python library).

Since a (if not the) common use case, at least for me, is having only one MLIR context for the lifetime of the Python program, how bad of an idea is it to create a Context automatically on import mlir.ir and set it as the base of the Python default context stack? I’m pretty sure this wouldn’t break anything – existing with Context(): blocks would simply push a new context onto the stack and ignore the initial one. We wouldn’t have to register any dialects into it – if classes want to use custom types and attributes at import time, they’d be responsible for ensuring the dialect is registered. (That would be easy, since it could be done with a single statement at the beginning of their Python module.)

This just does not seem right to me and can’t compose across libraries. I am very wary of the coupling this would impose, and of how it would limit using MLIR to only short-lived Python programs. This isn’t the kind of assumption we should bake into library design, I believe.

For such use cases, shouldn’t you pass a context explicitly to the class Foo __init__ method (or use the same mechanism to default to looking up the current default context), and create these members at instantiation time?
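
For illustration, a sketch of that suggestion, assuming type creation can be deferred to instantiation time (Foo and its member come from the example above; the context keyword on the type getter follows the usual bindings convention):

import mlir.ir as ir

class Foo:
    def __init__(self, context=None):
        # Resolve the context at instantiation time: use the one passed in,
        # or fall back to the context that is currently active.
        ctx = context if context is not None else ir.Context.current
        self.int_type = ir.IntegerType.get_signless(32, context=ctx)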

I actually need that type information at __new__() time, but I’m using a decorator to add the __new__() method and I don’t want to require my users (who would be the ones writing these classes) to have to implement __new__() themselves or know anything about mlir.ir.Context.

Here’s what I came up with: circt/design_entry.py at 06c58f32a32a02f6f107d3221e276bc29123cdbe · llvm/circt · GitHub. It’s problematic since it won’t play nice with other libs which need the same functionality… In other words, it wouldn’t compose well. However, it does demonstrate that adding an ‘initial’ context doesn’t have to be the default behavior to fit my use case. We could add a static method AddInitialContext which adds it if not already present and takes care of destruction on exit.

Couldn’t you just make an unpaired call to mlir.ir.Context().__enter__() manually? Not saying it’s a good idea, but it would have the same effect as what you’re proposing.

Stuff crashes on exit if we don’t pop it off the stack at exit. (Not sure what happens on an unexpected exit.) Also, if multiple libs do that when they actually need to share a context, the sharing wouldn’t happen. (I have the same problem with my current solution.) I could just check whether or not there is an active default context and only call Context().__enter__() if there isn’t, but the active/default context could be popped anywhere, unexpectedly.
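
For the record, a rough sketch of that check-then-push idea (not a recommendation; whether Context.current raises or returns None when no context is active, and the atexit cleanup, are assumptions):

import atexit
import mlir.ir as ir

def ensure_default_context():
    # Reuse the active default context if there is one.
    try:
        current = ir.Context.current
    except ValueError:
        current = None
    if current is not None:
        return current
    # Otherwise push a fresh context onto the thread-local stack, unpaired,
    # and make sure it is popped again before interpreter teardown.
    ctx = ir.Context()
    ctx.__enter__()
    atexit.register(ctx.__exit__, None, None, None)
    return ctx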

Been offline for a few hours and missed some things. A few responses inline.

I’ve hit and almost fixed this a number of times – it can be fixed. But it is also a somewhat dangerous design pattern except for special-purpose scripts and such. I should fix it if for no other reason than that segfaults are bad and you shouldn’t be able to do things Python-side that cause crashes.

I think you are solidly in explicit context land for what you are doing. I think it is completely natural to create domain-specific wrapper APIs for MLIR APIs, but we want to be careful about polluting the shared “namespace” of the current context stack. On the npcomp Python compiler side, I just created a base class that holds a context and then used explicit context APIs for parts of “the system”, only establishing an implicit context for user code. That may be vague. I’m pretty confident that some fiddling at the Python level could produce something reasonable here, but I have trouble coming up with such things without trying to thread the needle. There are a million ways to do things in Python and finding the thing that feels right is a bit of discovery.
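
Roughly, the shape of that pattern looks like the following (an assumed sketch, not the actual npcomp code; the class and method names are made up):

import mlir.ir as ir

class CompilerSystem:
    # Sketch: the system holds its Context explicitly and only makes it
    # implicit while running user code.
    def __init__(self):
        self.context = ir.Context()

    def build_internal_module(self):
        # "System" code: pass the context explicitly everywhere.
        return ir.Module.create(ir.Location.unknown(context=self.context))

    def run_user_code(self, fn):
        # User code: establish the implicit context (and a default location)
        # only around this call.
        with self.context, ir.Location.unknown():
            return fn()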


Could you link me to that and an example of its usage? I’d be most curious to have a look and see whether it fits my requirements or can be adapted to them…