[RFC] Starting in-tree development of python bindings

Hi all – @mehdi_amini and @ftynse were offline discussing next steps for the work on MLIR python bindings that @zhanghb97 has started as part of a GSoC assignment. We realized that the discussion around this had taken a bit of a twisty path through the forum and was never raised specifically as an RFC. This message seeks to rectify that.

Background on prior threads:

Approach

We would like to begin taking incremental steps to in-tree Python bindings that:

  • Seek to provide a Python-flavored but largely isomorphic mapping of the low-level MLIR API surface area (IR, Pass, etc).
  • Are structured as a native pybind11 extension.
  • Are built on top of a co-developed C-API.

I would be happy to discuss/defend any of those design points, but in previous interchanges, there have not been serious objections, so I will leave further elaboration to be done by request versus just writing a long RFC if there is already general agreement on approach.

We would like to approach this incrementally, starting with committing the boiler-plate code to build/test, then defining the Context facilities to parse asm, query the IR and traverse the core Operation structure. Then we will seek to add mutation/construction and builders. Finally, interop with outside/custom types and attributes will be something that likely requires a more detailed design/prototype and will be easier to see once the basics are in place.
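To make the "parse asm, query the IR, traverse operations" milestone concrete, here is a toy pure-Python stand-in (NOT the real `mlir` package – all names such as `Context`, `parse_asm`, and `Operation.name` are hypothetical placeholders) sketching the kind of Python-flavored surface being described:

```python
# Toy model only: a real binding would call through the C API into the
# MLIR parser; this sketch just turns each non-empty line into a node
# so the traversal shape is visible.
class Operation:
    def __init__(self, name):
        self.name = name

class Context:
    def parse_asm(self, asm):
        # Hypothetical entry point: parse textual IR, return operations.
        return [Operation(line.strip())
                for line in asm.splitlines() if line.strip()]

ctx = Context()
ops = ctx.parse_asm("""
func @main()
return
""")
names = [op.name for op in ops]
```

The point is only the layering: a context-owned parse entry point producing objects that can be traversed, with mutation/construction added later.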

Potential issues

We are concerned and keeping an eye on how this evolves on a couple of fronts:

  • If we only provide the low-level interfaces without any of the ODS generated code, this starts to feel like a “JSON of IRs” which is not deemed a great final state (i.e. it may be appropriate for some integrations to use these low level APIs but there should be a high-level ODS-integrated story to be considered complete). There are a number of ways these higher level parts of the API can be constructed and it isn’t clear yet what the right answers are.
  • It may be prohibitive to model everything with a pure C-API, and it is trivially easy to get started by wrapping some of the C++ classes. However, this creates long-term ABI issues that would be best avoided. We may, however, prove some aspects out with a C++ wrapping while co-developing the C-API. For “core” parts, such intermediate states should not be considered indicative of where this will go.
  • It is clear that higher level APIs, geared towards easy meta-programming are likely quite attractive. We would like to revisit these after the core API layer is established.

Related work

  • Some pragmatic bindings have been written in npcomp. It is a goal to eventually get to functional parity or at least be able to redirect these to depend on/augment core bindings exported by MLIR directly.
  • The original OSS repo had some Python bindings developed around EDSCs that have served as some inspiration (these represent something that we would like to enable the creation of, built on top of the lower level work we are proposing here).
  • I wrote down integration and style principles here and testing principles here.

Prototypes

  • Initial boiler-plate for Python bindings patch – I wrote this last night to identify how to integrate build/testing for such a thing. If there are no objections to this RFC, I would like to decide on final names/directory structures and commit this patch.
  • “Reboot” of the CAPI boilerplate – Alex wrote this last night as an example and will be following up with his own RFC. Presented here to give an idea of naming/structure.

Thanks!

Same here; I would like some quick feedback on the naming scheme and directory layout. The guiding principle for the C API is to make it sufficiently powerful to be used standalone or to support other bindings, and not be Python-specific. It can be demand-driven by what the Python bindings expose, but I would love to have a second active client (Rust? Swift?) so that we can see the trade-offs in practice. A more detailed RFC to follow.

@drom had expressed some interest in Node.js. Also @jingpu had previously +1’d the utility of having these.

Thanks for the RFC! It is great to see work in this area. Hopefully, I can contribute to some of it.

Also @jingpu had previously +1’d the utility of having these.

Yeah. I would probably like to use the C API standalone, so I might not be able to provide feedback on the utility for building other language bindings.

After the boiler-plate patch is committed, I think I can contribute to the parsing, querying, and traversal parts; I have completed some of them in my previous work.

I’d love to play with Swift bindings built on top of a good C API for MLIR core data structure traversal and creation routines!

https://reviews.llvm.org/D83279 is ready to land, and I’ll be doing so unless there are objections/further comments (feel free to comment)

I’m generally favorable, but I would like to inquire about the maintenance of these bindings and whether we intend to have some form of “Owners” for language-specific bindings. If I make a C++ change that breaks the Python API, what is expected of me? What happens if/when I can’t feasibly understand how to fix it? For Python this likely wouldn’t happen very much, given that I feel most people are at least somewhat familiar with it, but for other, newer languages such as Swift or Rust I feel that these can become real challenges. Can we set the record straight during this RFC on what the expected level of support for these is from normal contributors, and how we can make sure that the bindings are relatively well maintained?

Thanks for the RFC, the rest of it looks great to me.
– River

Thanks, River for speaking up.

Perhaps we can separate this into short and long term. In the short term, the feature is off by default and I wouldn’t place any maintenance burden on anyone outside of the people working on them. I think that changing that status will require a further review, and I would expect that it minimally means that we’ve gotten to the milestone that the python bindings wrap the C-API and aren’t pulling in random parts of the C++ code base. As such, it is my hope that the C-API is the main concentration point for joint change maintenance. There’s no getting around that having such things adds complexity to API changes – and I think we need to look at it and decide we have the owners coverage and structure in place before enabling the feature. I’ve got enough of a motivating interest here that I’ll act as code owner through these initial phases or until someone else steps up. Is it enough to say we’ll have another RFC and review before enabling by default or having any expectation of maintenance from parties not working on it directly?

As you say, python is something of a known quantity (it’s been in tree before and has a pretty high level of “ambient” awareness). Over time, I think we need to evaluate the others case by case based on community need and expectation of stability/overhead. It would be really useful, for example, to see just what a swift wrapping of the C-API looks like before deciding. Developing it to the level that we can inspect it seems like it could be done in a fork or separate repo to get started? In my opinion, the path here should be incremental: having two language bindings in development against the C-API can offer good design balances, but we want to have some confidence before scaling.

Wdyt?

What you’ve said sounds good to me. I just wanted to make sure that this aspect is being seriously considered before moving forward.

– River

I see it similarly to how we handle a change in LLVM breaking a particular backend, or a change not compiling on Windows when you don’t have access to it: the community should have enough “owners” for a particular language (in the case of bindings) or platform (Windows) to provide support and collaborate on unblocking progress in such cases.
If we end up with bindings that do not have a set of owners, they become a candidate for removal (like a backend in LLVM that no longer has a community).
WDYT?

SGTM wrt thinking of these in the way we do for backends.

As discussed on the ODM, as part of this work I would look to promote these two messages into documentation of some form: integration and style principles here and testing principles here.

The initial boiler-plate has been submitted.

Please see a doc distilling design and style: https://reviews.llvm.org/D83527

I’ve distilled that from various threads and my experience having built this pragmatically a couple of times now. Feel free to debate any controversial parts.

Thanks for your initial boiler-plate and the doc; it helps me a lot. I am trying to add my previous bindings to the boiler-plate according to your doc.

I’d like to confirm my understanding of the submodules section. As you said in the doc, sub-packages should be defined to make the Python bindings easier to understand, such as mlir.ir, mlir.passes, etc. Does this mean we need to define a series of header files (like IRBindings.h, PassBindings.h, etc.) to implement those bindings separately?

Take IR as an example: in MainModule.cpp, I call the IRBindings(m) function in PYBIND11_MODULE to register the ir submodule into the _mlir module:

PYBIND11_MODULE(_mlir, m) {
  m.doc() = "MLIR Python Native Extension";

  m.def("get_test_value", []() {
    // This is just calling a method on the MLIRContext as a smoketest
    // for linkage.
    MLIRContext context;
    return std::make_tuple(std::string("From the native module"),
                           context.isMultithreadingEnabled());
  });

  IRBindings(m);

  // ......
}

IRBindings(py::module) is defined in IRBindings.h; it defines an ir submodule and the corresponding classes:

void IRBindings(py::module m) {
  // Submodule of mlir - ir
  auto ir_m = m.def_submodule("ir", "MLIR IR Bindings");
  // Operation bindings into ir module
  py::class_<Operation, std::unique_ptr<Operation, py::nodelete>>(
    ir_m, "Operation", "MLIR Operation")
    .def("getName", &getOperationName);
    // ......
}

Does my example meet the requirements in the doc? Also, I’m not sure about global initialization; it seems that it should have its own module. In addition to registerAllDialects, what other parts belong to global initialization? And for registerAllDialects, should we use mlir.registerAllDialects() to initialize dialects, or should we wrap it together with other global initialization in a single initialization function?

What I would do is create one BindingModules.h header file that defines function prototypes for all of the modules packaged in this repo (no need for one per module):

/// Populates bindings for the mlir.ir submodule.
void populateIRSubmodule(py::module m);

In past projects I’ve gotten into the habit of the outer module (or the PYBIND11_MODULE constructor) actually calling auto irModule = m.def_submodule("ir", "...");. Then pass irModule to populateIRSubmodule. Totally a judgment call, but I’ve found that doing it that way composes a little bit better to weird situations.

You can start by keeping BindingModules.h in the lib directory (private header) and we can promote it to a public header (in the include tree) later, further down the road.

Also, do note that we intend to base these on the C API, not the C++ API. However, since the C API doesn’t exist yet, I am open to prototyping a couple of these against the C++ classes as you have it. That will let us get the overall structure and tests in place a little early instead of doing it all in lockstep. Just be aware that we will need to rework it in tandem with building out the C API, so don’t get too far down a path we will need to change.

The reason I pass the outer module is that I found IREE’s Python bindings pass the outer module directly and then call def_submodule in SetupCommonCompilerBindings:

PYBIND11_MODULE(binding, m) {
  m.doc() = "IREE Compiler Interface";
  SetupCommonCompilerBindings(m);
}

I’m curious about the weird situations you mentioned. What’s the difference between passing the outer module and passing the submodule?

I get this; looking forward to the C API.

Good question: this actually demonstrates my point but it’s a bit twisty due to the way we split the extension modules into multiple .so files in IREE.

In IREE, that is building the module pyiree.compiler (also pyiree.tf.compiler and pyiree.xla.compiler). Here we chose to have a different .so file for each leaf package – not one for the top level (pyiree). We did this because each one has extremely different dependencies and we wanted to be able to distribute them separately. It originally was just one mondo module, and we split it up this way because the deps and size were unmanageable otherwise.

The equivalent for MLIR might be if we had a different .so file for each of _mlir.ir, _mlir.passes etc. Now those are simple packages and we’d probably never split them that way. But there might be a case for something like, say, the cuda runner, if we ever wanted to bind that.

If you split it up like I suggest, then moving these boundaries is trivial: having functions that take a module to populate, instead of having them build the module, means you can call them from anywhere to populate differently named modules at different levels from where they were placed initially.
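The composability point above can be illustrated with a small pure-Python analogue, where `types.ModuleType` stands in for a pybind11 module object and `populate_ir_submodule` is a hypothetical name mirroring the prototype discussed earlier:

```python
import types

def populate_ir_submodule(m):
    # Attach (toy) bindings to whatever module object the caller provides;
    # this function does not care where that module lives in a package tree.
    m.parse = lambda asm: f"parsed: {asm}"

# The caller decides the name and nesting, mirroring m.def_submodule("ir", ...)
top = types.ModuleType("_mlir")
top.ir = types.ModuleType("_mlir.ir")
populate_ir_submodule(top.ir)

# The same populate function can later target a differently named module,
# e.g. if the bindings move behind a separate .so / package boundary.
standalone = types.ModuleType("_mlir_ir_standalone")
populate_ir_submodule(standalone)
```

Because the populate function never constructs its own module, relocating the `ir` bindings is a one-line change at the call site rather than a refactor of the bindings themselves.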

Also, there are some inconsistencies in IREE’s Python bindings that I’m looking to not carry forward. Like I said: they were originally a mondo extension and there are still some remnants of that which were not fully refactored.

Doc is online: MLIR Python Bindings - MLIR; thanks @stellaraccident!