I would like to move the Python sources currently under mlir/lib/Bindings/Python to a new mlir/python directory. I’ve been meaning to simplify this directory tree for a while and now that we have real users, I wanted to ask before just sending a patch. I’d like to make some more comprehensive enhancements to docs, packaging, etc., and I would like to do that in the context of a directory structure we intend to keep for a while.
We would still keep the native libraries in the mlir/include and mlir/lib trees, as with other C/C++ library code, and we would continue to just build the libraries to the right location in the project-wide python directory in the build tree. Build rules and artifacts for generating Python code (e.g. for dialects, transforms, etc.) would move to the new location. Python tests would move to a new test/python directory tree and be organized by Python package.
Rationale: The mlir/lib directory is really for C/C++ code. When we started the project, the only pure Python code we had was an __init__.py to set things up and we just put that there as well. Since then, the pure Python code has become its own center of mass, providing public APIs and tools for the project, and it is obnoxious to have it so deeply nested in a C/C++ oriented tree. All of these sources are already copied/symlinked to the python directory in the build and install trees, so this mostly just aligns the source tree with them.
As a project that also has a fair amount of Python code, NPComp lays things out this way, and removing the three levels of nesting and presenting a more canonical Python source tree definitely helps the ergonomics of working on it.
The rationale makes sense to me, and the fact that this is how npcomp has been doing it seems like a good proof of concept. Curious what others think. If mlir makes this move, I’d be happy to make the same update in circt to keep us on the same page.
I thought we started it this way loosely modeled on how LLVM does it, with a common “bindings” directory. But LLVM has very minimal bindings infrastructure compared to MLIR, and it may not be the best example of how to set it up.
What will the inner layout of python/ be: all .cpp/.h files together, or separate python/include/ and python/lib/ directories? See “Python bindings for out-of-tree projects” for more context.
Here’s the first draft of what I had in mind. In this iteration, the mlir/python/ directory contains Python sources and generated Python files. The mlir/include and mlir/lib trees continue to hold the native extensions.
There are multiple ways to look at it, but the one I chose is language-centric: Python sources go in mlir/python/, while C/C++ includes and libraries (for building extensions or for inbound include deps on headers) continue to go where they always have: under mlir/lib/Bindings/Python, mlir/include/mlir/Bindings/Python (per the referenced thread), and mlir/include/mlir-c/Bindings/Python for pure C-API interop with the Python extensions (i.e. interop with C-API types without a shared dep on pybind, et al.).
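To make the language-centric split concrete, here is a rough sketch of the rule in Python. It is only an illustration, not part of the patch: the helper name `proposed_location` is made up, and the rule is simplified to "pure Python sources under mlir/lib/Bindings/Python move to mlir/python/, everything else stays put". It ignores generated Python files and the build rules, which would also live under mlir/python/.

```python
from pathlib import PurePosixPath

# Directory that currently holds both the Python sources and the native
# extension code (taken from the proposal).
OLD_ROOT = PurePosixPath("mlir/lib/Bindings/Python")
# Proposed home for pure Python sources (taken from the proposal).
NEW_PYTHON_ROOT = PurePosixPath("mlir/python")


def proposed_location(path: str) -> PurePosixPath:
    """Return where a file would live under the language-centric layout.

    Hypothetical helper for illustration only: a .py file under OLD_ROOT is
    re-rooted under NEW_PYTHON_ROOT; any other file (C/C++ sources, headers,
    or Python files elsewhere in the tree) is returned unchanged.
    """
    p = PurePosixPath(path)
    if p.suffix != ".py":
        return p
    try:
        relative = p.relative_to(OLD_ROOT)
    except ValueError:
        # Not under the old bindings directory, so nothing to move.
        return p
    return NEW_PYTHON_ROOT / relative
```

With this rule, an illustrative path such as mlir/lib/Bindings/Python/mlir/__init__.py maps to mlir/python/mlir/__init__.py, while a (hypothetical) mlir/lib/Bindings/Python/Example.cpp and anything under mlir/include/mlir-c/Bindings/Python stay exactly where they are.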
Let me know if the above patch is ok in principle and I’ll finish it off (there is a latent install issue that I need to debug some more).