RFC: Enabling MLIR Python Bindings By Default

Hi folks, I would like to enable the MLIR Python bindings by default. Per the discussion when we started the project, we agreed to checkpoint before doing that.

What would this entail?

I would change the MLIR_BINDINGS_PYTHON_ENABLED CMake flag, currently a boolean that defaults to OFF, to instead default to ON if the following dependencies are met:

  • Python3 components Interpreter, Development, and NumPy.
  • pybind11 >= 2.6
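
For anyone wanting to see what a given interpreter reports for these dependencies, here is a minimal sketch of a preflight check. It is not part of the proposed CMake change, and the helper names (parse_version, meets_minimum, check_module, has_python_headers) are made up for illustration. It only reflects what the running Python interpreter sees, which may differ from what CMake detects (for example, a pybind11 installed as a system CMake package rather than a Python package would not show up here).

```python
import importlib.util
import os
import sysconfig
from importlib import metadata


def parse_version(text):
    """Turn '2.10.4' into (2, 10, 4); stop at the first non-numeric piece."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def meets_minimum(found, minimum):
    """Compare dotted versions numerically, so 2.10 counts as newer than 2.6."""
    return parse_version(found) >= parse_version(minimum)


def check_module(name, minimum=None):
    """Return (ok, detail) for one top-level module (hypothetical helper)."""
    if importlib.util.find_spec(name) is None:
        return False, "not found"
    if minimum is None:
        return True, "found"
    try:
        found = metadata.version(name)
    except metadata.PackageNotFoundError:
        return False, "found, but version unknown"
    return meets_minimum(found, minimum), f"version {found}"


def has_python_headers():
    """Rough stand-in for the Development component: is Python.h present?"""
    include_dir = sysconfig.get_paths()["include"]
    return os.path.exists(os.path.join(include_dir, "Python.h"))


if __name__ == "__main__":
    print(f"Python.h: {'ok' if has_python_headers() else 'MISSING'}")
    for name, minimum in (("numpy", None), ("pybind11", "2.6")):
        ok, detail = check_module(name, minimum)
        print(f"{name}: {'ok' if ok else 'MISSING'} ({detail})")
```

Running this with the same interpreter that CMake is expected to pick up gives a quick hint about which dependency is likely to keep the default from flipping to ON.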

Incidental to this, I would also make the following upgrades:

  • Remove the MLIR_PYTHON_BINDINGS_VERSION_LOCKED option and switch to more standard pybind11 build support.

Why do this now?

The Python bindings have been stable for some time with respect to the underlying C-API and have not required out-of-band patches in recent months. Downstream projects are beginning to depend on them for IR manipulation and running passes (npcomp has a hard dependency, and others like MHLO and IREE may start soon as part of their higher-level Python tooling).

Further, we foresee a near-term need for in-tree dev tools that use the API (e.g. for various LinAlg representations), and it would be best if these depended on a feature that is enabled by default.

Risks?

Some of the CMake machinery for auto-detecting Python environments is bug-laden when considering all platforms, and there is a risk that a system with a valid Python3 installation is not detected as such without additional flags. We have a fair amount of experience with this and will add CMake messages with suggested actions if these wedged states are detected (and disable building of the Python API). This may require some iteration, since some failures will only surface in the wild for users who have odd setups.

Can you clarify the motivation for this?

In general, I have always considered that these should be behind a feature flag and never be enabled by default: I’m fairly annoyed by build features that change based on what CMake detects “magically”.