[RFC] tblgen-to-irdl tool

This RFC proposes the addition of the tblgen-to-irdl tool to MLIR.

Motivation

ODS today lets us generate the necessary C++ boilerplate for dialects. However, operation and attribute verifiers defined in TableGen are not easy to introspect, so tools that could use these definitions cannot get at them easily. Making the IR definitions more introspectable could enable tools that operate on IR definitions, like fuzzers, reducers, mutators, etc., which could use these definitions to improve their strategies.

IRDL, as a dialect definition language, is structured and supports defining operations declaratively as well as through C++ constraints. Having an IR definition in IRDL form allows tools to introspect IR easily.

For example, let’s say we want a tool to know which operations produce an SSA value of type i32 from an SSA value of type i64. Today, we would have to either hardcode this or teach our tool to look into TableGen definitions. With an IRDL definition file, we have an easy, introspectable way of looking into the dialect definition and finding which operations can do this. We can iterate over the IRDL operation definitions and check the irdl.operands and irdl.results IR to find out which operation best suits our needs.
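To make the lookup above concrete, here is a minimal sketch in Python. It does not use any real IRDL or MLIR API: `OpDef` is a hypothetical, flattened stand-in for one operation definition (its operand and result constraints reduced to plain type names), and the `toy.*` operations are made-up data, not output of the tool.

```python
from dataclasses import dataclass, field


@dataclass
class OpDef:
    """Hypothetical stand-in for one operation definition.

    operands/results hold plain type names, standing in for what a tool
    would read out of irdl.operands and irdl.results.
    """
    name: str
    operands: list = field(default_factory=list)
    results: list = field(default_factory=list)


def ops_converting(defs, src, dst):
    """Return names of ops that accept a `src` operand and produce a `dst` result."""
    return [d.name for d in defs if src in d.operands and dst in d.results]


# Made-up definitions, for illustration only.
DEFS = [
    OpDef("toy.trunc", ["i64"], ["i32"]),
    OpDef("toy.ext", ["i32"], ["i64"]),
    OpDef("toy.add", ["i64", "i64"], ["i64"]),
]

print(ops_converting(DEFS, "i64", "i32"))
```

A real tool would walk the parsed IRDL module rather than a Python list, and would have to handle constraints that are not a single concrete type.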

Overview

With the addition of the tblgen-to-irdl tool, users will now have the ability to generate IRDL files directly from TableGen files and use them for IR introspection.

The tool is implemented in the same way as mlir-tblgen, but instead of generating C++, it generates IRDL definitions, complementing the existing TableGen flow.

Patch

We have prepared a patch for review that starts with a basic conversion from TableGen operation definitions to IRDL, and would be glad to receive feedback on it: "[mlir][tools] Introduce tblgen-to-irdl tool" by Groverkss, Pull Request #66865, llvm/llvm-project on GitHub.

IRDL definitions of all upstream MLIR dialects, generated using this tool, can be found here: affine.irdl · GitHub.

Co-Author: @math-fehr


This is an important step towards using IRDL for more use cases, thanks for this work!

Not all dialect operations are defined in TableGen. For example, Linalg has a lot in xDSL, too.

Thanks for the proposal

I am very much looking forward to an mlir-reduce tool that understands custom and upstream dialects. This patch is an important step in this direction!

IIRC Op(?)DSL generates tablegen files under the hood. If we are lucky, these should be convertible to IRDL as well.

Ah, indeed, into build/tools/mlir/include/mlir/Dialect/. I see three dialects in my build: OpenMP, AccCommon and Linalg.

@Groverkss could you try and generate those files as well? Just curious, not blocking.

I think this is useful, although it reminds me more of a transpiler, in the sense that the output is probably not something that a person writing IRDL would have written.

Looking at the output, it wouldn’t seem this is straightforward in the form generated by this tool today.

I see this as saying we could either have a tool that uses the ODS definitions (and walks the ODS/TableGen data structures) or one that uses IRDL and the default MLIR methods, and that you find regular MLIR friendlier to work with than the alternative. In both cases you have to teach the tool to understand some parts of this (e.g., in TableGen, check for I32; in IRDL, check for a C++ predicate with ($_self.isSignlessInteger(32))).

I’d say this could even just be a different backend/output format for mlir-tblgen, but haven’t looked at code yet/could be good in own tool.

Having looked at it for some time, it is definitely possible to convert this to a “proper” IRDL definition. One solution is either to teach the tool to understand constructs like ‘Or’, ‘And’, etc., or to write regexes to handle these cases. I previously had a tool that was essentially doing this.
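As a rough illustration of the regex route, here is a small Python sketch that lifts one kind of C++ predicate string into a builtin type name. The function name `lift_predicate` and the choice to return `None` for anything unrecognized are assumptions made for this example; this is not how tblgen-to-irdl is implemented, and it only covers the `isSignlessInteger` pattern quoted in this thread.

```python
import re

# Matches e.g. "$_self.isSignlessInteger(32)", optionally wrapped in
# extra parentheses as TableGen-generated predicates often are.
_SIGNLESS_INT = re.compile(r"^\(*\$_self\.isSignlessInteger\((\d+)\)\)*$")


def lift_predicate(pred):
    """Map a C++ predicate string to a type name like 'i32', or None if unknown."""
    match = _SIGNLESS_INT.match(pred.strip())
    if match:
        return "i" + match.group(1)
    # Unknown predicates stay opaque; a real tool would keep them as C++ constraints.
    return None


print(lift_predicate("($_self.isSignlessInteger(32))"))
```

Composite constraints built from ‘Or’ and ‘And’ would need the predicate tree itself rather than the flattened string, which is where pattern matching starts to hit its limits.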

I think the idea here is that once we teach the tool what ($_self.isSignlessInteger(32)) means, we don’t have to do it again for any other tool that wants to do this kind of introspection. Otherwise, if we have two tools that want to introspect ODS, I’m not sure how we could abstract this kind of information properly without IRDL.

+1. Thanks for the RFC. Having a tool that can convert existing dialects to IRDL will make it very easy for folks to leverage its features!

And +1 to Jacques’s point about teaching the tool more about MLIR methods over time, but I think that regex matching can have its limits. I think the natural thing to do here instead is to lift more concepts out from bare C++ in ODS and give them names, then teach both ODS and tblgen-to-irdl about them.

I generated these dialects separately, since they seem to use different CMake configuration to generate op defs than other dialects: acc.irdl · GitHub


The reason I did not make this a backend for mlir-tblgen is that this would make mlir-tblgen depend on MLIRIR. I’m not sure how this dependency would affect things, since I expect the MLIRIR definitions to be generated by mlir-tblgen, and mlir-tblgen would then start depending on those same definitions.

This is planned :slight_smile: Making TableGen definitions more declarative makes for easier translations to IRDL as well as improved TableGen experience.

This is correct. That piece is also due for a power wash at some point too, so it can be changed if needed to better adapt.

Thank you everyone for your comments and feedback. I just landed this. We will take the feedback provided in this discussion and try to integrate it as we move forward.
