This RFC proposes the addition of the tblgen-to-irdl tool to MLIR.
ODS today lets us generate the necessary C++ boilerplate for dialects. However, operation and attribute verifiers are not easy to introspect: tools that could use these verifier definitions cannot easily read them out of TableGen. Making the IR definitions more introspectable could enable tools that operate on IR definitions, such as fuzzers, reducers, and mutators, to use these definitions to improve their strategies.
IRDL, as a dialect definition language, is structured and supports defining operations declaratively as well as through C++ constraints. Having an IR definition in IRDL form allows tools to introspect IR easily.
For example, let’s say we want a tool to know which operations generate an SSA value of type i32 from an SSA value of type i64. Today, we would have to either hardcode this or teach our tool to look into TableGen definitions. With an IRDL definition file, we have an easy, introspectable way of looking into the dialect definition and finding which operations can do this. We can iterate over IRDL operation definitions and check the irdl.operands and irdl.results IR to find out which operation best suits our needs.
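To make the i64-to-i32 lookup concrete, here is a minimal sketch in Python. It does not use any MLIR or IRDL API: OpDef, is_, any_of, and the toy.* operations are all hypothetical stand-ins for what a tool might extract from irdl.operands and irdl.results after parsing an IRDL file.

```python
from dataclasses import dataclass


# Toy stand-in for parsed IRDL: each constraint is the set of type names it
# accepts. This is an assumption for illustration, not an MLIR data structure.
@dataclass(frozen=True)
class OpDef:
    name: str
    operands: tuple  # one frozenset of accepted type names per operand
    results: tuple   # one frozenset of accepted type names per result


def is_(type_name):
    """Constraint that accepts exactly one type."""
    return frozenset({type_name})


def any_of(*type_names):
    """Constraint that accepts any of the listed types."""
    return frozenset(type_names)


# Hypothetical dialect contents, invented for this example only.
TOY_DIALECT = [
    OpDef("toy.trunc", (is_("i64"),), (is_("i32"),)),
    OpDef("toy.ext", (is_("i32"),), (is_("i64"),)),
    OpDef("toy.cast", (any_of("i32", "i64"),), (any_of("i32", "i64"),)),
    OpDef("toy.add", (is_("i64"), is_("i64")), (is_("i64"),)),
]


def ops_converting(dialect, src, dst):
    """Names of single-operand, single-result ops that accept `src` and can produce `dst`."""
    return [
        op.name
        for op in dialect
        if len(op.operands) == 1
        and len(op.results) == 1
        and src in op.operands[0]
        and dst in op.results[0]
    ]


print(ops_converting(TOY_DIALECT, "i64", "i32"))  # ['toy.trunc', 'toy.cast']
```

The point is only that once constraints are structured data rather than C++ strings, the query becomes a plain filter over operation definitions.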
With the addition of the tblgen-to-irdl tool, users will now have the ability to generate IRDL files directly from TableGen files and use them for IR introspection.
The tool is implemented in the same way as mlir-tblgen, but instead of generating C++, it generates IRDL definitions and complements the existing TableGen flow.
I think this is useful, although it reminds me more of a transpiler, in the sense that the output is probably not something that a person writing IRDL would have written.
Looking at the output, this doesn’t seem straightforward in the form generated by the tool today.
I see this as saying that we could either have a tool that uses the ODS definitions (and walks the ODS/TableGen data structures) or one that uses IRDL and default MLIR methods, where you find regular MLIR friendlier to work with than the alternative. Either way, you have to teach the tool to understand some parts of this (e.g., in TableGen, check for I32; in IRDL, check for a C++ predicate with ($_self.isSignlessInteger(32))).
I’d say this could even just be a different backend/output format for mlir-tblgen, but I haven’t looked at the code yet, and it could also be good as its own tool.
Having looked at it for some time, it is definitely possible to convert this to a “proper” IRDL definition. One solution is to either teach the tool to understand constructs like ‘Or’, ‘And’, etc., or to write regexes to handle these cases. I’ve had a previous tool that was essentially doing this.
I think the idea is that here, once we teach the tool what ($_self.isSignlessInteger(32)) means, we don’t have to do it again for any other tool that wants to do this kind of introspection. Otherwise, if we have two tools that want to introspect ODS, I’m not sure how we could abstract this kind of information properly without IRDL.
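As a rough illustration of the regex approach, here is a small Python sketch that lifts C++ predicate strings into structured constraints. It is not how tblgen-to-irdl is implemented: lift, split_top_level, and the output tuples ("is", "any_of", "all_of", "c_pred") are names made up for this example, and only the isSignlessInteger shape is recognized. Anything else falls back to an opaque C++ predicate.

```python
import re

# The only predicate shape this sketch understands (an assumption; a real
# tool would need a much larger table).
_SIGNLESS_INT = re.compile(r"^\$_self\.isSignlessInteger\((\d+)\)$")


def _balanced(text):
    """True if parentheses in `text` are balanced and never close early."""
    depth = 0
    for ch in text:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                return False
    return depth == 0


def strip_parens(pred):
    """Remove redundant outer parentheses, e.g. '((x))' -> 'x'."""
    pred = pred.strip()
    while pred.startswith("(") and pred.endswith(")") and _balanced(pred[1:-1]):
        pred = pred[1:-1].strip()
    return pred


def split_top_level(pred, sep):
    """Split `pred` on `sep`, ignoring separators nested inside parentheses."""
    parts, depth, start, i = [], 0, 0, 0
    while i < len(pred):
        ch = pred[i]
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        elif depth == 0 and pred.startswith(sep, i):
            parts.append(pred[start:i])
            i += len(sep)
            start = i
            continue
        i += 1
    parts.append(pred[start:])
    return parts


def lift(pred):
    """Turn a C++ predicate string into a structured constraint where possible."""
    pred = strip_parens(pred)
    # '||' binds more loosely than '&&', so try it first.
    for sep, tag in (("||", "any_of"), ("&&", "all_of")):
        parts = split_top_level(pred, sep)
        if len(parts) > 1:
            return (tag, [lift(part) for part in parts])
    match = _SIGNLESS_INT.match(pred)
    if match:
        return ("is", "i" + match.group(1))
    # Unknown shape: keep it as an opaque C++ predicate.
    return ("c_pred", pred)


print(lift("($_self.isSignlessInteger(32))"))  # ('is', 'i32')
print(lift("($_self.isSignlessInteger(32)) || ($_self.isSignlessInteger(64))"))
```

Every tool reading the structured output then benefits from a predicate shape being taught once, while unrecognized predicates stay visible as opaque strings instead of being dropped.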
+1. Thanks for the RFC. Having a tool that can convert existing dialects to IRDL will make it very easy for folks to leverage its features!
And +1 to Jacques’s point about teaching the tool more about MLIR methods over time, but I think that regex matching can have its limits. I think the natural thing to do here instead is to lift more concepts out from bare C++ in ODS and give them names, then teach both ODS and tblgen-to-irdl about them.
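One way to picture lifting concepts out of bare C++ and naming them is the following Python sketch. It is purely illustrative: SignlessInt, AnyOf, and their two emit methods are hypothetical and do not correspond to existing ODS or IRDL classes. The idea is that a named constraint is the single source of truth, and each backend derives its own output from it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SignlessInt:
    """Named constraint: a signless integer type of a fixed bit width."""
    width: int

    def to_cpp_predicate(self):
        # What a C++-generating backend could emit for the verifier.
        return f"$_self.isSignlessInteger({self.width})"

    def to_structured(self):
        # What an IRDL-generating backend could emit instead of opaque C++.
        return ("is", f"i{self.width}")


@dataclass(frozen=True)
class AnyOf:
    """Named constraint: satisfied if any of the nested constraints holds."""
    options: tuple

    def to_cpp_predicate(self):
        return " || ".join(f"({opt.to_cpp_predicate()})" for opt in self.options)

    def to_structured(self):
        return ("any_of", [opt.to_structured() for opt in self.options])


i32_or_i64 = AnyOf((SignlessInt(32), SignlessInt(64)))
print(i32_or_i64.to_cpp_predicate())
print(i32_or_i64.to_structured())
```

With this shape, neither backend has to reverse-engineer the other's output, which is what regex matching over C++ strings ends up doing.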
The reason I did not make this a backend for mlir-tblgen is that this would make mlir-tblgen depend on MLIRIR. I’m not sure how this dependency affects things as I expect the MLIRIR definitions to be generated by mlir-tblgen and mlir-tblgen would start depending on those definitions.
This is planned. Making TableGen definitions more declarative makes for easier translation to IRDL as well as an improved TableGen experience.