I’ll be giving a talk on maximizing the utility of TableGen in MLIR at this year’s LLVM US dev meeting. The premise of the talk is to explore, via concrete examples, the following broad benefits of writing richer TableGen descriptions for your ops, types, attributes, etc.:
Clearer compiler domain and feature-set.
More apparent and robust compiler behavior.
Reduced mental overhead on compiler developers.
Lower barrier to entry for new contributors.
Better developer tools.
While I have specific ideas in mind, I wanted to survey what people might find interesting under this topic. So, how do YOU think we can better utilize TableGen in MLIR?
P.S. - I’m aware that TableGen is a tool that the LLVM community has a lot of strong opinions on. I’m not looking to discuss whether or how TableGen is being abused beyond what it was initially designed to do. The reason is that the general premise of my talk, and the specific arguments it makes, would apply to any IDL that might be used in place of TableGen in MLIR.
Defining pipelines (VERY in-progress draft: “Add pipeline definitions to mlir-tblgen”, Pull Request #9 in j2kun/llvm-project on GitHub). Besides defining pipelines as a sequence of passes (or sub-pipelines), it could propagate options from the pipeline to each pass (which is a lot of boilerplate today) and move us toward a better ideal of exposing pipelines with more coherent before/after properties than is easy to do with individual passes. One-shot-bufferize seems like a good first candidate here, since currently one has to do it manually from this diagram (which results in code like this in every out-of-tree project that is probably wrong or stale because I don’t actually know much about the details of bufferization).
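To make the option-propagation idea concrete, here is a minimal Python sketch of what a declarative pipeline description could capture. It is not the draft PR's design and not an mlir-tblgen API: `PassSpec`, `PipelineSpec`, and the pass and option names (`example-bufferize`, `example-cleanup`, `function-boundaries`) are all hypothetical. The only thing borrowed from MLIR is the textual pipeline syntax `anchor-op(pass{opt=value})`.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class PassSpec:
    """One pass in a pipeline, plus the pipeline options it consumes."""
    name: str
    # Maps a pass option name to the pipeline option that feeds it.
    forwarded: dict[str, str] = field(default_factory=dict)


@dataclass
class PipelineSpec:
    """A declarative pipeline: its own options and an ordered list of passes."""
    name: str
    options: dict[str, str]  # pipeline option name -> default value
    passes: list[PassSpec]

    def render(self, overrides: dict[str, str] | None = None) -> str:
        """Builds a textual pass pipeline, forwarding options to each pass."""
        overrides = overrides or {}
        unknown = set(overrides) - set(self.options)
        if unknown:
            raise ValueError(f"unknown pipeline options: {sorted(unknown)}")
        values = {**self.options, **overrides}
        parts = []
        for p in self.passes:
            opts = " ".join(f"{k}={values[v]}" for k, v in p.forwarded.items())
            parts.append(f"{p.name}{{{opts}}}" if opts else p.name)
        return f"builtin.module({','.join(parts)})"


# Hypothetical pipeline: one option declared once, forwarded to one pass.
pipeline = PipelineSpec(
    name="example-bufferize-pipeline",
    options={"function-boundaries": "false"},
    passes=[
        PassSpec("example-bufferize",
                 {"bufferize-function-boundaries": "function-boundaries"}),
        PassSpec("example-cleanup"),
    ],
)

print(pipeline.render({"function-boundaries": "true"}))
# builtin.module(example-bufferize{bufferize-function-boundaries=true},example-cleanup)
```

The point of the sketch is that the forwarding table is data, so a generator could emit the glue code (and reject unknown options) instead of every project writing it by hand.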
To double-check: is this limited to ODS, or does it cover features of TableGen proper too? (E.g., I would like namespaces and inline defs, both of which would improve things on the ODS side, but are primarily language features.)
@j2kun Thank you for sharing your ideas! My line of thinking so far has been about doing more with what we’ve got, but I like how you’re thinking about enrichment via new features.
I especially like the idea of moving pipeline definitions to TableGen because, as you mentioned, there’s tons of boilerplate involved and the resulting code isn’t particularly readable. Additionally, I think it might be at least somewhat valuable to have such things defined in a structured format if one wants to create tools/systems (LSP servers, formal verifiers, etc.) for the compiler.
As an aside, would you like any help with the TableGen pipelines thing?
I’m thinking certain kinds of records can already be defined with namespaces (e.g., dialects) and inline defs (e.g., interfaces, ops, etc., which can have inline method definitions). Sounds like that might not be what you’re alluding to though?
I’ll be glad to send you my draft when it’s in a decent state. I only just started studying mlir-tblgen so I’m still figuring out how to structure it well.
It’s about namespaces in the descriptions, not in what’s generated from them. Namespaces are purely textual in TableGen. E.g., Shape_BroadcastOp is “textually namespaced”, but that name is a unique identifier across all of the included TableGen files. So if multiple folks are using the same concept, they have to manually prefix it with a unique identifier. Many do, but not all. I have considered a lint mode for ODS which would flag a def that doesn’t have a prefix. It’s a bit crude though. Now, many folks keep these all separate, but if one is doing declarative rewrites or wants to create external bindings with unique names, then these unique names are very useful. But it’s all manual.
Similarly, here I mean defs on the TableGen side, not what’s generated. For some of these (I had an example a while ago, but not on hand) you want a def (in the TableGen sense), but it has to appear before the other def, which effectively ends up just expanding the scope of the definition and occupying space in the global namespace.
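As a rough illustration of the crude lint mode mentioned above, here is a Python sketch that scans TableGen text for named `def` records and reports the ones missing an expected prefix. This is an assumption-laden stand-in, not an ODS feature: `unprefixed_defs` is a hypothetical helper, it works on raw text with a regex rather than on parsed records, and it ignores comments, `defm`, and `multiclass` expansion.

```python
from __future__ import annotations

import re

# Matches named `def Name` records at the start of a line. `defm` does not
# match because whitespace is required right after `def`.
DEF_RE = re.compile(r"^\s*def\s+([A-Za-z_][A-Za-z0-9_]*)", re.MULTILINE)


def unprefixed_defs(td_source: str, prefix: str) -> list[str]:
    """Returns def names in `td_source` that do not start with `prefix`."""
    return [name for name in DEF_RE.findall(td_source)
            if not name.startswith(prefix)]


# Illustrative input only; the records are not meant to be a real dialect.
example_td = """
def Shape_Dialect : Dialect { let name = "shape"; }
def Shape_BroadcastOp : Shape_Op<"broadcast"> {}
def BroadcastOp : Shape_Op<"broadcast2"> {}
"""

print(unprefixed_defs(example_td, "Shape_"))
# ['BroadcastOp']
```

A real lint would presumably run inside mlir-tblgen on the parsed records, where the owning dialect is known and the expected prefix would not have to be passed in by hand.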
One that we have discussed here: a boilerplate generator. In my mind it almost looks like the old DOS text wizards. Along with that, a couple of template rules for CMake/Bazel and structure conventions would go a long way.
This is something I’ve been discussing with another group in one small slice: error messages. Being able to generate better ones while also aiming to be more declarative (perhaps with some more established helper functions, and teaching ODS about some of them so that it can generate better errors).
I think a “standard declaration” library (parameterized) could help with feature discovery, error messages and onboarding.
One thing I’d like would be to be able to write the MLIR examples in op descriptions so that they generate tests that we run, so that they don’t bitrot and the docs stay up to date.
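A minimal Python sketch of that idea, under some loud assumptions: the op description is available as plain text, examples live in fenced blocks tagged `mlir`, and each example is a self-contained snippet that parses on its own. `extract_examples` and `make_test_file` are hypothetical helpers, not existing tooling. The generated file relies on mlir-opt's `-split-input-file` and `-verify-diagnostics` flags and the `// -----` chunk separator.

```python
from __future__ import annotations

import re

FENCE = "`" * 3  # Built indirectly so this listing can itself sit in a fence.
MLIR_BLOCK_RE = re.compile(FENCE + r"mlir\n(.*?)" + FENCE, re.DOTALL)


def extract_examples(description: str) -> list[str]:
    """Returns the body of every mlir-tagged fenced block in a description."""
    return [body.strip() for body in MLIR_BLOCK_RE.findall(description)]


def make_test_file(examples: list[str]) -> str:
    """Joins examples into one test file, one split-input chunk per example."""
    run_line = "// RUN: mlir-opt %s -split-input-file -verify-diagnostics\n"
    return run_line + "\n" + "\n\n// -----\n\n".join(examples) + "\n"


# Illustrative description text, as it might appear in an op's documentation.
description = "\n".join([
    "Adds two integers.",
    "",
    FENCE + "mlir",
    "func.func @add(%a: i32, %b: i32) -> i32 {",
    "  %sum = arith.addi %a, %b : i32",
    "  return %sum : i32",
    "}",
    FENCE,
])

print(make_test_file(extract_examples(description)))
```

In practice many doc examples are fragments that use undefined SSA values, so a real version would need some convention for wrapping them (or for marking which examples are testable) before they could run as tests.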