Survey: Interested in discussing richer TableGen descriptions in MLIR?

jkshtj · August 16, 2025, 5:44pm

Hi all,

I’ll be giving a talk on maximizing the utility of TableGen in MLIR at this year’s US LLVM Developers’ Meeting. The premise of the talk is to explore, via concrete examples, the following broad benefits of writing richer TableGen descriptions for your ops, types, attributes, etc.:

Clearer compiler domain and feature-set.
More apparent and robust compiler behavior.
Reduced mental overhead on compiler developers.
Lower barrier to entry for new contributors.
Better developer tools.

While I had specific ideas in mind, I wanted to take a survey of things people might find interesting, under this topic. So, how do YOU think we can better utilize TableGen in MLIR?

P.S. I’m aware that TableGen is a tool that the LLVM community has a lot of strong opinions on. I’m not looking to discuss whether or how TableGen is being abused beyond what it was initially designed to do. The reason is that the general premise of my talk, and the specific arguments it makes, would apply to any IDL that might be used in place of TableGen in MLIR.

j2kun · August 16, 2025, 7:59pm

I have two ideas, which I’ve worked on in an out-of-tree project:

Having descriptions for passes include references to lit tests as examples, which then get rendered at doc generation time by actually running them with a given pass and displaying the before/after (Add script to generate documentation examples from lit tests by j2kun · Pull Request #1987 · google/heir · GitHub)
Defining pipelines (VERY in-progress draft: Add pipeline definitions to mlir-tblgen by j2kun · Pull Request #9 · j2kun/llvm-project · GitHub). Besides defining pipelines as a sequence of passes (or sub-pipelines), it could propagate options from the pipeline to each pass (which is a lot of boilerplate today) and move us toward a better ideal of exposing pipelines with more coherent before/after properties than is easy to do with individual passes. one-shot-bufferize seems like a good first candidate here, since currently one has to assemble it manually from this diagram (which results in code like this in every out-of-tree project; mine is probably wrong or stale because I don’t actually know much about the details of bufferization).
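
To make the option-propagation idea concrete, here is a minimal sketch written in Python rather than TableGen. Everything in it is hypothetical: `PassSpec`, `PipelineSpec`, and the pass and option names in the example only illustrate the shape of a declarative pipeline, and none of it reflects the draft PR's design or any mlir-tblgen API. The output mimics MLIR's textual pass-pipeline syntax.

```python
from dataclasses import dataclass, field


@dataclass
class PassSpec:
    """A pass name plus the pipeline-level options it accepts (hypothetical)."""
    name: str
    accepts: tuple = ()


@dataclass
class PipelineSpec:
    """An ordered list of passes or nested pipelines (hypothetical)."""
    name: str
    stages: list = field(default_factory=list)

    def render(self, options):
        """Flatten to a textual pipeline, forwarding each option only to
        the passes that declare they accept it."""
        parts = []
        for stage in self.stages:
            if isinstance(stage, PipelineSpec):
                nested = stage.render(options)
                if nested:
                    parts.append(nested)
                continue
            forwarded = [f"{key}={options[key]}"
                         for key in stage.accepts if key in options]
            suffix = "{" + " ".join(forwarded) + "}" if forwarded else ""
            parts.append(stage.name + suffix)
        return ",".join(parts)


# Hypothetical pipeline: one configurable pass followed by a cleanup sub-pipeline.
cleanup = PipelineSpec("cleanup", [PassSpec("canonicalize"), PassSpec("cse")])
pipeline = PipelineSpec("example-pipeline", [
    PassSpec("example-pass", accepts=("example-option",)),
    cleanup,
])
print(pipeline.render({"example-option": "true"}))
# prints "example-pass{example-option=true},canonicalize,cse"
```

The point of the sketch is that the pipeline author declares once which options each pass accepts, and the forwarding code is derived from that declaration instead of being written by hand in every project.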

jpienaar · August 17, 2025, 5:37am

To double check, is this limited to ODS, or does it cover TableGen proper features too? (E.g., I would like namespaces and inline defs, both of which would improve things on the ODS side but are primarily language features.)

jkshtj · August 18, 2025, 12:33am

@j2kun Thank you for sharing your ideas! My line of thinking so far has been about doing more with what we’ve got, but I like how you’re thinking about enrichment via new features.

I especially like the idea of moving pipeline definitions to TableGen because, as you mentioned, there’s tons of boilerplate involved and the resulting code isn’t particularly readable. Additionally, I think it might be at least somewhat valuable to have such things defined in a structured format if one wants to create tools/systems (LSP servers, formal verifiers, etc.) for the compiler.

As an aside, would you like any help with the TableGen pipelines work?

jkshtj · August 18, 2025, 12:43am

Could you elaborate a bit further?

I’m thinking certain kinds of records can already be defined with namespaces (e.g., dialects) and inline defs (e.g., interfaces, ops, etc., which can have inline method definitions). Sounds like that might not be what you’re alluding to, though?

j2kun · August 18, 2025, 3:16am

I’ll be glad to send you my draft when it’s in a decent state. I only just started studying mlir-tblgen so I’m still figuring out how to structure it well.

jpienaar · August 20, 2025, 6:42am

It’s about what’s in the descriptions, not what’s generated from them. Namespaces are purely textual in TableGen. E.g., Shape_BroadcastOp is “textually namespaced”, but that name is a unique identifier across all of the TableGen files included. So if multiple folks are using the same concept, they have to manually prefix it with a unique identifier. Many do, but not all. I have considered a lint mode for ODS which would flag a missing prefix, though it’s a bit crude. Many folks keep these all separate, but if one is doing declarative rewrites or wants to create unique external bindings, then these unique names are very useful. But it’s all manual.
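
As a rough illustration of how crude such a prefix lint could be, here is a sketch in Python. It is not an ODS feature: `find_unprefixed_defs` is a hypothetical helper, it scans source text with a naive regular expression instead of using TableGen's parser, and it ignores comments, `defm`, multiclasses, and anonymous defs. `UnprefixedOp` in the example is made up for illustration.

```python
import re

# Matches `def Name` at the start of a line. Anonymous defs (`def : ...`)
# do not match because an identifier is required after `def`.
DEF_PATTERN = re.compile(r"^[ \t]*def[ \t]+([A-Za-z_][A-Za-z0-9_]*)",
                         re.MULTILINE)


def find_unprefixed_defs(td_source, prefix):
    """Return the def names in td_source that do not start with prefix."""
    return [name for name in DEF_PATTERN.findall(td_source)
            if not name.startswith(prefix)]


# Hypothetical input: one prefixed def, one unprefixed def, one anonymous def.
source = """
def Shape_BroadcastOp : Shape_Op<"broadcast"> {}
def UnprefixedOp : Shape_Op<"unprefixed"> {}
def : Pat<(Shape_BroadcastOp $a), (UnprefixedOp $a)>;
"""
print(find_unprefixed_defs(source, "Shape_"))
# prints ['UnprefixedOp']
```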

Similarly, here I mean defs on the TableGen side, not what’s generated. For some of these (I had an example a while ago, but not on hand) you want a def (in the TableGen sense), but it has to happen before the other def, which effectively just expands the scope of the definition and occupies space in the global namespace.

One that we have discussed here: a boilerplate generator. In my mind it almost looks like the old DOS text wizards. Along with that, a couple of template rules for CMake/Bazel and structure conventions would go a long way.

This is something I’ve been discussing with another group in one small slice: error messages. Being able to generate better ones while also aiming to be more declarative (perhaps some more established helper functions, and teaching ODS about some of these so that it can generate better errors).

I think a “standard declaration” library (parameterized) could help with feature discovery, error messages and onboarding.

mehdi_amini · August 20, 2025, 8:59am

One thing I’d like is to be able to write the MLIR examples in op descriptions so that they generate tests that we run; that way they don’t bitrot and the docs stay up to date.
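
A minimal sketch of that idea, in Python: pull the `mlir`-fenced examples out of a description string and emit them as a single lit test. The helper names are hypothetical and this is not an existing mlir-tblgen backend. It also assumes each example is a self-contained module that parses on its own, which many documentation snippets are not.

```python
import re
import textwrap

FENCE = "`" * 3  # built indirectly so this listing can itself sit inside a fence
EXAMPLE_PATTERN = re.compile(FENCE + r"mlir\n(.*?)" + FENCE, re.DOTALL)


def extract_examples(description):
    """Return the bodies of all mlir-fenced examples in a description."""
    return [textwrap.dedent(body).strip()
            for body in EXAMPLE_PATTERN.findall(description)]


def make_lit_test(description):
    """Join the examples into one lit test, separated by split markers.

    The RUN line only checks that each example parses and verifies;
    it does not compare output against CHECK lines.
    """
    examples = extract_examples(description)
    if not examples:
        return ""
    header = "// RUN: mlir-opt %s -split-input-file -verify-diagnostics\n\n"
    return header + "\n\n// -----\n\n".join(examples) + "\n"
```

Because the generated file is derived from the description, an example that stops parsing would fail in the test suite instead of silently going stale in the docs.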

j2kun · August 21, 2025, 1:51pm

Maybe something like what I did in Add script to generate documentation examples from lit tests by j2kun · Pull Request #1987 · google/heir · GitHub? It doesn’t generate the tests exactly, but it includes the test in the docs.

jkshtj · September 11, 2025, 9:01am

@mehdi_amini @j2kun I liked this idea, so I took a shot at a prototype. I’ve put up a pull request for it, if you could take a look.

github.com/llvm/llvm-project

[mlir][tblgen] Adds support for embedded LIT tests in TableGen records

main ← jkshtj:jkshtj/tblgen-embedded-lit-tests

opened 08:59AM - 11 Sep 25 UTC

jkshtj

+379 -1

Introduces a new Testable base class that allows TableGen records (starting with… Pass records) to embed LIT test definitions directly within their definitions. This enables co-locating tests with pass definitions for better maintainability.

Key components:
- Testable.td: Base class for records that can have embedded tests
- LitTestGen.cpp: TableGen backend to extract and generate LIT test files
- AddMLIR.cmake: CMake function to process embedded tests with usage examples
- PassBase.td: Updated Pass class to extend Testable

Usage example in CMake:

```
add_embedded_lit_tests(
  MyPassesEmbeddedTests
  ${CMAKE_CURRENT_SOURCE_DIR}/include/MyPasses.td
  ${CMAKE_CURRENT_SOURCE_DIR}/test/Passes/
)

# Add LIT test generation target as a dependency to some other target
add_library(someLib DEPENDS MyPassesEmbeddedTests)
```
