[RFC]: a new tutorial: "MLIR for Beginners"

Introduction

Over the last year I’ve gradually become “the MLIR guy” among a group of cryptography researchers, centered on our HEIR project, which is built on MLIR. Many folks in this community have told me they find MLIR difficult to learn, so I wrote a series of tutorials aimed at complete beginners, in the sense that the intended audience doesn’t know MLIR, LLVM, or much about compilers, but does know how to write C++.

The feedback has been very positive, and so I’d like to propose upstreaming it. This RFC is to solicit feedback about the scope, how it should be structured in the monorepo (or not), and what additional topics should be included.

Background

Core emphasis

I wrote my tutorial as a more detailed and incremental version of the toy tutorial, with a heavier focus on the software development lifecycle: how to set up the C++/tablegen boilerplate, how to write lit tests, how to construct pass pipelines and what to do when you hit common errors, and details specific to out-of-tree projects that use MLIR as a dependency (e.g., how to configure lit from scratch).

In this sense, the tutorial is very much aimed at out-of-tree users of MLIR rather than upstream contributors. I spend a lot of time on basic questions like “how do you run an upstream pass?” and “what traits already exist, and what do they do?”
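
To give a flavor of that material: running an upstream pass from C++ is only a few lines. Here is a minimal sketch of my own (not an excerpt from the tutorial; the function name runCanonicalizer is illustrative), the programmatic analogue of mlir-opt --canonicalize:

```cpp
#include "mlir/IR/BuiltinOps.h"
#include "mlir/IR/MLIRContext.h"
#include "mlir/Pass/PassManager.h"
#include "mlir/Transforms/Passes.h"

// Sketch: run the upstream canonicalizer over a module from C++, the
// programmatic analogue of `mlir-opt --canonicalize`. The function name
// runCanonicalizer is illustrative, not code from the tutorial.
mlir::LogicalResult runCanonicalizer(mlir::ModuleOp module) {
  mlir::PassManager pm(module.getContext());
  pm.addPass(mlir::createCanonicalizerPass());
  return pm.run(module);
}
```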

Current tutorial outline

  1. Build System (Getting Started)
    • Brief history of MLIR
    • Bazel build system tutorial
  2. Running and Testing a Lowering
    • Basic MLIR syntax
    • Using the mlir-opt command line tool to run upstream passes
    • lit, FileCheck, and mlir-cpu-runner
  3. Writing Our First Pass
    • Making a custom project-opt binary
    • Writing a trivial pass without tablegen (walk the IR and call mlir::affine::loopUnrollFull on every affine.for op; see the first sketch after this outline)
    • Reimplementing the above using a rewrite pattern
    • Writing a rewrite pattern that uses the greedy engine nontrivially (“unroll” mul ops as iterated add ops; see the second sketch after this outline)
  4. Using Tablegen for Passes
    • Reimplementing the unroll pass from (3) using tablegen
    • Manually inspecting the generated C++
  5. Defining a New Dialect
    • High level discussion of what dialects are for
    • Create an empty polynomial dialect shell in tablegen
    • Defining types and ops in tablegen
    • Custom assembly formats
    • Adding a custom type attribute
  6. Using Traits
    • High level view of why traits/interfaces are useful (dialect-agnostic passes)
    • A survey of all general upstream traits I could find
    • Adding Pure and ElementwiseMappable to polynomial and seeing what upstream passes can operate on it as a result.
  7. Folders and Constant Propagation
    • A deeper dive on sccp (sparse conditional constant propagation)
    • Adding a ConstantLike op and folders to polynomial (a folder sketch follows this outline)
  8. Verifiers
    • Studying traits that add verifiers
    • Adding a custom verifier
    • Adding a custom verifier using a custom trait
  9. Canonicalizers and Declarative Rewrite Patterns
    • Discussion of -canonicalize
    • Adding canonicalization patterns in C++
    • Rewriting the patterns to use DRR.
  10. Dialect Conversion
    • Discussion of why dialect conversion is hard (types) and existing conversion passes
    • Lowering polynomial to standard MLIR.
    • Discussion of unrealized_conversion_cast, type materialization hooks, and why this conversion pass doesn’t need them.
  11. Lowering through LLVM
    • Defining a pass pipeline
    • Lowering poly to LLVM (and a sort of backwards way of figuring out what passes to run)
    • Bufferization
    • mlir-translate --mlir-to-llvmir -> llc -> clang -> ./a.out -> FileCheck for a full e2e test.
  12. A Global Optimization and Dataflow Analysis
    • Analysis passes & overview of what dataflow analysis does
    • The IntegerRangeAnalysis and reusing it with custom types
    • A global optimization that uses the int range analysis to set up an ILP, solve it, and insert new ops into the IR.
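
To make chapter 3 concrete, here is a minimal sketch of the tablegen-free pass, in roughly the shape the tutorial builds up to: walk the IR and fully unroll every affine.for op. The class name FullUnrollPass is a placeholder of mine, and registration boilerplate is omitted:

```cpp
#include "mlir/Dialect/Affine/IR/AffineOps.h"
#include "mlir/Dialect/Affine/LoopUtils.h"
#include "mlir/Pass/Pass.h"

// Sketch: a pass that walks the IR and fully unrolls every affine.for op.
// FullUnrollPass is a placeholder name; registration is omitted.
struct FullUnrollPass
    : mlir::PassWrapper<FullUnrollPass, mlir::OperationPass<>> {
  void runOnOperation() override {
    getOperation()->walk([&](mlir::affine::AffineForOp op) {
      if (mlir::failed(mlir::affine::loopUnrollFull(op)))
        signalPassFailure();
    });
  }
};
```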
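
And a sketch of the nontrivial greedy rewrite from the same chapter: peel one addend off a multiplication by a constant, so that repeated applications turn x * c into a chain of adds. The pattern name and the exact guard are my own illustration, not the tutorial’s code:

```cpp
#include "mlir/Dialect/Arith/IR/Arith.h"
#include "mlir/IR/Matchers.h"
#include "mlir/IR/PatternMatch.h"

// Sketch: rewrite x * c (c a constant > 1) as x * (c - 1) + x, so the
// greedy driver iterates until the mul is gone. Names are illustrative.
struct PeelFromMul : mlir::OpRewritePattern<mlir::arith::MulIOp> {
  using OpRewritePattern::OpRewritePattern;

  mlir::LogicalResult
  matchAndRewrite(mlir::arith::MulIOp op,
                  mlir::PatternRewriter &rewriter) const override {
    llvm::APInt value;
    if (!mlir::matchPattern(op.getRhs(), mlir::m_ConstantInt(&value)) ||
        value.sle(1))
      return mlir::failure();

    auto cMinusOne = rewriter.create<mlir::arith::ConstantOp>(
        op.getLoc(), rewriter.getIntegerAttr(op.getType(), value - 1));
    auto mul = rewriter.create<mlir::arith::MulIOp>(op.getLoc(), op.getLhs(),
                                                    cMinusOne);
    rewriter.replaceOpWithNewOp<mlir::arith::AddIOp>(op, mul, op.getLhs());
    return mlir::success();
  }
};
```

Run under applyPatternsAndFoldGreedily, this terminates because each application strictly decreases the constant.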
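
For chapter 7, a folder is similarly small once the op is declared with let hasFolder = 1 in tablegen. A sketch, with AddOp standing in for the tutorial’s polynomial add op and coefficients assumed to be stored as integer attributes:

```cpp
#include "mlir/Dialect/CommonFolders.h"

// Sketch: fold add(const, const) by adding coefficients element-wise.
// AddOp stands in for the tutorial's polynomial add op, assumed to be
// declared with `let hasFolder = 1;` in tablegen.
mlir::OpFoldResult AddOp::fold(AddOp::FoldAdaptor adaptor) {
  return mlir::constFoldBinaryOp<mlir::IntegerAttr>(
      adaptor.getOperands(),
      [](const llvm::APInt &lhs, const llvm::APInt &rhs) { return lhs + rhs; });
}
```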

Deficiencies/quirks of current tutorial

Bias toward HEIR’s problems

The tutorial series was intended to be a ramp for people who want to contribute to the HEIR project, and happens to be general enough that non-cryptographers find it useful. As such, there are various aspects of the tutorial that are biased toward HEIR that we may not want to focus on upstream. In particular:

  • The use of bazel as the build system (the tutorial does have a CMake build alongside bazel, but bazel is the “primary” build system and I suspect the CMake config in the tutorial could be greatly improved)
  • The choices of rewrite patterns are unrealistic for most compilers, though not all that unrealistic for FHE.
  • The polynomial dialect’s custom type/attribute is a bit heavy for an introduction. I double dipped here, using this as a way to study/bootstrap an early version of a more fully functional polynomial dialect that I’m working on upstreaming. Also, polynomial ring math might be too intimidating for the average MLIR newbie.
  • The global optimization article is a direct port of an academic paper relevant to HEIR.
  • My lack of knowledge of the internal design of MLIR (e.g., how an op is laid out in memory) shows through in some places.

Sequential organization

The tutorials are organized sequentially, in that each article corresponds to a single pull request in a GitHub project, and the different sections of each article link to specific commits. The commits are organized in such a way that they can be read easily in order. E.g., one commit might set up pure boilerplate and ensure a pass with a no-op body can build, then the next commit might add a naive implementation of the pass, then the next commit improves on the naive implementation. In between, the article shows particular inputs, outputs, and error messages so that the reader can reproduce them at any point.

This poses a challenge for long-term maintenance of a tutorial kept in sync with HEAD: earlier commits cannot be retroactively updated to account for API incompatibilities introduced in later commits, and updating the intermediate inputs, outputs, and error messages by hand would be infeasible.

My tutorial gets around this by pinning to a particular LLVM commit hash, and twice in the series I show the process of updating the hash and fixing what breaks. I personally think this “commit-by-commit” style is helpful, but I don’t have any data on whether readers actually rely on it or just go read the code at HEAD. I’m open to suggestions for how to square this circle with an upstream tutorial, but absent a solution I will upstream a version of the tutorial that gives up on this style.

Avoidance of MLIR internals

I explicitly avoided discussing internal details of MLIR, except where it was necessary (like how dialect conversion works). While I think much of this tutorial is best framed with MLIR as an opaque API, there are surely parts that would benefit from side information about MLIR internals. I simply don’t know enough about MLIR to know where those places are and what information would be useful in them. I’m looking to the community for help here.

Not-yet-covered topics

I had a list of additional topics I wanted to cover in more detail, such as

  • Defining/working with region-holding ops
  • Custom dataflow analyses
  • Slices
  • Defining passes that depend only on interfaces/traits
  • PDLL

Open to other suggestions.

Proposal

I will incrementally start to upstream the tutorial articles with the following modifications:

  • Use CMake as the primary build system, so that it can be part of the upstream test suite like toy.
  • Add one tutorial (that is not part of the build) that shows how to use bazel for an out-of-tree project.
  • Modify the tutorial to link to lines of code at HEAD, in lieu of linking to commits/PRs.
  • Use a GH-actions-based alerting mechanism to keep the links in sync with the code (maybe “if this then that” directives? Does LLVM have something like this configured upstream?)
  • Use a dialect that is not polynomial for the intermediate tutorials, since polynomial will be upstreamed to MLIR and conflict. Open to suggestions.

I will keep a GH issue tracking the remaining work to be done.

Alternatives

  • Keep the tutorial as is (owned by me), but link to it from the MLIR tutorials page.
  • The tutorial lives as a repository owned by the llvm GH org, but still out of the monorepo. This can maintain the commit-by-commit style, and pin to a particular LLVM commit, while allowing us to update it to more recent LLVM commits as needed. I would reimplement the tutorials commit-by-commit (PR-by-PR) in the new location, to give folks a chance to review the code and prose and make suggestions for improvements, rather than just changing ownership of the existing tutorial in-place.

This is really amazing work!


Nice! Do you see these new tutorial pages living alongside Tutorials - MLIR or merging into them? I imagine there is some overlap, but these new pages could be using more recent code patterns.

  • Keep the tutorial as is (owned by me), but link to it from the MLIR tutorials page.

This seems like a pretty low cost approach to start with, at least up to the point where community members want to make contributions. We could add links to downstream learning resources similar to how Users of MLIR - MLIR lists various projects.


Getting into some of the details…

Looks like you have links such as https://github.com/llvm/llvm-project/blob/9654bc3960c460bd9d8b06cfa4cfe0e52c6582bd/mlir/include/mlir/IR/PatternMatch.h#L356 pointing at class definitions in the middle of source files. In context, that link is on this page:

A rewrite pattern is a subclass of OpRewritePattern, and it has a method called matchAndRewrite which performs the transformation.

I’m not aware of “if this then that” style comments in open source projects like LLVM; I’ve only seen that tooling at Google. If you want the links to be durable, maybe link to doxygen pages like https://mlir.llvm.org/doxygen/structmlir_1_1OpRewritePattern.html, which then link to the source (though not on GitHub):

Definition at line 357 of file PatternMatch.h.

Another trick for links I used at Google was linking into a file at HEAD with a search query in the URL like &q=OpRewritePattern. Code Search would then scroll to the right place in the file without being pinned to an older CL/commit. I’m not sure if GitHub supports anything like that.

Linking to the doxygen page is a good idea!

The “if this then that” idea is also kind of a hail mary: I could imagine that if it pollutes the codebase too much, maintainers will revolt because they don’t want a refactor to be blocked by updating the prose of a tutorial.

This is really excellent content! Thanks for taking the time to do this.

It seems to me that this is perfectly suitable to organize as an evolution of the Toy tutorial: there is quite a large overlap, but you also seem to have a lot of complementary content. Can we upstream the content that way, as a Toy v2 tutorial, instead?

The way I handled this with the Toy tutorial has been to duplicate the code entirely in-tree, so that each incremental version is continuously built/tested.
The user does not have a commit to see the diff for each chapter, but they can still diff the two folders to get the same effect.


On “toy v2,” I wonder to what extent the original tutorial structure should be preserved. For example, having a front-end and AST from article 1, versus starting from the standard MLIR dialects.

Strong +1 from me to start from upstream dialects and leave the AST to something like an appendix or some “advanced topic” late chapters. I’ve seen way too many people get excessively focused on the toy language, AST, and parsing, and have lost count of the number of questions like “how do I extend toy to support broadcasting/bf16/numpy-whatever”.

I am not convinced here: Toy intentionally does not spend time on the grammar and the AST; they are provided as a given.
The whole point is to ground the tutorial in something concrete, to get started with a motivating use case, instead of starting with something that will be totally foreign to a beginner.

They may well be provided as a given, but people seem to think they have to at least understand them before proceeding to further chapters. Not spending time on them in the tutorial text may, on the contrary, force people to spend more time trying to understand these things.

I agree that the tutorial should be grounded in something concrete. I disagree that “something concrete” has to be a programming language. Most MLIR users are not building a new language, AFAIK. An AST can be an equally foreign concept to them! Let’s start with something like matmul pseudocode.
