We have spent some effort putting into words the core design principles, tradeoffs and architectural decisions behind the Linalg dialect.
This document was accelerated in the past few days following Chris’ presentation and the proposal for the working group on Tensor Compute Primitives. There are still some omissions and cleanups required but at this point it is better to iterate in the open. This will hopefully give a deeper picture of the reasons why we have invested in this dialect since last April.
Special thanks go to @ftynse who proofread and reformulated large parts of the Introduction and Prior Art, and to Andrew Adams who kindly weighed in and corrected some of our lessons from Halide.
Summary
In his latest MLIR Open Design meeting presentation, Chris laid out a compelling vision for an MLIR-based codegen that would make the best use of the multi-level properties of the infrastructure and be driven by search. The MLIR Linalg dialect aims at bringing an answer to the higher-level codegen problem. This document describes the key design principles that led to the existing implementation of Linalg and aims at exposing the tradeoffs involved when building higher-level Intermediate Representations (IR) and Dialects to facilitate code generation. Linalg is designed to interoperate nicely within a Mixture Of Expert Compilers environment (i.e. the CGSel box).
To keep the discussion manageable, I would encourage interested people to post comments, questions and criticism the way you would make commits: small and on one particular topic.
This way, hopefully, the discussion will not devolve into spaghetti and will be easier to follow.
I’ll start with the hardest question in computer science: naming.
Given the evolution of the scope, it has become apparent that a better name than “Linalg” could remove some of the confusion about the dialect (and the underlying approach), its goals and limitations.
We turn to the community to please provide suggestions.
Thanks @nicolasvasilache and others for the great doc! I really appreciate all the thoughts and rationale behind it.
So regarding naming, I’m not good at it, but I feel it’s hard to come up with one word that describes so many underlying concepts. So we likely need acronyms. I am throwing out some candidates so that others can come up with better ones.
Structured payload container ops (spc): emphasizes the structured-op principle and echoes the payload ops perspective.
Parallel domain innate ops (pdi): emphasizes that the ops have their loop iterators built into the op structure.
Nicolas, I really enjoyed this doc. I haven’t seen such a synthesis of prior art in this domain and I think you did a great job.
Let me summarize my understanding: The combinators (as I’ll call them) defined in linalg (linalg.generic/indexed_generic) have the property of lifting an inner computation (“compute payload”) on a smaller piece of data into an operation on an entire data structure (or multiple, if there are multiple arguments). This decouples the computation itself from the actual relationship between the subset of data and the entire data.
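To make the "lifting" idea above concrete, here is a minimal Python sketch. It is not the Linalg API: the `generic` helper, its parameter names and the use of nested lists are all assumptions made for illustration. The scalar payload knows nothing about the shape of the data; the indexing maps alone describe how each operand is addressed from a point of the iteration domain.

```python
from itertools import product


def _get(data, idx):
    """Read a scalar from nested lists at the multi-dimensional index idx."""
    for i in idx:
        data = data[i]
    return data


def _set(data, idx, value):
    """Write a scalar into nested lists at the multi-dimensional index idx."""
    for i in idx[:-1]:
        data = data[i]
    data[idx[-1]] = value


def generic(payload, domain, in_maps, out_map, inputs, output):
    """Hypothetical combinator: apply the scalar payload at every point of the
    iteration domain. Each map turns an iteration point into an operand index.
    The payload receives the input scalars plus the current output scalar."""
    for point in product(*(range(n) for n in domain)):
        args = [_get(x, m(*point)) for x, m in zip(inputs, in_maps)]
        idx = out_map(*point)
        _set(output, idx, payload(*args, _get(output, idx)))
    return output


def add(a, b):
    """Elementwise add: identity maps, the old output value is ignored."""
    m, n = len(a), len(a[0])
    ident = lambda i, j: (i, j)
    out = [[0] * n for _ in range(m)]
    return generic(lambda x, y, _old: x + y, (m, n), [ident, ident], ident, [a, b], out)


def matmul(a, b):
    """Matmul: the output map drops the reduction dimension r, so the payload
    accumulates into the same output element for every r."""
    m, k, n = len(a), len(b), len(b[0])
    out = [[0] * n for _ in range(m)]
    return generic(
        lambda x, y, acc: acc + x * y,
        (m, n, k),
        [lambda i, j, r: (i, r), lambda i, j, r: (r, j)],
        lambda i, j, r: (i, j),
        [a, b],
        out,
    )
```

Only the maps and the payload differ between `add` and `matmul`; the traversal is shared. That separation is the decoupling described in the summary above, and it is also why a transformation of the traversal would not need to inspect the payload.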
I think it would be cool to include some more examples. Off the top of my head, I’m curious about seeing sample IR for the following:
We have some examples as tests in the code, would it make sense to reference them and maybe add more descriptive comments? Having code in a doc creates a risk of bitrot…
Sorry for reviving this old thread. I came across it when searching for information on the Linalg representation of softmax and layer normalization. This thread ended without links to the operators listed by @_sean_silva. Are links to the Linalg representation of softmax and layer normalization available somewhere? I’m particularly interested in whether there is some way to represent softmax and layer normalization with a single linalg.generic operation. Thanks.
Thank you @_sean_silva for the pointers! I have one related question and hope you or someone else can kindly share some insight. The reason I’m looking for a way to represent softmax or layernorm as a single Linalg operation is to understand what techniques the Linalg community uses to capture patterns defined by complex operations so that they can be mapped to library calls. I like Linalg’s property of being able to capture a pattern and keep it composable with Linalg transformations such as tiling, which makes it easier to keep a pattern stable than with a set of smaller Linalg operations, which needs additional care. I searched the forum but didn’t find anything conclusive, so I wonder what Linalg’s current take on pattern preservation is?