Welcome to the 66th issue of the MLIR Newsletter, covering developments in MLIR and related projects in the ecosystem. We welcome your contributions (contact: javed.absar@gmail.com). Click here to see previous editions.
Highlights, Discussions and Ecosystem:
- Matthias Springer wrote an RFC to design a new (additional) dialect conversion driver without the quirks of the current one. Why? What’s wrong with the current dialect conversion? The proposal, [RFC] A New "One-Shot" Dialect Conversion Driver, answers those questions and proposes… “a new, simpler dialect conversion driver. (In addition to the existing dialect conversion driver. No changes needed for existing users of the current dialect conversion driver, if they do not want to use the new driver).”
- Renato triggered a discussion on deallocation: “I noticed that the ownership based deallocation pass adds the `bufferization.dealloc` op at the end of the scope, and thus moves the deallocations for all buffers in the context block to the end. This is safe to do, however it also keeps buffers alive when they’re long dead and can make larger models run out of memory.” [click here].
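To illustrate the concern, here is a minimal hand-written sketch (op names are from upstream, but the function, shapes, and condition values are invented for illustration) of how end-of-block placement extends buffer lifetimes:

```mlir
func.func @scoped_deallocs() {
  %true = arith.constant true
  %a = memref.alloc() : memref<1024xf32>
  // ... last use of %a here; %a is dead from this point on ...
  %b = memref.alloc() : memref<4096xf32>
  // ... uses of %b ...
  // The pass emits the deallocations at the end of the block, so the
  // memory backing %a stays live for the entire lifetime of %b.
  bufferization.dealloc (%a : memref<1024xf32>) if (%true)
  bufferization.dealloc (%b : memref<4096xf32>) if (%true)
  return
}
```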
- Sheen re-initiated a discussion on How to extend linalg ops without modifying MLIR code?, e.g. "`linalg.softmax` example: I want to tile and fuse the softmax and its producers and consumers directly, and treat the softmax as a whole, ignoring its actual computation". One interesting point from the discussion: “…so far this has been points made in random discussions, forum and conferences. We’ve been dancing around the idea for over a year, and most people feel sympathetic to it, but not quite to create one.” The main controversies around the topic are:
  - A: Is it isolated from above or not? If it is, then why not just outline into a function?
  - B: Why not just use a generic? For non perfectly nested ops, can we even safely define semantics?
  - C: Does it propagate interfaces through? Should it require that all ops have the same interfaces?
  - D: If all ops implement an interface, does that mean the group does, too? Are all interfaces composable that way? If not, how do we describe composability?
  - E: Can I merge any two groups together? Split them apart?
- Menookar proposes a compile-time optimization on existing `memref.alloc` ops to reduce memory usage and improve memory locality: [RFC] Compile-time memref.alloc Scheduling/Merging optimization. The RFC “… proposes an optimization to consolidate multiple allocations (`memref.alloc` ops) into a single `memref.alloc` op, and each static-shaped `memref.alloc` op will be transformed into a “slice” from the single allocated buffer with `memref.view` and some compile-time decided offsets. This optimization works on `memref` instead of `tensor` ops, so it should be executed after bufferization pass, and before buffer-deallocation.”
- Discussions on the [RFC] Add op for semantics of nullptr in memref dialect. Alex: "It would be nice to take this opportunity to better specify the semantics of null memrefs. I suppose accessing any element in such a memref should be treated as undefined behavior. Do we want the load to result in `poison`? Separately, what should the lowering of `memref.null` be? Memref constructed from a null pointer, but what offset/sizes/strides?"
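For context, the op under discussion might look roughly like this (hypothetical syntax; the op is only proposed in the RFC, and the open questions quoted above are precisely about its semantics and lowering):

```mlir
// Hypothetical: a "null" memref value as proposed in the RFC.
%null = memref.null : memref<?x?xf32>
// Per the discussion, loading through it would presumably be undefined
// behavior or yield poison; the lowered offset/sizes/strides of the
// descriptor are an open question.
```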
- Discussions around the semantics of named ops - theory (what it should be) and practice (what is currently implicitly implied via implementation) - in Notes from the MLIR Upstream Round Table @ EuroLLVM 2024.
MLIR Commits Recently:
- Mogball added [mlir][ods] Optimize FoldAdaptor constructor (#93219) · llvm/llvm-project@264aaa5 · GitHub
- Mubashar added [mlir][vector] Add deinterleave operation to vector dialect (#92409) · llvm/llvm-project@bf4d99e · GitHub
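The new op splits the even- and odd-indexed lanes of a vector into two results; a minimal example (element type and width chosen for illustration):

```mlir
// %v    = [a0, a1, a2, a3, a4, a5, a6, a7]
// %even = [a0, a2, a4, a6], %odd = [a1, a3, a5, a7]
%even, %odd = vector.deinterleave %v : vector<8xi32> -> vector<4xi32>
```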
- Aart Bik: “This is a proof of concept recognition of the most basic forms of ReLu operations, used to show-case sparsification of end-to-end PyTorch models. In the long run, we must avoid lowering such constructs too early (with this need for raising them back).” [click here].
- Renato added more linalg named ops [click here].
- Javed added a patch to identify linalg named ops such as `linalg.fill`/elementwise, working towards a more comprehensive `linalg.generic` specialization. [click here].
- Kunwar added support for reducing operations with multiple results using PartialReductionOpInterface, and also an implementation of PartialReductionOpInterface for multiple results for `linalg.generic`. [click here].
- Adam added a fold for pack and unpack of an empty input tensor (#92247).
- Felix added a patch to include the “no signed wrap” and “no unsigned wrap” flags, which can be used to annotate some ops in the `arith` dialect and also in LLVM IR, in the integer range inference. [click here].
- Christian consolidated the different topological sort utilities into one place, adding them to the analysis folder because `SliceAnalysis` uses some of them. [click here].
- Spencer implemented folding and rewrite logic to eliminate no-op tensor and memref operations. This handles two specific cases: A: `tensor.insert_slice` operations where the size of the inserted slice is known to be 0; B: `memref.copy` operations where either the source or target memref is known to be empty. [click here].
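A sketch of the two no-op cases (all shapes invented for illustration):

```mlir
// A: inserting a zero-sized slice leaves %dest unchanged,
//    so the op folds to %dest.
%r = tensor.insert_slice %src into %dest[0, 0] [0, 4] [1, 1]
    : tensor<0x4xf32> into tensor<8x4xf32>

// B: copying between empty memrefs moves no data,
//    so the op can simply be erased.
memref.copy %e0, %e1 : memref<0x4xf32> to memref<0x4xf32>
```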
- Jeremy added simple canonicalization rules to the polynomial dialect, mainly to get the boilerplate incorporated before more substantial canonicalization patterns are added. [click here].
- Matthias added a commit to turn the 1:N dialect conversion pattern for function signatures into a pattern for `FunctionOpInterface`. This is similar to the interface-based pattern that is provided with the 1:1 dialect conversion (`populateFunctionOpInterfaceTypeConversionPattern`). No change in functionality apart from supporting all `FunctionOpInterface` ops and not just `func::FuncOp`. [click here].
- Andrzej split the `TransposeOpLowering` into two patterns: 1. `Transpose2DWithUnitDimToShapeCast`, which rewrites a 2-D `vector.transpose` as `vector.shape_cast` (there has to be at least one unit dim); 2. `TransposeOpLowering`, the original pattern without the part extracted into `Transpose2DWithUnitDimToShapeCast`. [click here].
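The unit-dim case in a small, illustrative example (types invented):

```mlir
// A 2-D transpose where one dimension is a unit dim does not actually
// move any data, so it can be rewritten as a shape_cast:
%t = vector.transpose %v, [1, 0] : vector<4x1xf32> to vector<1x4xf32>
// becomes
%cast = vector.shape_cast %v : vector<4x1xf32> to vector<1x4xf32>
```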
- Benoit added a vector-dialect interleave-to-shuffle pattern and enabled it in VectorToSPIRV. [click here].
- Corentin created an expansion pattern to lower `math.rsqrt(x)` into `fdiv(1, sqrt(x))`. [click here].
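The expansion in MLIR terms (a sketch; scalar f32 chosen for illustration):

```mlir
// math.rsqrt %x expands to 1.0 / sqrt(x):
%one = arith.constant 1.0 : f32
%s = math.sqrt %x : f32
%r = arith.divf %one, %s : f32
```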
- Adam added a pass that “packs a matmul MxNxK operation into 4D blocked layout. Any present batch dimensions remain unchanged and the result is unpacked back to the original layout.” [click here]([mlir][linalg] Block pack matmul pass (#89782) · llvm/llvm-project@4c3db25 · GitHub).
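For intuition, one matmul operand of such a blocked layout could be produced like this (tile sizes and shapes are made up for illustration; the pass chooses its own blocking):

```mlir
// Pack a 128x128 matmul operand into a 4-D blocked layout with
// 32x32 inner tiles; the pass unpacks the result back afterwards.
%packed = tensor.pack %A inner_dims_pos = [0, 1] inner_tiles = [32, 32]
    into %dest : tensor<128x128xf32> -> tensor<4x4x32x32xf32>
```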
Related Projects
- Triton community meeting - https://www.youtube.com/watch?v=uRlqolhNbRk
- IREE community meeting - https://www.youtube.com/watch?v=b779to--7es
- OpenXLA community meeting - https://www.youtube.com/watch?v=YK1CLzIcsJ8&t=2s
Useful Links