MLIR News, 63rd edition (6th March 2024)

Welcome to the 63rd issue of the MLIR Newsletter covering developments in MLIR and related projects in the ecosystem. We welcome your contributions (contact: javed.absar@gmail.com). Click here to see previous editions.

Highlights and Ecosystem:

  • The EuroLLVM 2024 program is now online - [accepted presentations].

  • Call for proposals for the MLIR workshop at the EuroLLVM conference: CFP: MLIR Workshop at the EuroLLVM Developer Meeting (Apr 9, 2024).

  • Until now, op replacements and block argument replacements were tracked in separate data structures inside the dialect conversion. This commit turns them into IRRewrites, so that they can be committed or rolled back just like any other rewrite. This simplifies the internal state of the dialect conversion. [click here].

  • Open design meeting / RFC: Adding support for OpenMP GPU target offloading. [video].

  • Aart Bik added a new sparse_tensor.print operation: "Printing the constituents of a sparse tensor (rather than the reconstructed dense tensor) is an extremely valuable tool for debugging and testing. I had various solutions for this in my local workspace, but decided to make these more generally available in PR 83321. It is now possible to simply call sparse_tensor.print %a : tensor<4x8xf32, #CSR> to print the individual components to stdout. The implementation actually lowers to my old friend vector.print under the same philosophy of just requiring a very lightweight runtime support library for I/O, which keeps migrating to different platforms simple."
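
For reference, a minimal sketch of the new op in use; the #CSR encoding below is just one assumed definition:

```mlir
// One possible CSR encoding (assumed for illustration).
#CSR = #sparse_tensor.encoding<{
  map = (d0, d1) -> (d0 : dense, d1 : compressed)
}>

func.func @dump(%a: tensor<4x8xf32, #CSR>) {
  // Prints the positions, coordinates, and values arrays to stdout.
  sparse_tensor.print %a : tensor<4x8xf32, #CSR>
  return
}
```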

MLIR Commits Recently:

  • Linalg canonicalization - “Transposing a filled tensor is the same as filling the destination of the transpose” (see the fill/transpose sketch after this list). [diff].

  • Matthias Springer, “… this commit renames replaceOpWithIf to replaceUsesWithIf. replaceOpWithIf was a misnomer because the function never erases the original op. Similarly, replaceOpWithinBlock is renamed to replaceUsesWithinBlock”. (No “operation replaced” notification is sent because the op is not erased.) [click here].

  • Guray Ozen added a GEMM Hopper Tensor Core Integration Test (#81478).

  • This commit adds a new ConversionConfig struct that allows users to customize the dialect conversion. This configuration is similar to GreedyRewriteConfig for the greedy pattern rewrite driver. [click here].

  • Quinn Dawkins added support for expansion of named linalg ops and of linalg ops with reduction iterators. This improves the ability to make fusion decisions around reduction operations. To recover the previous behavior, users of the patterns can add a control function that restricts propagation of reshapes by expansion through linalg ops with reduction iterators (see the reshape-expansion sketch after this list). [click here].

  • Uday B., “Affine dialect … isContiguousAccess is an important affine analysis utility but is only tested very indirectly via passes like vectorization and is not exposed. Expose it and add a test pass for it that’ll make it easier/feasible to write test cases. This is especially needed since the utility can be significantly enhanced in power, and we need a test pass to exercise it directly. This pass can in the future be used to test the utility of invariant accesses as well”. (See the contiguous-access sketch after this list.) [diff].

  • Han-Chung Wang, "The reverse op is treated as a VectorMemoryAccessKind::Contiguous load. It is a contiguous slice, but we’d need to compute indices differently and apply a reverse at the vector level, which takes non-trivial effort. The revision flips the case to use vector.gather; otherwise there are functionality issues. E.g., the example in the patch loaded 2, 3, 4 (which is a bug), but what we want is 2, 1, 0." (A hypothetical gather sketch follows this list.) [click here].

  • When splitting a block during a dialect conversion, a SplitBlockRewrite object used to be stored in the dialect conversion state. This commit removes SplitBlockRewrite; instead, a combination of CreateBlockRewrite and multiple MoveOperationRewrite objects is used. This change simplifies the internal state of the dialect conversion and is also needed to properly support listeners. RewriterBase::splitBlock is now no longer virtual. All necessary information for committing/rolling back a split-block rewrite can be deduced from Listener::notifyBlockInserted and Listener::notifyOperationInserted (which is also called when moving an operation). [click here].

  • [mlir][AMDGPU] Set uniform-work-group-size=true by default (#79077). GPU kernels generated via typical MLIR mechanisms assume that all workgroups are of uniform size, and so, as in OpenMP, it is appropriate to set the "uniform-work-group-size"="true" attribute on these functions by default. This commit makes that choice. [click here].
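
The fill/transpose sketch for the linalg canonicalization above (shapes and value names are assumed for illustration): transposing a filled tensor folds into filling the transpose’s destination directly.

```mlir
// Before: fill a tensor, then transpose the filled result.
%fill = linalg.fill ins(%cst : f32)
          outs(%empty : tensor<4x8xf32>) -> tensor<4x8xf32>
%t = linalg.transpose ins(%fill : tensor<4x8xf32>)
       outs(%init : tensor<8x4xf32>) permutation = [1, 0]

// After canonicalization: fill the transpose's destination directly;
// the transpose disappears.
%t = linalg.fill ins(%cst : f32)
       outs(%init : tensor<8x4xf32>) -> tensor<8x4xf32>
```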
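
The reshape-expansion sketch, with made-up shapes: a tensor.expand_shape consuming a reduction’s result can now be propagated by expanding the generic’s iteration space instead, and a control function can veto this for reduction ops to keep the old behavior.

```mlir
// Before propagation: a reduction over d1, followed by an expanding reshape.
%sum = linalg.generic {
    indexing_maps = [affine_map<(d0, d1) -> (d0, d1)>,
                     affine_map<(d0, d1) -> (d0)>],
    iterator_types = ["parallel", "reduction"]}
    ins(%in : tensor<8x16xf32>) outs(%acc : tensor<8xf32>) {
  ^bb0(%a: f32, %b: f32):
    %0 = arith.addf %a, %b : f32
    linalg.yield %0 : f32
  } -> tensor<8xf32>
// Expansion can now move this reshape above the reduction,
// expanding d0 to (2, 4) in the generic itself.
%e = tensor.expand_shape %sum [[0, 1]] : tensor<8xf32> into tensor<2x4xf32>
```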
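
The contiguous-access sketch (loop bounds and the memrefs %A/%B are assumed): isContiguousAccess distinguishes accesses whose fastest-varying index follows the innermost loop from strided ones.

```mlir
affine.for %i = 0 to 64 {
  affine.for %j = 0 to 128 {
    // Contiguous: consecutive %j iterations touch adjacent elements.
    %v = affine.load %A[%i, %j] : memref<64x128xf32>
    // Strided: consecutive %j iterations jump a full row apart.
    %w = affine.load %B[%j, %i] : memref<128x64xf32>
  }
}
```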
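
Finally, a hypothetical gather sketch for the reversed-read case above (not the example from the patch; %src and the sizes are invented): descending indices express the reversal that a plain contiguous vector.load cannot.

```mlir
func.func @reverse_read(%src: memref<8xf32>) -> vector<3xf32> {
  %c0 = arith.constant 0 : index
  // Descending indices realize the reverse: elements 2, 1, 0.
  %idx  = arith.constant dense<[2, 1, 0]> : vector<3xindex>
  %mask = arith.constant dense<true> : vector<3xi1>
  %pass = arith.constant dense<0.0> : vector<3xf32>
  %r = vector.gather %src[%c0] [%idx], %mask, %pass
      : memref<8xf32>, vector<3xindex>, vector<3xi1>, vector<3xf32> into vector<3xf32>
  return %r : vector<3xf32>
}
```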

Related Projects

Useful Links
