MLIR News, 58th edition (5th December 2023)

Welcome to the 58th issue of the MLIR Newsletter, covering developments in MLIR and related projects in the ecosystem. We welcome your contributions (contact: javed.absar@gmail.com). Click here to see previous editions.

Highlights and Ecosystem:

  • Alexander Viand from Intel and Jeremy Kun, who works on homomorphic encryption at Google, presented the Polynomial dialect at an MLIR Open Design Meeting (ODM). At a high level, poly will contain a type representing a polynomial, attributes specifying the domain of the polynomials (e.g., integer or real coefficients), and ops representing addition, multiplication, division/remainder, polynomial evaluation, etc. The central goal of upstreaming poly (and an associated set of lowering paths to standard MLIR) is to make programs with polynomial-heavy computations fast. They expect it to be useful for cryptographic compilers and potentially for scientific computing applications. [slides, recording] [RFC].

  • Guray Ozen from Google Research presented developments within the NVGPU and NVVM dialects to target the NVIDIA H100. The focus was on illustrating the implementation of key elements such as the Tensor Memory Accelerator (TMA), warp-group level tensor core instructions, and transactional barriers. [click here].

  • Save the date for the 2024 EuroLLVM Developers’ Meeting! It will be held April 9-11 at the Marriott in Vienna, Austria.

MLIR Commits Past Two Weeks:

  • Following the [RFC][Tensor] Add a `tensor.concatenate` operation, Quinn Dawkins added tensor.concat for ranked tensors along a static dimension, as well as a decomposition mirroring the existing lowering from TOSA to Tensor. This offers a convergence point for “input”-like dialects that include various lowerings for concatenation operations, easing later analysis. In the future, this op can implement the necessary interfaces for tiling, and conversions to some kind of linalg and/or memref counterpart could be added. This continues along the path laid out in the [RFC] Structured Codegen Beyond Rectangular Arrays.

  • Matthias landed a fix in the Transform interfaces for a bug reported by Mahesh. The check in question tries to detect invalid API usage: incorrect or missing handle side effects and/or incorrect rewriter usage. This check is not implemented correctly and can report false positives in case of pointer reuse (a different op created at the same memory location). It is unclear whether such a check can be implemented at all, given that there are both tracking-listener-based handle updates and handle consumption.

  • Boian added ops to the Mesh dialect. The mesh dialect contains a set of attributes, operations, interfaces and transformations that are useful for representing and optimizing computation on a device mesh. RFC and diff here.

  • Hideto refactored verifyDominanceOfContainedRegions into an iterative algorithm, similar to the one here, to fix a stack overflow on deeply nested regions. [click here].

  • Matthias fixed a bug in SplitDeallocWhenNotAliasingAnyOther. This pattern used to generate invalid IR (op dominance error). The bug went unnoticed in existing test cases because other patterns and/or foldings were applied afterwards, and those rewrites “fixed up” the IR again. (The bug is visible when running mlir-opt -debug.) The change also adds comments to the implementation and simplifies the code a bit. Apart from the fixed dominance error, this change is NFC. Without this change, buffer deallocation tests will fail when running with #74270. [click here].

  • Han-Chung landed a diff that adds support for [vector.maskedstore sub-type emulation]. The idea is similar to the vector.maskedload + vector.store emulation. The emulation works in four steps: (1) get a compressed mask and load the data from the destination; (2) bitcast the loaded data to the original vector type; (3) select between op.valueToStore and the loaded data using the original mask; (4) bitcast the new value and store it to the destination using the compressed mask.

  • Andrzej added a rewrite pattern for gather over a strided memref. Fixes [iree problem]. [commit here].

  • Alex fixed the LLVM type converter for structs (#73231). The existing implementation of the LLVM type converter for LLVM structs containing incompatible types attempted to change the identifier of the struct in case of a name clash post-conversion (all identified structs have different names post-conversion, since one cannot change the body of a struct once it is initialized). Beyond a trivial error of not updating the counter when renaming, this approach was broken for recursive structs, which cannot be made aware of the renaming and would use the pre-existing struct with the clashing name instead. [click here].

  • Do not peel already peeled loops! Loop peeling is not beneficial if the step size already divides “ub-lb”. There are currently some simple checks that prevent peeling in such cases when lb, ub, and step are constants. This commit adds support for IR that is the result of loop peeling in the general case; i.e., lb, ub, and step do not necessarily have to be constants. [click here].

  • This patch adds handling of an empty MaskOp to MaskOpRewritePattern and thereby fixes a crash. It also pulls the MaskOp canonicalization patterns into LowerVectorMask so that empty MaskOps are folded away in the pass. Fixes #71036. [click here].
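
As a rough illustration of the vector.maskedstore sub-type emulation steps listed above, the following Python sketch models 4-bit elements stored two per 8-bit container. It is not the MLIR implementation: the helper names, the list-based "memory", and the low-nibble-first packing order are all assumptions made for this example.

```python
# Toy model of masked-store emulation for sub-byte types (not MLIR code).
# Assumption: two 4-bit elements per byte, low nibble first.
SCALE = 2  # number of narrow elements per container byte


def compress_mask(mask):
    """A container byte is accessed if any of its narrow elements is masked on."""
    return [any(mask[i:i + SCALE]) for i in range(0, len(mask), SCALE)]


def unpack_nibbles(data):
    """Model of a bitcast from bytes to 4-bit elements."""
    out = []
    for byte in data:
        out.append(byte & 0xF)
        out.append((byte >> 4) & 0xF)
    return out


def pack_nibbles(nibbles):
    """Model of a bitcast from 4-bit elements back to bytes."""
    return [(nibbles[i] & 0xF) | ((nibbles[i + 1] & 0xF) << 4)
            for i in range(0, len(nibbles), SCALE)]


def emulated_masked_store(dest, mask, value_to_store):
    """Store value_to_store[i] into the i-th narrow element of dest where mask[i] is set."""
    assert len(mask) == len(value_to_store) == SCALE * len(dest)
    # Step 1: compress the mask and do a masked load from the destination.
    cmask = compress_mask(mask)
    loaded = [b if m else 0 for b, m in zip(dest, cmask)]
    # Step 2: bitcast the loaded bytes to the original (narrow) element type.
    loaded_elems = unpack_nibbles(loaded)
    # Step 3: select between the new values and the loaded data with the original mask.
    selected = [new if m else old
                for new, m, old in zip(value_to_store, mask, loaded_elems)]
    # Step 4: bitcast back and store using the compressed mask.
    new_bytes = pack_nibbles(selected)
    for i, m in enumerate(cmask):
        if m:
            dest[i] = new_bytes[i]
    return dest
```

For example, with dest = [0x21, 0x43, 0x65] and only elements 1, 4 and 5 masked on, the middle byte is never touched, while the unmasked low nibble of the first byte is preserved by the select in step 3.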
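
The loop-peeling item above says that peeling is not beneficial when the step already divides ub-lb. A minimal Python sketch of that arithmetic, assuming a positive step and using hypothetical helper names (this is not the MLIR peeling code, and the split-point formula is only one common way to express it):

```python
def peel_bounds(lb, ub, step):
    """Split [lb, ub) into a main loop of full steps and a remainder loop."""
    assert step > 0
    span = max(ub - lb, 0)
    split = ub - span % step  # last bound reachable with full steps only
    return (lb, split), (split, ub)


def needs_peeling(lb, ub, step):
    """Peeling only helps if the step does not evenly divide ub - lb."""
    assert step > 0
    return max(ub - lb, 0) % step != 0
```

Peeling a loop over [0, 10) with step 4 gives a main loop over [0, 8) and a remainder over [8, 10). Applying the same computation to the main loop yields an empty remainder, which is why peeling an already peeled loop is wasted work.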

MLIR RFC Discussion Topics

MLIR Ecosystem

Useful Links
