MLIR News, 29th edition (3/6 - 3/19/2021)

See the previous published edition.
Welcome to the twenty-ninth issue of the MLIR (bi)Weekly, a newsletter covering developments in MLIR and related projects in the ecosystem. MLIR (bi)Weekly is brought to you by a collective effort of contributors; we welcome your contributions!




Table-driven Infrastructure

  • Operation Asm Format: Support for “else” groups is being added to optional elements. This allows for specifying a group of elements to parse/print when an optional group is not present.

CPU codegen

  • A new AMX vector dialect for Intel Advanced Matrix Extensions has been added
    • unleashes the power of AMX using MLIR concepts (2-d vectors, memrefs, etc.) with just a few new operations
    • includes fully functional integration tests (running on a Sapphire Rapids emulator) which test correctness but also document usage
  • Sparse compiler: stress testing continues and has run clean over millions of tests without finding new issues. There has been more discussion of adding a sparse tensor type; work on it will start shortly.


SPIR-V

  • The SPIR-V dialect sees more ops for Vulkan graphics: spv.Image.
  • A few more patches landed into the SPIR-V dialect to improve op naming consistency.


In the Ecosystem

IREE : An Experimental MLIR Execution Environment

  • CUDA backend
    • Enabled CUDA E2E tests in IREE CI for several HLO ops
    • Wrote documentation about design choices for the CUDA backend and how to run CUDA E2E
    • Starting codegen improvements by adding tiling and distribution to blocks for element-wise ops
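
As a rough illustration of what "tiling and distribution to blocks" means for an element-wise op, here is a minimal sketch in plain C. It is not IREE's generated code: the names `TILE_SIZE`, `add_tile`, `num_blocks`, and `add_tiled` are hypothetical, and the sequential loop over block ids only stands in for what would be a grid of GPU blocks running in parallel.

```c
#include <stddef.h>

/* Hypothetical tile size: the number of elements each block handles. */
enum { TILE_SIZE = 4 };

/* Number of blocks needed to cover n elements (rounded up). */
size_t num_blocks(size_t n) {
  return (n + TILE_SIZE - 1) / TILE_SIZE;
}

/* Work done by one block: an element-wise add over its own tile.
   The upper bound is clamped so the last, partial tile stays in range. */
void add_tile(const float *a, const float *b, float *out, size_t n,
              size_t block_id) {
  size_t begin = block_id * TILE_SIZE;
  size_t end = begin + TILE_SIZE;
  if (end > n) end = n;
  for (size_t i = begin; i < end; ++i)
    out[i] = a[i] + b[i];
}

/* Stand-in for the launch: on a GPU each block id would run
   concurrently; here the blocks are simply visited in order. */
void add_tiled(const float *a, const float *b, float *out, size_t n) {
  size_t blocks = num_blocks(n);
  for (size_t block = 0; block < blocks; ++block)
    add_tile(a, b, out, n, block);
}
```

Because every element is independent in an element-wise op, the tiles never overlap and each block can be scheduled without synchronization.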

mlir-npcomp: Prototype for compiling Numpy programs

TensorFlow / MLIR-HLO

XLA GPU backend

  • Migrated While and Conditional ops emitter to use LMHLO operations.
  • We can now instantiate a full LMHLO graph from an XLA computation and use it to emit LLVM.

Kernel Generator

  • We fixed the handling of return values for C wrappers so that they also support returning memrefs (or anything else that lowers to a struct type in LLVM). Returning structs is not well defined, so we now pass a pointer to the result struct as the first argument instead.
  • We improved broadcast elimination for partially static shapes, enabling more fusion as a result.
  • Fixed a precision issue with the approximation for tanh.
  • Ongoing work in the area of auto-vectorizing at the MLIR level and enabling fusion with broadcasts and dynamic shapes.
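
To make the C wrapper change above concrete, here is a minimal sketch of the "result through the first argument" convention. The struct mirrors the rank-1 memref descriptor layout used by MLIR's LLVM lowering (allocated pointer, aligned pointer, offset, sizes, strides); the function names `ciface_take_prefix` and `memref_load` are hypothetical and only illustrate the calling convention, not any actual generated kernel.

```c
#include <stdint.h>

/* Rank-1 f32 memref descriptor: allocated pointer, aligned pointer,
   offset, then one size and one stride per dimension. */
typedef struct {
  float *allocated;
  float *aligned;
  int64_t offset;
  int64_t sizes[1];
  int64_t strides[1];
} MemRef1DF32;

/* Hypothetical wrapper. Instead of returning the descriptor struct by
   value, it writes the result through the pointer passed as the first
   argument. The result here is a view of the first n input elements. */
void ciface_take_prefix(MemRef1DF32 *result, const MemRef1DF32 *input,
                        int64_t n) {
  *result = *input;
  if (n < 0) n = 0;
  if (n > input->sizes[0]) n = input->sizes[0];
  result->sizes[0] = n;
}

/* Helper for callers: read element i using offset and stride. */
float memref_load(const MemRef1DF32 *m, int64_t i) {
  return m->aligned[m->offset + i * m->strides[0]];
}
```

A caller allocates the result descriptor itself (for example on the stack) and passes its address, so neither side has to agree on how a struct is returned by value.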

CIRCT : Circuit IR Compilers and Tools aka ‘MLIR for hardware’

Recent Talks

Recent Publications

EVEREST: A design environment for extreme-scale big data analytics on heterogeneous platforms

DISC: A Dynamic Shape Compiler for Machine Learning Workloads