See the previous published edition.
Welcome to the thirty-sixth issue of the MLIR (bi)Weekly, a newsletter covering developments in MLIR and related projects in the ecosystem. MLIR (bi)Weekly is brought to you by a collective effort of contributors; we welcome your contributions!
MLIR Core
Infrastructure
- Should we rename ModuleOp? Please chime in with naming suggestions!
- PSA: Dialects now require a generated `MyOpsDialect.cpp.inc` file to be included (see the first sketch after this list)
- The pass registration API is changing, update your passes! (second sketch below)
- SourceMgrDiagnosticHandler now supports filtering locations from call stacks when printing diagnostics (third sketch below).
- New threading utilities were added that simplify writing parallel code in MLIR (fourth sketch below).
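
For the dialect `.cpp.inc` PSA, here is a minimal sketch of where the required include goes; the project, file, and dialect names are illustrative, not from the source:

```cpp
#include "MyProject/MyOpsDialect.h"

// The TableGen-generated dialect definitions must now be included exactly
// once in the dialect's implementation file; without this include, the
// generated symbols (e.g. the dialect constructor) are missing at link time.
#include "MyProject/MyOpsDialect.cpp.inc"

void MyOpsDialect::initialize() {
  addOperations<
#define GET_OP_LIST
#include "MyProject/MyOps.cpp.inc"
      >();
}
```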
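For the pass registration change, a minimal sketch of the updated style, assuming a simple `PassWrapper`-based pass (the pass name and flag are hypothetical): the command-line argument and description now live on the pass itself rather than being passed to `PassRegistration`.

```cpp
#include "mlir/Pass/Pass.h"

using namespace mlir;

namespace {
// A hypothetical pass used only to illustrate the new registration style.
struct MyPass : public PassWrapper<MyPass, OperationPass<ModuleOp>> {
  // The argument and description used to be constructor arguments of
  // PassRegistration; they are now overridden on the pass itself.
  StringRef getArgument() const final { return "my-pass"; }
  StringRef getDescription() const final { return "An example pass."; }

  void runOnOperation() override {
    // ... pass logic ...
  }
};
} // namespace

void registerMyPass() {
  // Previously: PassRegistration<MyPass>("my-pass", "An example pass.");
  PassRegistration<MyPass>();
}
```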
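For the `SourceMgrDiagnosticHandler` item, a sketch of how the filtering hook can be used; the predicate here (keep only locations that carry file/line/column information) is just an illustration, and any callable over `Location` should work:

```cpp
#include "mlir/IR/Diagnostics.h"
#include "llvm/Support/SourceMgr.h"
#include "llvm/Support/raw_ostream.h"

using namespace mlir;

void installHandler(llvm::SourceMgr &sourceMgr, MLIRContext &context) {
  // The trailing callable decides which locations in a call stack are
  // printed; here we only keep locations with file/line/column info.
  SourceMgrDiagnosticHandler handler(
      sourceMgr, &context, llvm::errs(),
      [](Location loc) { return loc.isa<FileLineColLoc>(); });
  // ... emit diagnostics while the handler is in scope ...
}
```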
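For the new threading utilities, a minimal sketch using `failableParallelForEach` from `mlir/IR/Threading.h` (the surrounding function and the per-module work are illustrative). The utility respects the context's multithreading setting, so the same code degrades to a serial loop when threading is disabled:

```cpp
#include "mlir/IR/BuiltinOps.h"
#include "mlir/IR/Threading.h"

using namespace mlir;

LogicalResult processAllModules(MLIRContext *context,
                                ArrayRef<ModuleOp> modules) {
  // Runs the lambda over the range in parallel (or serially if the context
  // has multithreading disabled) and fails if any iteration fails.
  return failableParallelForEach(context, modules, [](ModuleOp module) {
    // ... per-module work; return failure() to signal an error ...
    return success();
  });
}
```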
Table-driven Infrastructure
- ODS now generates accessors for attribute names, `<attr-name>AttrName()`, that return an Identifier. These should be used over raw strings whenever possible (see the sketch below).
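
A sketch of the intended usage, assuming a hypothetical ODS-defined op `MyOp` with an attribute named `value` (so that ODS generates a `valueAttrName()` accessor):

```cpp
// `MyOp` and its "value" attribute are hypothetical; `valueAttrName()` is
// the corresponding generated accessor, returning a uniqued Identifier.
void example(MyOp op) {
  // Preferred: the generated accessor for the attribute's name.
  Attribute attr = op->getAttr(op.valueAttrName());

  // Discouraged: a raw string literal, re-looked-up on every use.
  Attribute attrFromString = op->getAttr("value");
  (void)attr;
  (void)attrFromString;
}
```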
Codegen
- The Async dialect now has recursive work splitting (“Eigen style”) for parallel operations, which significantly improved performance; a generic sketch of the technique follows below.
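
As a rough illustration of the general technique (this is not the Async dialect's actual implementation): instead of eagerly enqueuing one task per block, each task recursively splits its range, hands one half to another worker, and runs small ranges inline, so load balances on demand. A self-contained sketch with plain `std::thread`:

```cpp
#include <cstddef>
#include <functional>
#include <thread>

// Generic "Eigen-style" recursive work splitting (illustrative only): split
// the range in half, process one half on another worker, recurse on the
// remainder, and execute small ranges directly. `fn` must be thread-safe.
void parallelApply(size_t begin, size_t end,
                   const std::function<void(size_t)> &fn,
                   size_t grainSize = 1024) {
  if (end - begin <= grainSize) {
    for (size_t i = begin; i < end; ++i)
      fn(i);
    return;
  }
  size_t mid = begin + (end - begin) / 2;
  // A real implementation would enqueue this half on a thread pool rather
  // than spawning a new thread per split.
  std::thread other([&] { parallelApply(mid, end, fn, grainSize); });
  parallelApply(begin, mid, fn, grainSize);
  other.join();
}
```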
In the Ecosystem
IREE: An Experimental MLIR Execution Environment
- IREE CPU backend can compile and execute a dynamically shaped MLP model (PR)
- The IREE stack does not rely on the shape dialect for backend shape management. Once converted, Linalg ops (and IREE’s high-level ops) themselves carry all shape information explicitly by design.
- So far, representing broadcasts in a more canonical form by avoiding dynamic-dimension broadcasts (and the associated reshapes) has gotten us pretty far. We are still looking for cases where this approach falls short, and for possible solutions.
- Work to enable more dynamism is proceeding jointly at the PyTorch/CHLO/TOSA levels of abstraction. We do not believe that MHLO proper is the right level of abstraction for (ranked) dynamic shapes, but redirecting more work to CHLO has simplified things and revealed patterns that can be applied across the frontends.
- Most instances of static pass registration have been removed from IREE (PR), with about 50 remaining in some low-level dialects. Some refactoring work is in progress to clean up legacy code and better represent the current state of IREE compilation (PR, PR).
- Target triples and data layout information for LLVM targets plumbed through (PR).
- CUDA Backend:
- Added a new op to the GPU dialect to represent a constant MMA matrix
- Expanded the lowering of vector ops to GPU MMA ops to support `scf` ops
- New facility for saving traces from Python model execution for later replay. Includes updated iree-run-trace and iree-benchmark-trace standalone tools for replaying a trace and benchmarking (minimal dependency, C-based for maximum portability). Will replace more ad-hoc mechanisms for adding benchmarking workloads.
mlir-npcomp: Prototype for compiling numerical Python programs
- 2021Q3 roadmap PR
- The `torch` dialect is now standalone. There is no longer a dependency on the `basicpy` dialect or builtin/std types/ops (PR, PR, PR, PR, PR, PR, PR, PR, PR)