See the previous published edition.
Welcome to the fourteenth issue of the MLIR (bi)Weekly, a newsletter (published on Friday) covering developments in MLIR and related projects in the ecosystem. MLIR (bi)Weekly is brought to you by a collective effort of contributors; we welcome your contributions!
- The “HLO Compiler” stack has been refactored out of TensorFlow and exposed as a standalone project: GitHub - tensorflow/mlir-hlo; it is intended as an incubator for technology expected to be upstreamed. MLIR NPComp already depends on it.
- The C API and Python Bindings are making steady progress.
- Dialect registration is being revamped; phase 1 has landed, exposing the new way of registering dialects with the context. The global registry is deprecated and will be removed “soon”.
- Dynamic pass pipelines are a newly proposed feature, which can be used, for example, to define a sub-pipeline that runs only on operations with a specific attribute, or to compute a cost model on a function and dynamically schedule a pipeline based on the result of this analysis.
- Support for `DialectHooks` has been removed, with the existing hooks replaced by dialect interfaces.
- The removal of `kind`s from Attributes/Types has been completed.
- Support for merging blocks has been added to the PatternRewriter.
Optimizations and Code Generation
- Nicolas presented Progress on CodeGen for the Vector Dialect at the MLIR open design meeting.
- Canonicalizations for all 1-D masked memory operations in the Vector dialect have been implemented:
- operations on all-one/all-zero masks are optimized into simpler, unmasked memory operations,
- although the LLVM backend also performs these simplifications, applying them to vector ops much earlier results in shorter IR and may expose further optimizations,
- this prepares for the progressive lowering of transfer ops.
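As a rough illustration of the all-one-mask case, here is a hedged MLIR sketch (op names and type syntax are approximate and may not match this snapshot of the Vector dialect exactly; `%pass_thru` is the pass-through operand of the masked load):

```mlir
// A masked load whose mask is a compile-time all-true constant...
%mask = vector.constant_mask [16] : vector<16xi1>
%v = vector.maskedload %base[%i], %mask, %pass_thru
    : memref<?xf32>, vector<16xi1>, vector<16xf32> into vector<16xf32>

// ...can be canonicalized into an ordinary, unmasked vector load:
%v = vector.load %base[%i] : memref<?xf32>, vector<16xf32>
```

Conversely, an all-zero mask folds the masked load away entirely, replacing its result with `%pass_thru`.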
- A very preliminary experiment with sparse computations running on CPU has been conducted to compare compiler-generated code tailored for SIMD with general-purpose solutions in the Intel MKL and Eigen libraries. Results are promising; more on that later.
- An initial patch for mlir-spirv-cpu-runner, which converts a SPIR-V module into an LLVM module, then compiles and executes it on CPU, has been sent out for review. It only handles SPIR-V kernels that represent scalar code, i.e. no multi-threading/parallelism, but it is still a valuable way of evaluating the SPIR-V → LLVM conversion.
- All the conversions needed to lower gpu.launch_func into LLVM calls have also been sent out for review as a separate patch.
- Lowering from Linalg to SPIR-V in IREE and in MLIR was presented at the MLIR open design meeting as part of the IREE Code Generation discussion.
- A dialect for OpenACC was introduced with three operations (parallel, loop, data). See the RFC for an introduction.
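For a flavor of the new dialect, a hypothetical snippet nesting the three operations might look like the following (operand and region syntax are sketched for illustration, not taken verbatim from the RFC):

```mlir
// Sketch: a data region wrapping a parallel region that contains a loop construct.
acc.data copy(%a : memref<1024xf32>) {
  acc.parallel {
    acc.loop {
      // loop body to be offloaded
    }
  }
}
```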
- Added support for a few clauses of the OpenMP parallel operation (if, num_threads and proc_bind).
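A hedged sketch of an `omp.parallel` operation carrying the three newly supported clauses (syntax approximated from the dialect at the time):

```mlir
// %cond : i1 controls whether the region actually runs in parallel;
// %n : i32 requests the number of threads; proc_bind selects the affinity policy.
omp.parallel if(%cond : i1) num_threads(%n : i32) proc_bind(close) {
  // parallel region body
  omp.terminator
}
```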
In the Ecosystem
mlir-npcomp: Prototype for compiling numpy programs
An ATen Dialect and an initial PyTorch interface to MLIR, with a reference lowering that calls back into PyTorch, have landed.
The “HLO Compiler” stack has been refactored and exposed as a standalone project: GitHub - tensorflow/mlir-hlo
CIRCT : Circuit IR Compilers and Tools aka ‘MLIR for hardware’
An initial SystemVerilog dialect has landed.
- MLIR Vector Dialect: Structured and Retargetable Compilation With Vectors (slides, recording)
- IREE CodeGen (slides, recording, transcript)
Compiling ONNX Neural Network Models Using MLIR
To represent neural network models, users often use Open Neural Network Exchange (ONNX), which is an open standard format for machine learning interoperability. We are developing a compiler for rewriting a model in ONNX into a standalone binary that is executable on different target hardware such as x86 machines, IBM Power Systems, and IBM System Z. The compiler was written using Multi-level Intermediate Representation (MLIR), a modern compiler infrastructure. In particular, we introduce two internal representations: ONNX IR for representing ONNX operators, and Kernel IR for efficiently lowering ONNX operators into LLVM bitcode. In this paper, we will discuss the overall structure of our compiler and give some practical examples of converting ONNX operators and models. We also cover several issues related to endianness. Our framework is publicly available as an open source project under the ONNX project.
This 8-page paper is a good introduction to ONNX and gives some insight into how the Affine dialect is leveraged in the compilation flow.