See the previous published edition.
Welcome to the tenth issue of the MLIR (bi)Weekly, a newsletter (published on Friday) covering developments in MLIR and related projects in the ecosystem. MLIR (bi)Weekly is brought to you by a collective effort of contributors; we welcome your contributions!
Highlights / General
- Multiple interesting RFCs have been shared on Discourse over the last two weeks:
- There is a new git pre-push hook and script to sanitize your commit messages: just run `./llvm/utils/git/arcfilter.sh` before pushing.
MLIR Core
Infrastructure
- The internal storage for interfaces has been revamped: this speeds up interface lookups and provides a generalization on which Interface support for Types and Attributes will be built.
- RewritePatterns are no longer required to provide a specific root operation and may omit it to match any operation type, enabling more generic patterns (see the sketch after this list).
- The way dialect conversion converts block argument types has been refactored: patterns are now responsible for converting the block arguments of their regions via `rewriter.convertRegionTypes` (see the sketch after this list).
- The `generate-test-checks.py` utility now supports attaching CHECK lines directly to a source file.
- `gen-op-decls` now supports filtering which operations to generate via `-op-regex=`.
- The `HasParent` trait can now be used to specify multiple possible parent operations.
- The `Op::OperandAdaptor` classes have been replaced by more general `Op::Adaptor` classes.
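
To make the root-less pattern item above more concrete, here is a minimal C++ sketch of a `RewritePattern` built with `MatchAnyOpTypeTag`. The pattern name, the `example.visited` attribute, and the rewrite it performs are purely illustrative, and the exact constructor form (ordering of the tag, benefit, and context arguments) varies between MLIR revisions.

```c++
#include "mlir/IR/PatternMatch.h"

using namespace mlir;

/// Illustrative pattern that is not tied to a specific root operation: it can
/// be applied to any operation visited by the pattern driver.
struct TagAnyOpPattern : public RewritePattern {
  // MatchAnyOpTypeTag signals that no root operation name is provided. Note:
  // the exact constructor signature differs across MLIR revisions (newer
  // versions take the tag first and also require an MLIRContext).
  TagAnyOpPattern() : RewritePattern(/*benefit=*/1, MatchAnyOpTypeTag()) {}

  LogicalResult matchAndRewrite(Operation *op,
                                PatternRewriter &rewriter) const override {
    // Hypothetical rewrite: tag every operation that has not been tagged yet
    // with a unit attribute, updating the root in place.
    if (op->getAttr("example.visited"))
      return failure();
    rewriter.updateRootInPlace(
        op, [&] { op->setAttr("example.visited", rewriter.getUnitAttr()); });
    return success();
  }
};
```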
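
The dialect-conversion and `Op::Adaptor` items can be sketched together. Everything named `mydialect`/`LoopOp`, the operand names, and the way the type converter is passed in are assumptions made for illustration; the exact return type of `convertRegionTypes` also differs between revisions, so it is simply discarded here.

```c++
#include "mlir/Transforms/DialectConversion.h"

using namespace mlir;

/// Hypothetical conversion pattern for a region-carrying "mydialect.loop" op.
struct LoopOpLowering : public ConversionPattern {
  LoopOpLowering(TypeConverter &converter, MLIRContext *ctx)
      : ConversionPattern(mydialect::LoopOp::getOperationName(),
                          /*benefit=*/1, ctx),
        converter(converter) {}

  LogicalResult
  matchAndRewrite(Operation *op, ArrayRef<Value> operands,
                  ConversionPatternRewriter &rewriter) const override {
    // The ODS-generated adaptor (formerly OperandAdaptor) wraps the already
    // remapped operands so they can be accessed by name rather than by index.
    mydialect::LoopOp::Adaptor adaptor(operands);
    Value lowerBound = adaptor.lowerBound(); // hypothetical operand names
    Value upperBound = adaptor.upperBound();

    // Patterns are now responsible for converting the block argument types of
    // the regions they keep; the return value is intentionally ignored since
    // its type varies across revisions.
    (void)rewriter.convertRegionTypes(&op->getRegion(0), converter);

    // ... build the replacement operation from lowerBound/upperBound and the
    // converted region here ...
    (void)lowerBound;
    (void)upperBound;
    return success();
  }

  TypeConverter &converter;
};
```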
Optimizations and Code Generation
- The lowering to the LLVM dialect now supports returning unranked memrefs.
- Buffer allocation has grown support for region-based control flow, which makes it possible to buffer-allocate the SCF dialect. Next up is support for operations that return an alias of their inputs.
- Speedups on x86-avx2 go up to 24x for `vector.constant_mask` and `vector.create_mask`, using a "SIMD" form in the 1-D case. Even though the former always results in an LLVM constant, rather lengthy IR used to be produced as an intermediate step, sometimes crashing the compiler for long vectors.
- Floating-point "horizontal" vector reductions now go through the X86 backend when the `reassoc` fast-math flag is set on the LLVM vector reduction intrinsic. Speedups on x86-avx2 range from 8x to over 20x compared to strict-order scalar reductions (this super-linear behavior is due to much cleaner, spill-free SIMD code).
SPIR-V
- More support for SPIR-V matrix types landed. @hazem added op definitions for `spv.MatrixTimesScalar` and `spv.Transpose`, and improved `spv.AccessChain` index type handling.
- A new pattern by Denis Khalikov to rewrite sequential chains of `spv.CompositeInsert` into `spv.CompositeConstruct` landed.
- The SPIR-V to LLVM conversion GSoC project is making good progress: @george added conversions covering more logical/cast ops and more bitwise and bitfield ops; `spv.func` and `spv.module` can now be converted as well.
- The Vulkan runner supports more memref element type bitwidths and was fixed to use staging memory and GPU local memory.
Other
- The shape dialect is evolving very rapidly and is gaining multiple lowering paths through the `scf` and `std` dialects.
- The `mlir-vulkan-runner` gained the ability to use GPU device memory.
In the Ecosystem
IREE: An Experimental MLIR Execution Environment
- Cross-compilation towards Android via CMake has landed. It currently supports the IREE core runtime (both the VMLA and Vulkan HAL drivers). Smoke tests for both VMLA and Vulkan pass on Android 10.
- A new table summarizing IREE's TensorFlow end-to-end test case coverage is online.
TensorFlow
- The plan for using MLIR code generation for XLA GPU was shared on the TensorFlow MLIR mailing list.
Recent Talks
- Basic GPU Compute Algorithm (slides - recording) - MLIR Open Design Meeting - Alexander Meißner
- OpenMP in Flang using MLIR - LLVM Compiler and Tools for HPC - ISC-HPC 2020 - Kiran Chandramohan, ARM