MLIR News, 10th edition (6/26/2020)

See the previous published edition.

Welcome to the tenth issue of the MLIR (bi)Weekly, a newsletter (published on Friday) covering developments in MLIR and related projects in the ecosystem. MLIR (bi)Weekly is brought to you by a collective effort of contributors; we welcome your contributions!

Highlights / General

MLIR Core

Infrastructure

  • Interface internal storage has been revamped: this will speed up interface lookup, and it provides a generalization on which we’ll build Interface support for Types and Attributes.
  • RewritePatterns are no longer required to provide a specific root operation, and may omit the root to match any operation type: this enables more generic patterns.
  • The way that dialect conversion converts block argument types has been refactored. Patterns are now responsible for converting region block arguments via rewriter.convertRegionTypes.
  • The generate-test-checks.py utility now supports attaching CHECK lines directly to a source file.
  • gen-op-decls now supports filtering which operations to generate via -op-regex=.
  • HasParent trait can now be used to specify multiple possible parent operations.
  • The Op::OperandAdaptor classes have been replaced by more general Op::Adaptor classes.
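
To make the root-less RewritePattern item above concrete, here is a toy Python model of pattern selection. It is only a sketch of the idea and not the MLIR C++ API: the Op and Pattern classes, the applicable_patterns helper, and the pattern labels are all hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Op:
    """Toy stand-in for an operation; only its name matters here."""
    name: str


@dataclass
class Pattern:
    """Toy stand-in for a rewrite pattern (hypothetical, not MLIR's class)."""
    label: str
    root: Optional[str] = None  # None models "no root: match any operation"


def applicable_patterns(op: Op, patterns: List[Pattern]) -> List[str]:
    """Return labels of patterns worth trying on `op`.

    A rooted pattern is considered only for ops with that exact name;
    a root-less pattern is considered for every op.
    """
    return [p.label for p in patterns if p.root is None or p.root == op.name]


patterns = [
    Pattern("fold-addi", root="std.addi"),
    Pattern("erase-if-unused"),  # root-less: generic over all op types
]
```

In this model a root-less pattern is simply tried on every operation, so its own match logic has to decide applicability, for example by checking a trait or an interface rather than the op name.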

Optimizations and Code Generation

  • The lowering to the LLVM dialect now supports returning unranked memrefs.
  • Buffer allocation has grown support for region-based control flow, which makes it possible to buffer-allocate the SCF dialect. Next up is support for operations that return an alias of their inputs.
  • Speedups on x86-avx2 go up to 24x for vector.constant_mask and vector.create_mask, using “SIMD” form in the 1-D case. Even though the former always results in an LLVM constant, rather lengthy IR used to be generated as an intermediate step, sometimes crashing the compiler for long vectors.
  • Floating-point “horizontal” vector reductions now go through the X86 backend when the reassoc fast-math flag is added to the LLVM vector reduction intrinsic. Speedups on x86-avx2 range from 8x to over 20x compared to strict-order scalar reductions (this “super” linear behavior is due to much cleaner, spill-free SIMD code).
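
The reassoc flag matters for the reduction item above because floating-point addition is not associative: a strict-order reduction has to add lanes one by one, while a reassociated one may combine them pairwise, which is the shape SIMD code prefers. The following Python sketch only illustrates that numerical difference; strict_sum and pairwise_sum are hypothetical helper names and do not model the code the X86 backend actually emits.

```python
from typing import List


def strict_sum(values: List[float]) -> float:
    """Left-to-right accumulation, as a strict-order reduction requires."""
    acc = 0.0
    for v in values:
        acc += v
    return acc


def pairwise_sum(values: List[float]) -> float:
    """Tree-shaped accumulation, allowed only if reassociation is permitted."""
    if not values:
        return 0.0
    if len(values) == 1:
        return values[0]
    mid = len(values) // 2
    return pairwise_sum(values[:mid]) + pairwise_sum(values[mid:])


# Chosen so that rounding makes the two evaluation orders disagree.
lanes = [1e16, 1.0, -1e16, 1.0]
strict_result = strict_sum(lanes)      # ((1e16 + 1) - 1e16) + 1
pairwise_result = pairwise_sum(lanes)  # (1e16 + 1) + (-1e16 + 1)
```

With these inputs the two orders give different results, which is why a compiler needs explicit permission before it can replace the sequential form with the tree form.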

SPIR-V

  • More support for SPIR-V matrix types landed. @hazem added op definitions for spv.MatrixTimesScalar and spv.Transpose, and improved spv.AccessChain index type handling.
  • A new pattern by Denis Khalikov to rewrite sequential chains of spv.CompositeInsert into spv.CompositeConstruct landed.
  • The SPIR-V to LLVM conversion GSoC project is making good progress. @george added conversions covering more logical/cast ops and more bitwise and bitfield ops. spv.func and spv.module can also be translated now.
  • The vulkan runner supports more memref element type bitwidths and is fixed to use staging memory and GPU local memory.
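
The idea behind rewriting a chain of spv.CompositeInsert ops into one spv.CompositeConstruct can be sketched with a toy model. The newsletter does not spell out the exact conditions the real pattern checks, so the rule used here (the chain must fill every element of the composite) is an assumption, and fold_insert_chain is a hypothetical helper, not part of MLIR.

```python
from typing import List, Optional, Tuple


def fold_insert_chain(
    size: int, inserts: List[Tuple[int, str]]
) -> Optional[List[str]]:
    """Model a chain of inserts into an undefined composite of `size` elements.

    `inserts` lists (index, value_name) pairs in program order. Returns the
    operand list for a single construct op, or None when the chain leaves
    some element undefined (assumed here to block the rewrite).
    """
    slots: List[Optional[str]] = [None] * size
    for index, value in inserts:
        if not 0 <= index < size:
            return None  # malformed chain: index out of range
        slots[index] = value  # a later insert overrides an earlier one
    if any(slot is None for slot in slots):
        return None
    return [slot for slot in slots if slot is not None]
```

For example, three inserts of %a, %b, and %c at indices 0, 1, and 2 of a three-element composite collapse into one construct with operands %a, %b, %c.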

Other

  • The shape dialect is evolving very rapidly and gaining multiple lowering paths through the scf and std dialects.
  • The mlir-vulkan-runner gained the ability to use GPU device memory.

In the Ecosystem

IREE : An Experimental MLIR Execution Environment

  • Cross-compilation towards Android via CMake has landed. It currently supports the IREE core runtime (both the VMLA and Vulkan HAL drivers). Smoke tests for both VMLA and Vulkan pass on Android 10.
  • A new table to summarize IREE’s TensorFlow end-to-end test case coverage is online.

TensorFlow

Recent Talks