Welcome to the 55th issue of the MLIR Newsletter, covering developments in MLIR and related projects in the ecosystem. We welcome your contributions (contact: email@example.com). Click here to see previous editions.
Highlights and Ecosystem
2023 US LLVM Dev Meeting Oct 10th to 12th [Program].
What's the purpose of a PDL pattern? Mehdi: "PDL is a bit more complex in that it compiles into a 'bytecode' format where, at runtime, when multiple patterns are loaded, their bytecode can be merged and the matching optimized to eliminate redundancies. See the talk: 2021-04-15: Pattern Descriptor Language; slides - recording. Another aspect is the ability to decouple the pattern abstraction and application from the 'authoring'; see the PDL dialect doc for info, as well as the PDLL DSL documentation (and the presentation from 2021-11-04: PDLL: a Frontend for PDL; slides - recording)."
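As a sketch of the declarative authoring PDLL enables, here is a hypothetical pattern (the pattern name, dialect ops, and constant value are illustrative, not from the source):

```pdll
// Hypothetical PDLL pattern: fold `x + 0 -> x` for i32 addition.
Pattern RemoveRedundantAddZero {
  let zero = op<arith.constant> {value = attr<"0 : i32">};
  let root = op<arith.addi>(input: Value, zero);
  replace root with input;
}
```

A pattern like this compiles down to PDL bytecode, which is what allows the runtime matcher to merge it with other loaded patterns.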
LLVM Weekly [506th Issue].
Mehdi fixed some AffineOps to properly declare their inherent affine map attribute, making them more consistent with properties. [click here for diff].
Daniil Dudkin: This patch [click here for diff] is part of a larger initiative aimed at fixing floating-point `min` operations in MLIR: [RFC] Fix floating-point `max` and `min` operations in MLIR.
Handle pointer attributes (noalias, nonnull, readonly, writeonly, dereferenceable, dereferenceable_or_null) for GPUs. [click here].
Mahesh added more `fill` canonicalization patterns. [click here for diff].
This [diff] by Amy Wang enables canonicalization to fold away unnecessary tensor.dim ops, which in turn enables folding away of other operations, as can be seen in conv_tensors_dynamic where affine.min operations were folded away.
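As a hypothetical illustration of the kind of fold this unlocks (the IR below is illustrative, not taken from the diff):

```mlir
// The first dimension of %t is statically 4, so the tensor.dim below can
// fold to a constant, which in turn unblocks folds in its users
// (e.g. affine.min ops fed by %dim).
%c0 = arith.constant 0 : index
%dim = tensor.dim %t, %c0 : tensor<4x?xf32>  // folds to a constant 4 : index
```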
This [diff] from Vinicius adds support for the zeroinitializer constant to LLVM dialect. It’s meant to simplify zero-initialization of aggregate types in MLIR, although it can also be used with non-aggregate types.
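A minimal sketch of zero-initializing an aggregate in the LLVM dialect (the exact op spelling and the struct type here are assumptions; see the diff for the actual name):

```mlir
// One op produces an all-zeros value of an aggregate type, instead of
// building the struct element by element.
%0 = llvm.mlir.zero : !llvm.struct<(i32, f32, array<4 x i8>)>
```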
Matthias landed a [diff] which provides a default (Interface) implementation for all ops that implement the DestinationStyleOpInterface. Result values of such ops are tied to operands, and tied values have the same type.
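For illustration, linalg.fill is a destination-style op (this IR is an example of the interface in use, not from the diff):

```mlir
// The result %res is tied to the outs operand %out and shares its type,
// which is what DestinationStyleOpInterface expresses.
%res = linalg.fill ins(%cst : f32) outs(%out : tensor<10xf32>) -> tensor<10xf32>
```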
This [linalg patch] allows supplying an optional memory space for the promoted buffer.
In this [commit] by Matthias Springer: scf.forall ops without shared outputs (i.e., fully bufferized ops) are lowered to scf.parallel. scf.forall ops are typically lowered by an earlier pass depending on the execution target. E.g., there are optimized lowerings for GPU execution. This new lowering is for completeness (convert-scf-to-cf can now lower all SCF loop constructs) and provides a simple CPU lowering strategy for testing purposes. scf.parallel is currently lowered to scf.for, which executes sequentially. The scf.parallel lowering could be improved in the future to run on multiple threads.
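A rough before/after sketch of this lowering (the loop bounds and body are illustrative):

```mlir
// Before: a fully bufferized scf.forall, i.e. no shared_outs.
scf.forall (%i) in (%c8) {
  memref.store %v, %buf[%i] : memref<8xf32>
}
// After, roughly: an scf.parallel over the same iteration space.
scf.parallel (%i) = (%c0) to (%c8) step (%c1) {
  memref.store %v, %buf[%i] : memref<8xf32>
}
```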
This [alloc-to-alloca conversion for memref] from Alex Zinenko introduces a simple conversion of a memref.alloc/dealloc pair into an alloca in the same scope, exposed as a transform op and a pattern. Allocas typically lower to stack allocations, as opposed to alloc/dealloc, which lower to significantly more expensive malloc/free calls. In addition, this can be combined with allocation hoisting from loops to further improve performance.
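The shape of the rewrite, sketched on illustrative IR:

```mlir
// Before: heap allocation with a matching dealloc in the same scope.
%a = memref.alloc() : memref<16xf32>
// ... uses of %a ...
memref.dealloc %a : memref<16xf32>

// After: a stack allocation; no dealloc is needed.
%a2 = memref.alloca() : memref<16xf32>
```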
Nicolas Vasilache - [commit] Extract data layout string attribute setting as a separate module pass. FuncToLLVM uses the data layout string attribute in 3 different ways:
– `LowerToLLVMOptions options(&getContext(), getAnalysis<DataLayoutAnalysis>().getAtOrAbove(m));`
– `options.dataLayout = llvm::DataLayout(this->dataLayout);`
– `m->setAttr(…, this->dataLayout);`
The 3rd way is unrelated to the other 2 and occurs after conversion, making it confusing. This revision separates this post-hoc module annotation functionality into its own pass. The convert-func-to-llvm pass loses its `data-layout` option and instead recovers it from the `llvm.data_layout` attribute attached to the module, when present. In the future, `LowerToLLVMOptions options(&getContext(), getAnalysis<DataLayoutAnalysis>().getAtOrAbove(m))` and `options.dataLayout = llvm::DataLayout(dataLayout);` should be unified.
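For reference, the module-level attribute the new pass attaches and convert-func-to-llvm now reads looks like this (the layout string below is a made-up example):

```mlir
module attributes {llvm.data_layout = "e-m:e-i64:64-n8:16:32:64-S128"} {
}
```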
MLIR RFC Discussions
In the “ConversionTarget”, why do we have both “addLegalDialect” and “addIllegalDialect”? Can’t you infer one from the other? Whatever is not marked legal could be treated as illegal, right? Why mark something illegal explicitly? — No, there is also “unknown” legality (see Dialect Conversion - MLIR), and its effect differs depending on the mode of conversion, as mentioned there.
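A non-compilable C++ sketch of the three legality states (the dialects chosen here are illustrative):

```cpp
// Sketch only: assumes the usual MLIR headers and registered dialects.
ConversionTarget target(ctx);
target.addLegalDialect<LLVM::LLVMDialect>();     // ops are always kept
target.addIllegalDialect<arith::ArithDialect>(); // ops must be converted
// Every other op has "unknown" legality: partial conversion leaves such
// ops in place, while full conversion fails if any remain.
```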
Q: “… the difference between canonicalization and sccp? As I have seen, both will use folders and constant materializers to replace ops with constants.” — Answer: “SCCP is using the dataflow framework to do control flow analysis; it can infer that something is a constant from this analysis. Canonicalization is a very local transformation that eagerly turns values into constants and tries to iterate greedily.”
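A hypothetical example of the difference (this IR is illustrative):

```mlir
// sccp's dataflow analysis can propagate %c1 through the block argument
// of ^bb1 and fold %r to a constant 2; a purely local fold cannot see
// across the branch into ^bb1.
func.func @f() -> i32 {
  %c1 = arith.constant 1 : i32
  cf.br ^bb1(%c1 : i32)
^bb1(%arg: i32):
  %r = arith.addi %arg, %arg : i32
  return %r : i32
}
```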
Questions on bufferization. Some answers from Matthias:
– The bufferization will only look at ops that have a tensor operand or tensor result.
– `to_memref` ops are used internally to connect bufferized IR and not-yet-bufferized IR, kind of like `unrealized_conversion_cast`, but for memref->tensor and tensor->memref conversions. These conversion ops can also survive bufferization in case of partial bufferization. These ops don’t work with other types; various other parts of the code base also assume tensor/memref types. I was looking at generalizing this to arbitrary “buffer” types (not just memref) at some point, but didn’t have a use case for it.
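A sketch of what these boundary ops look like after a partial bufferization (the types are illustrative):

```mlir
// Cast a not-yet-bufferized tensor into the bufferized world, and a
// bufferized memref back into tensor IR.
%m = bufferization.to_memref %t  : memref<8xf32>
%u = bufferization.to_tensor %m2 : memref<8xf32>
```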
– The analysis maintains alias sets and equivalence sets. These are sets of tensor SSA values. There is no tensor SSA value here. Maybe we could put the entire `!dialect.struct` value in there.