I’ve put up Pull Request #131663 on llvm/llvm-project, “[mlir] Add a contiguous<perm, offset> layout, use as identity layout”, for this change. Since it’s a substantial change to a core MLIR primitive, I figure it deserves an RFC thread so that people who aren’t watching the PRs know to comment.
To reproduce the description:
This PR introduces a new ContiguousLayoutAttr, which holds a permutation of the dimensions of the memref and an optional offset, and replaces the default memref layout (which was previously the N-D identity map) with contiguous.
In general, the syntax for this attribute is
contiguous<[I0, I1, .. IN], offset: O>
where I0 through IN are integers in 0..=N and O is either a static offset or ? for a dynamic one. If the offset is 0, the offset isn’t printed, while if the permutation is [0, 1, ... N], we print it as N+1. That is, the 2-D identity/row-major layout is contiguous<2> and not (d0, d1) -> (d0, d1) like it used to be.
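To make the printing rules concrete, here is a small Python sketch of them. The helper `format_contiguous` is hypothetical and only models the textual form described above; it is not the printer from the PR.

```python
def format_contiguous(perm, offset=0):
    """Render a contiguous<...> layout following the printing rules above.

    `perm` is a list of dimension indices; `offset` is an int, or "?" for
    a dynamic offset. Hypothetical helper, not part of MLIR.
    """
    rank = len(perm)
    if sorted(perm) != list(range(rank)):
        raise ValueError(f"{perm} is not a permutation of 0..{rank - 1}")
    # The identity permutation is abbreviated to the rank.
    if list(perm) == list(range(rank)):
        body = str(rank)
    else:
        body = "[" + ", ".join(str(p) for p in perm) + "]"
    # A static zero offset is not printed.
    if offset != 0:
        body += f", offset: {offset}"
    return f"contiguous<{body}>"


print(format_contiguous([0, 1]))          # contiguous<2>
print(format_contiguous([0, 1, 2], "?"))  # contiguous<3, offset: ?>
print(format_contiguous([1, 0]))          # contiguous<[1, 0]>
```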
Motivation
In summary, the contiguous<> layout fills in the “layout hierarchy” (all contiguous layouts are strided, and all strided layouts are affine maps, but you can’t go back down) with a primitive that both enables useful optimizations and makes it easier to have relocatable/mergeable allocations in MLIR code.
Consider memref<?0 x ?1 x ?2 x T>, a memref with three dynamic dimensions. This memref has a row-major identity layout.
Suppose I want to make this memref “relocatable”: declare that it has an unknown offset so that I can, for example, have a pass that merges allocations into larger contiguous buffers. With the current layouts in MLIR, I can either use:
- strided<[?, ?, 1], offset: ?>, which loses the fact that this is a row-major memref. We don’t know the relationship of those two ?s to each other.
- (d0, d1, d2)[s0] -> (d0, d1, d2 + s0), which isn’t a “strided” layout by existing definitions and runs into the fact that many memref operations don’t handle non-strided or arbitrary affine layouts.
Being able to use contiguous<3, offset: ?> (or, in its long form, contiguous<[0, 1, 2], offset: ?>) resolves this issue. That is now a strided layout that directly encodes the fact that this is a 3-D row-major memref with some dynamic offset.
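As an illustration of what “row-major with some offset” pins down, here is a Python sketch. The helper names are made up for this post and the shape is an arbitrary static example. The point is that the strides are fully determined by the shape, which is exactly the relationship strided<[?, ?, 1], offset: ?> forgets.

```python
def row_major_strides(shape):
    """Strides (in elements) of a row-major buffer: suffix products of the shape."""
    strides = [1] * len(shape)
    for i in range(len(shape) - 2, -1, -1):
        strides[i] = strides[i + 1] * shape[i + 1]
    return strides


def linear_address(indices, shape, offset):
    """Element address of `indices` in a row-major buffer that starts at `offset`."""
    return offset + sum(i * s for i, s in zip(indices, row_major_strides(shape)))


shape = (4, 5, 6)
print(row_major_strides(shape))                      # [30, 6, 1]
print(linear_address((1, 2, 3), shape, offset=100))  # 145
```

Relocating the buffer only changes `offset`; the strides stay tied to the shape.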
As seen in my changes to some passes like gpu-decompose-memrefs or the vector transfer op flattener, knowing that a layout is contiguous, even if not necessarily row-major, allows us to use operations like affine.linearize_index for index computations. These fold well with operations like affine.delinearize_index, eliminating the unnecessary “divide an ID into parts and multiply them together again” computations that often come up in tiling-based code generation and that the affine map simplifier either has difficulty with or handles inefficiently.
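The folding opportunity comes from the two operations being inverses over the same basis. The Python functions below are a simplified model of that arithmetic under the assumption that every index is in bounds; they are not the MLIR ops themselves and ignore details such as the unbounded outermost result.

```python
def delinearize_index(linear, basis):
    """Split a flat index into per-dimension indices, outermost first."""
    indices = []
    for size in reversed(basis):
        indices.append(linear % size)
        linear //= size
    return list(reversed(indices))


def linearize_index(indices, basis):
    """Combine per-dimension indices (outermost first) into a flat index."""
    linear = 0
    for index, size in zip(indices, basis):
        linear = linear * size + index
    return linear


basis = (4, 5, 6)
print(delinearize_index(45, basis))         # [1, 2, 3]
print(linearize_index([1, 2, 3], basis))    # 45
```

Because the round trip is the identity for in-bounds values, a linearize that consumes the results of a delinearize over the same basis can be replaced by the original flat index, with no divisions or multiplications left.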
This layout also allows describing permuted layouts, like column-major layouts, without needing code to handle the general complexity of an affine map layout. For example,
memref.expand_shape %arg [[0, 1], [2]]
: memref<?x?xi32, contiguous<[1, 0]>>
into memref<?x?x?xi32, contiguous<[1, 2, 0]>>
accurately describes the effects of expand_shape’ing a column-major memref.
Why change the default layout?
Since the built-in layout attributes form a hierarchy of specificity (all contiguous layouts are strided …), there are multiple ways to represent the identity row-major layout. The contiguous layout is the most specific of these, so it makes sense to declare it the canonical form of the identity layout. That is, strided<[?, ?, 1]> is a less specific layout for memref<?x?x?xi32>. The identity affine_map also has non-canonical forms and is less specific: code that can handle the identity AffineMapAttr may not know what to do with other affine maps because of how general they are, but it will be easier to go from the identity ContiguousLayoutAttr to permuted and/or offset attributes.
Therefore, making the contiguous layout the default form of MemRefLayoutAttrInterface makes writing memref-handling code easier going forward.
Concrete impacts of the change
- memref<...xT, affine_map<(d0, d1, ..., dN) -> (d0, d1, ... dN)>> no longer prints as memref<...xT>.
- Similarly, the default memref layout is no longer an AffineMapAttr. This didn’t break any code in-tree, since almost everything had moved to MemRefLayoutAttrInterface::getAffineMap(), but it’s worth calling out.
- memref.subview, memref.reinterpret_cast, and so on do not always produce a strided layout: if code needed to create strided<[], offset: O>, it’ll now create contiguous<0, offset: O>, and similarly for strided<[1], offset: O>, which is a 1-D contiguous layout. This is facilitated by the new StridedLayout::getCanonical method, which doesn’t always return a strided layout.
- Some passes have been updated to use affine.linearize_index disjoint when they were flattening a contiguous (subset of) a memref, allowing for more efficient code generation compared to an affine.apply over the strides.
- getStridesAndOffset() has learned a new trick for affine maps: any “offset permutation” (that is, a permutation where the last result can be dX + E for any E) is now considered strided. This means that you can now getStridesAndOffset a memref<MxNxf32, affine_map<(i, j) -> (j, i)>>, which would previously fail.
- MemRefType::canonicalizeLayout has been updated to canonicalize strided layouts to their contiguous equivalent for static-shaped memrefs.
- bufferization.buffer_layout can be any MemRefLayoutAttrInterface, and any identity maps present in such attributes are transparently migrated to their contiguous<> equivalents.
- Certain reshape folders will now work with any row-major layout, even if it has an offset.
While this is a breaking change, we expect that it will allow long-term improvements to how MLIR represents memrefs in common situations.
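For intuition about the strided-to-contiguous canonicalization mentioned in the list above, here is a rough Python sketch. It is an assumption-laden model, not the logic in the PR: `canonicalize_static_layout` is a hypothetical name, it only recognizes the identity (row-major) permutation, and it does not special-case unit dimensions.

```python
def canonicalize_static_layout(shape, strides, offset=0):
    """Return the layout as text, preferring contiguous<...> when it applies.

    A static-shaped strided layout is treated as contiguous when its strides
    are exactly the row-major suffix products of `shape`.
    """
    expected = [1] * len(shape)
    for i in range(len(shape) - 2, -1, -1):
        expected[i] = expected[i + 1] * shape[i + 1]
    suffix = "" if offset == 0 else f", offset: {offset}"
    if list(strides) == expected:
        return f"contiguous<{len(shape)}{suffix}>"
    return f"strided<{list(strides)}{suffix}>"


print(canonicalize_static_layout((), [], 8))        # contiguous<0, offset: 8>
print(canonicalize_static_layout((16,), [1], "?"))  # contiguous<1, offset: ?>
print(canonicalize_static_layout((4, 5), [5, 1]))   # contiguous<2>
print(canonicalize_static_layout((4, 5), [10, 1]))  # strided<[10, 1]>
```

The first two calls mirror the strided<[], offset: O> and strided<[1], offset: O> cases from the list; the last one shows a padded layout that stays strided.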