Welcome to the 68th issue of the MLIR Newsletter, covering developments in MLIR and related projects in the ecosystem. We welcome your contributions (contact: javed.absar@gmail.com). Click here to see previous editions.
Highlights, Discussions & RFCs
- The LLVM Foundation invites all developers and users of LLVM and related sub-projects to present at the 2024 LLVM Developers’ Meeting! The conference will be held in Santa Clara, CA on October 22-24.
- [RFC] Linalg OpDSL constant list attribute definition?. Renato hit a snag when trying to … “remove the `transpose_(a|b)` from `matmul` and `batch_matmul`, I have hit a snag on the indexing. I really don’t want to create a whole new DSL just for this case, …”
- [RFC]: A polynomial approximation pass. Jeremy Kun is looking for early feedback on a design sketch. He “plans to start working on a prototype in the HEIR project (GitHub - google/heir: A compiler for homomorphic encryption) until the design kinks are worked out and it’s ready to upstream”.
- [RFC] Vector Distribution for CPU (convert vector to physical register size vector). This pass is proposed so that … “we can use `tile vector` and vectorize `for loop` according to the hardware SIMD instructions. `tile vector` means that the vector doesn’t care about hardware information like how big a vector can be stored in a `cpu register`. We directly operate on the `tile vector` because some passes do not need to care about hardware information, which can reduce the difficulty of pass transformation”.
- Active discussions on:
– [RFC] Region-based control-flow with early exits in MLIR;
– RFC: [mlir][Vector][Affine] SuperVectortize: Optimization for misaligned data - #3 by codemzs;
– [RFC] Improvements in the 'quant' dialect;
– [RFC] Sharding Framework Design for Device Mesh - #136 by sogartar
MLIR Commits Recently:
- Matthias implemented the `BufferizableOpInterface` for `linalg.softmax`. The op is not a `LinalgOp`, so it is not covered by the “catch all” `LinalgOp` interface implementation. [click here].
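  As a rough illustration (a hedged sketch, not the exact test case from the commit), implementing the bufferization interface means a tensor-based `linalg.softmax` can be rewritten by One-Shot Bufferize into its memref form:

  ```mlir
  // Before bufferization: destination-passing-style op on tensors.
  %0 = linalg.softmax dimension(1)
      ins(%input : tensor<2x16x32xf32>)
      outs(%init : tensor<2x16x32xf32>) -> tensor<2x16x32xf32>

  // After bufferization (conceptually): the same op on memrefs, with no results.
  linalg.softmax dimension(1)
      ins(%input_buf : memref<2x16x32xf32>)
      outs(%out_buf : memref<2x16x32xf32>)
  ```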
- Javed added a new mlir-opt pass, `--linalg-specialize-generic-ops`, which lifts a `linalg.generic` to a linalg named op where possible, much like `-linalg-generalize-named-ops` lowers named ops to `linalg.generic`. [click here].
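  For instance (a hedged sketch; the exact IR the pass accepts and produces is defined by the pass itself), a generic with matmul semantics is a candidate for specialization back to `linalg.matmul`:

  ```mlir
  #mapA = affine_map<(m, n, k) -> (m, k)>
  #mapB = affine_map<(m, n, k) -> (k, n)>
  #mapC = affine_map<(m, n, k) -> (m, n)>

  // A linalg.generic whose indexing maps and body match matmul semantics;
  // --linalg-specialize-generic-ops aims to lift such ops to linalg.matmul.
  func.func @generic_matmul(%A: tensor<4x8xf32>, %B: tensor<8x16xf32>,
                            %C: tensor<4x16xf32>) -> tensor<4x16xf32> {
    %0 = linalg.generic
        {indexing_maps = [#mapA, #mapB, #mapC],
         iterator_types = ["parallel", "parallel", "reduction"]}
        ins(%A, %B : tensor<4x8xf32>, tensor<8x16xf32>)
        outs(%C : tensor<4x16xf32>) {
    ^bb0(%a: f32, %b: f32, %c: f32):
      %mul = arith.mulf %a, %b : f32
      %acc = arith.addf %c, %mul : f32
      linalg.yield %acc : f32
    } -> tensor<4x16xf32>
    return %0 : tensor<4x16xf32>
  }
  ```

  Running `mlir-opt --linalg-specialize-generic-ops` over such IR should replace the generic with the named op; `-linalg-generalize-named-ops` goes in the opposite direction.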
- Hsiangkai Wang implemented Conv2D using the Winograd Conv2D algorithm. The implementation is based on the paper Fast Algorithms for Convolutional Neural Networks ([1509.09308] Fast Algorithms for Convolutional Neural Networks). [click here].
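  For reference, the cited paper computes each output tile $Y$ from an input tile $d$ and filter $g$ as

  $$Y = A^T \big[ (G g G^T) \odot (B^T d B) \big] A,$$

  where $\odot$ is element-wise multiplication and $A$, $B$, $G$ are small constant Winograd transform matrices, trading multiplications in the convolution for cheaper element-wise products.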
- srcarroll introduced pattern rewrites for reducing the rank of named linalg contraction ops with unit spatial dim(s) to other named contraction ops. For example, `linalg.batch_matmul` with batch size 1 → `linalg.matmul`, and `linalg.matmul` with a unit LHS spatial dim → `linalg.vecmat`, etc. [click here].
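  A hedged, conceptual sketch of the batch-size-1 case (the actual patterns may reshape operands differently):

  ```mlir
  // Before: a batch_matmul whose batch dimension is 1.
  %0 = linalg.batch_matmul
      ins(%A, %B : tensor<1x4x8xf32>, tensor<1x8x16xf32>)
      outs(%C : tensor<1x4x16xf32>) -> tensor<1x4x16xf32>

  // After (conceptually): collapse the unit batch dim and use the
  // lower-rank named contraction op; the result would be expanded back
  // to the original 3-D type for the op's users.
  %a = tensor.collapse_shape %A [[0, 1], [2]]
      : tensor<1x4x8xf32> into tensor<4x8xf32>
  %b = tensor.collapse_shape %B [[0, 1], [2]]
      : tensor<1x8x16xf32> into tensor<8x16xf32>
  %c = tensor.collapse_shape %C [[0, 1], [2]]
      : tensor<1x4x16xf32> into tensor<4x16xf32>
  %m = linalg.matmul
      ins(%a, %b : tensor<4x8xf32>, tensor<8x16xf32>)
      outs(%c : tensor<4x16xf32>) -> tensor<4x16xf32>
  ```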
- Quinn Dawkins landed a diff which allows bubbling `tensor.pack` through `tensor.pad` when the pad has multiple uses. A new pad is created and a `tensor.unpack` is inserted to connect the packed pad with the new users. To keep the previous behavior, the layout propagation control function can be modified to disallow multi-use propagation. [click here].
- A crash in `vector.insert` folding was fixed. The `InsertOpConstantFolder` had assumed that whenever the destination can be folded to a constant attribute, that attribute must be a `DenseElementsAttr`, which is not always the case. [click here].
- Han-Chung switched `ControlPropagationFn` to use an `OpOperand*`. It is not easy to determine whether we want to propagate pack/unpack ops when we do not know the (producer, consumer) information. This revision switches the callback to `OpOperand*`, so the control function can capture the (producer, consumer) pair. [click here].
- Felix fixed crashes in the parser on linalg ops without operands. [click here].
- Simplify bare pointer handling. Before this commit, there used to be a workaround in the `func.func`/`gpu.func` op lowering when the bare-pointer calling convention is enabled. This workaround “patched up” the argument materializations for memref arguments. This can be done directly in the argument materialization functions (as the TODOs in the code base indicate). This commit effectively reverts back to the old implementation. [click here].
- srcarroll refactored `LoopFuseSiblingOp` and added support for parallel fusion. The patch refactors code related to the `LoopFuseSiblingOp` transform in an attempt to reduce duplicated common code. The aim is to refactor as much as possible into functions on `LoopLikeOpInterface`; a full refactor will require more additions to `LoopLikeOpInterface`. [click here].
- Donald Chen fixed a bufferize deallocation error in nested symbols. [click here].
- Renato removed linalg matmul unsigned. According to Renato, this is the first PR in a list of many that will simplify the linalg operations by using similar attributes. See [RFC] Transpose attribute for Linalg matmul operations.
- Cullen Rhodes added a new `vector.step` operation to the Vector dialect. [click here].
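  Sketching the assembly (hedged; check the op documentation for the exact form), `vector.step` produces a 1-D vector of `index` values forming the linear sequence 0, 1, …, N-1:

  ```mlir
  // %s == [0, 1, 2, 3]: a linearly increasing sequence of index values.
  %s = vector.step : vector<4xindex>
  ```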
- Matthias fixed a bug in the bufferization of elementwise ops. One-Shot Bufferize has an optimization for ops that bufferize to elementwise access, where a copy can sometimes be avoided. [click here].
- This diff generalizes `DropUnitDimFromElementwiseOps` to support inner unit dimensions. [click here].
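  A hedged before/after sketch of what dropping an inner unit dimension around a vector elementwise op can look like (`vector.shape_cast` is used here purely for illustration; the actual pattern output may differ):

  ```mlir
  // Before: elementwise op on a vector type with an inner unit dim.
  %before = arith.mulf %a, %b : vector<4x1x8xf32>

  // After (conceptually): the unit dim is dropped around the op.
  %a0 = vector.shape_cast %a : vector<4x1x8xf32> to vector<4x8xf32>
  %b0 = vector.shape_cast %b : vector<4x1x8xf32> to vector<4x8xf32>
  %m  = arith.mulf %a0, %b0 : vector<4x8xf32>
  %after = vector.shape_cast %m : vector<4x8xf32> to vector<4x1x8xf32>
  ```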
Related Projects
- Triton community meeting - https://www.youtube.com/watch?v=uRlqolhNbRk
- IREE community meeting - https://www.youtube.com/watch?v=b779to--7es
- OpenXLA community meeting - https://www.youtube.com/watch?v=YK1CLzIcsJ8&t=2s
Useful Links