Hi everyone,
TL;DR: I’m thinking about making a few cleanup changes to the Vector dialect and thought that removing the legacy vector.extractelement
and vector.insertelement
operations in favor of their bigger siblings vector.extract
and vector.insert
would be a good start. If we agree on this, this post will serve as an initial PSA of the change. Please let me know what you think! Collaboration on this work is more than welcome!
Background
Currently, we have two operations for extracting elements from a vector value: vector.extractelement
and vector.extract
. vector.extractelement
was introduced in early stages of MLIR as a carbon copy of the LLVM extractelement
instruction. This operation can only extract scalar elements from a 1-D or 0-D vector. Some time later, the Vector dialect evolved towards n-D vectors and vector.extractelement
was introduced to model scalar and sub-vector extractions from an n-D vector. To some extent, vector.extract
is a superset of vector.extractelement
, although the former currently does not support extractions from 0-D vectors. The same applies to the vector.insertelement
and vector.insert
operations.
Proposal
I propose that we add support for 0-D vectors to vector.extract
and vector.insert
and remove vector.extractelement
and vector.insertelement
. The replacement of vector.extractelement/insertelement
’s uses would be mechanical once vector.extract/insert
support 0-D vectors.
This change will improve the Vector dialect by:
- Removing ambiguity from the vector representation when it comes to extracting elements from a vector value.
- Removing redundant patterns/transformations currently implemented for both
vector.extract/insert
and vector.extractelement/insertelement
.
- Filling implementation gaps in patterns/transformations where only one of the operation flavors is supported.
Action Plan
- Introduce 0-D support in
vector.extract/insert
ops.
- Fill any implementation gap in
vector.extract/insert
ops. Make sure that all the vector.extractelement/insertelement
patterns and transformations are also supported for vector.extract/insert
.
- Remove
vector.extractelement/insertelement
from upstream except for the ops themselves. Send another PSA with the imminent removal of the ops. We can add warnings to the ops’ builders for downstream users, if necessary.
- Remove
vector.extractelement/insertelement
operations.
Please let me know if you have any concerns or feedback! This is also a call for collaboration on this work. I’ll definitely need help!
Thanks,
Diego
cc: @nicolasvasilache, @aartbik, @ThomasRaoux, @banach-space, @javiersetoain
4 Likes
Hi Diego,
Thank you for taking the initiative on this. I’m on board, the quest is righteous and the plan is sound You can count on me to help with the work. If you write the initial support for 0-D in insert/extract
, we can make a list issues with whatever breaks when replacing insert/extractelement
with insert/extract
, and I’ll start chipping away at it. Feel free to add me as a reviewer for all related patches, and we can jump into a call to coordinate if necessary.
Cheers!
Javier
2 Likes
+1
This is a good example of “2 Ops would be better than 4” Thanks for proposing this Diego! I am also available to help. Either as a reviewer or more hands-on, whatever works better.
-Andrzej
1 Like
+1 thanks much for looking into reducing redundancy there!
The action item #1 should be completed with ⚙ D152644 [mlir][Vector] Add support for 0-D vectors to vector.insert/extract.
I’m now looking at #2 and, in particular, at adding support for Value
indices to vector.extract/insert
ops. They only support constant indices right now. The implementation is almost done but there are around 88 lit test files that would need to be updated to the new vector.extract/insert
format. Unfortunately, I can’t think of a ninja regex way to update them so I would really appreciate some help here. Basically, we have to replace check rules like:
// CHECK-NEXT: %[[R2:.+]] = vector.extract %{{.*}}[0, 0] : vector<1x1x16xf32>
with something like:
// CHECK-DAG: %[[C0:.+]] = arith.constant 0 : index
// CHECK-DAG: %[[C0_1:.+]] = arith.constant 0 : index
...
// CHECK-NEXT: %[[R2:.+]] = vector.extract %{{.*}}[%[[C0]], %[[C0_1]]] : vector<1x1x16xf32>
where C0
, C0_1
, etc. can be random indices and be located anywhere in the output.
If anybody has cycles to work on this, please call out the tests you want to work on in the review. You can share your diffs over GH also there. I’ll make sure those contributing are added as co-authors. To repro, you can just apply the ongoing diff and run the check-mlir
target.
Thanks in advance!
Diego
cc: @banach-space, @javiersetoain, @nicolasvasilache, @ThomasRaoux, @hanchung
3 Likes
Just to check, do vector.insert and vector.extract still only support constant indexes? We need variable indexes downstream.
I’ve now seen the patch.
Yeah, you should be able to temporarily apply the ongoing patch but, please, note that the implementation and assembly format are subject to change.
I noticed there hadn’t been an update in a while. So, I sent out a pr for Step #3 of the action plan: [mlir][Vector] Remove uses of vector.extractelement/vector.insertelement by Groverkss · Pull Request #113827 · llvm/llvm-project · GitHub
I also sent out a patch that updates vector.extractelement/vector.insertelement documentation to discourage usage of these ops until they are removed: [mlir][Vector] Depreciate vector.extractelement/vector.insertelement by Groverkss · Pull Request #113829 · llvm/llvm-project · GitHub
Thanks @dcaballe for all the previous work. This patch was mostly cleanup, with the hard parts already done by @dcaballe .
Thanks a lot! The main roadblock I found after my PR was the gaps in canonicalizations/optimizations between the two ops. Not sure those gaps have been addressed already. Have you noticed any impact in performance?
Hi! I did an analysis of the current folders/canonicalizations implemented for vector.extractelement/vector.insertelement and found this:
Canonicalizations
vector.extractelement/vector.insertelement do not have any canonicalizations.
Folders
vector.extractelement has folders for:
vector.insertelement has folders for:
- DenseElementsAttr folder: vector.insert has a canonicalization pattern that does this. The reasoning for it not being a folder is same as vector.extract.
Conversion to LLVM
Conversion to SPIRV
Other transformations
The only pattern that runs on vector.extractelement/vector.insertelement that I could find was: llvm-project/mlir/lib/Dialect/Vector/Transforms/VectorTransferOpTransforms.cpp at main · llvm/llvm-project · GitHub which seems to be implemented for both, vector.extract and vector.extractelement.
What I understand by looking at this, is that vector.extract/vector.insert have support whatever vector.extractelement/vector.insertelement supports + more folders + more transforms.
I’m not sure of what benchmarks to try for performance though, they should ideally generate the same or better code based on the support, but with my next patch to remove uses of vector.extractelement/vector.insertelement, I could see how it affects IREE cpu performance. If you have any particular benchmarks in mind, I’m happy to try them as well.
Let me know if i missed any of the canonicalizers / folders / transformations.
1 Like
Thanks for the analysis! Great to see that point 2 is converging. Yes, I think a quick run of IREE CI should be enough to move forward. Thanks for helping with this!
1 Like
Thank you for this analysis!
Interesting to see the lack of symmetry between vector.extractelement
and vector.insertelement
(looking at folders). Looks like a gap waiting to be filled
Given that all of this is already in progress, we might have missed the opportunity to benchmark “before” and “after”.
Given that this is already in progress, we might have missed our opportunity to track the perf impact Still, could you try a deeplabv3
and a bert
benchmark? Not too concerned which variant you choose.
Thanks for this great clean-up work!
-Andrzej