To get the above tiling to work with transform.structured.fuse, I had to make the following modifications to tileConsumerAndFuseProducersUsingSCF:
--- a/mlir/lib/Dialect/SCF/Transforms/TileUsingInterface.cpp
+++ b/mlir/lib/Dialect/SCF/Transforms/TileUsingInterface.cpp
@@ -1045,7 +1045,9 @@ mlir::scf::tileConsumerAndFuseProducersUsingSCF(
   };
   std::deque<tensor::ExtractSliceOp> candidates;
-  addCandidateSlices(tiledAndFusedOps.back(), candidates);
+  for (auto *op : tiledAndFusedOps)
+    addCandidateSlices(op, candidates);
+
   OpBuilder::InsertionGuard g(rewriter);
   while (!candidates.empty()) {
     // Traverse the slices in BFS fashion.
@@ -1087,7 +1089,8 @@ mlir::scf::tileConsumerAndFuseProducersUsingSCF(
             fusedResult->tiledAndFusedProducer.getDefiningOp()) {
       fusedProducers.insert(fusedResult->origProducer.getDefiningOp());
       tiledAndFusedOps.insert(tiledAndFusedOp);
-      addCandidateSlices(tiledAndFusedOp, candidates);
+      for (auto *op : fusedResult->tiledOps)
+        addCandidateSlices(op, candidates);
     }
   }
While this does not cause any failures in the existing lit test suite, @qed mentioned that it could be a foot-gun.
@MaheshRavishankar mentioned that we might need to extend SCFTilingResult so that it also contains a list of the tensor.extract_slice ops created by tiling.
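
To make the difference between the options concrete, here is a rough, MLIR-free sketch. Everything in it is hypothetical: ToyOp, ToyTilingResult, its generatedSlices field and the three seedFrom* helpers are stand-ins I made up for illustration, not existing MLIR types or APIs. It only models how the candidate worklist gets seeded: from the last tiled op, from every tiled op (what the patch above does), or from a list of slices recorded by tiling itself (my reading of the SCFTilingResult suggestion).

```cpp
#include <cassert>
#include <deque>
#include <string>
#include <vector>

// Toy stand-in for an operation; NOT an MLIR type.
struct ToyOp {
  std::string name;
  bool isExtractSlice;           // true if this models a tensor.extract_slice
  std::vector<ToyOp *> operands; // ops that define this op's operands
};

// Hypothetical analogue of a tiling result that also records the slices
// created while tiling (the `generatedSlices` field is an assumption).
struct ToyTilingResult {
  std::vector<ToyOp *> tiledOps;
  std::vector<ToyOp *> generatedSlices;
};

// Models the idea behind addCandidateSlices: enqueue every operand of `op`
// that is produced by an extract_slice.
void addCandidateSlices(const ToyOp *op, std::deque<ToyOp *> &candidates) {
  assert(op && "expected a non-null op");
  for (ToyOp *operand : op->operands)
    if (operand && operand->isExtractSlice)
      candidates.push_back(operand);
}

// Seed the worklist from the last tiled op only.
std::deque<ToyOp *> seedFromLastOp(const ToyTilingResult &result) {
  std::deque<ToyOp *> candidates;
  if (!result.tiledOps.empty())
    addCandidateSlices(result.tiledOps.back(), candidates);
  return candidates;
}

// Seed the worklist from every tiled op.
std::deque<ToyOp *> seedFromAllOps(const ToyTilingResult &result) {
  std::deque<ToyOp *> candidates;
  for (const ToyOp *op : result.tiledOps)
    addCandidateSlices(op, candidates);
  return candidates;
}

// Seed the worklist directly from the slices recorded by tiling, without
// rediscovering them by walking operands.
std::deque<ToyOp *> seedFromRecordedSlices(const ToyTilingResult &result) {
  return std::deque<ToyOp *>(result.generatedSlices.begin(),
                             result.generatedSlices.end());
}
```

With two tiled ops that each read their own slice, seeding from the last op would miss one of the slices, while the other two strategies see both. The recorded-slices variant has the extra property that it only ever enqueues slices that tiling created, which may be what makes it less of a foot-gun than walking the operands of every tiled op.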