Hello there,
I have been working on a project for a few months, mainly using the affine dialect. My project has a pipeline of loop transformation passes. Two days ago, I ran into an issue in one of the passes where the affine map of an affine.load seems incorrect. I tried to isolate the issue by printing the IR just before this buggy pass and then running the pass on the printed IR. However, when I did that, the affine map seemed correct. I tried adding createCanonicalizerPass() and createGenerateRuntimeVerificationPass() before the buggy pass, but they did not help. I am still learning MLIR, but I have never had this issue before, and I am wondering if anyone knows how to approach it. Here is an example of printing the affine.load op, its affine.for loops, and its affine map:
1- Printing from the buggy pass within the pass pipeline:
// printing from the buggy pass within the pass pipeline
affine.for %arg4 = 0 to 16 {
affine.for %arg5 = 0 to 32 {
affine.for %arg6 = 0 to 32 {
%2 = affine.load %alloc_3[%arg4, %arg5, %arg6] : memref<16x32x32xf32> //the AffineLoadOp
affine.store %2, %alloc_5[%arg4, symbol(%arg5) + 1, symbol(%arg6) + 1] : memref<16x34x34xf32>
}
}
}
%2 = affine.load %alloc_3[%arg4, %arg5, %arg6] : memref<16x32x32xf32> // loadOp.dump();
(d0, d1, d2) -> (d2, d0, d1) // loadOp.getAffineMap().dump(); <= incorrect
(d0, d1, d2) -> (d2, d0, d1) // loadOp.getMap().dump(); <= incorrect
2- Printing from the buggy pass using the standalone command line tool:
// printing from the buggy pass using the standalone command line tool
affine.for %arg4 = 0 to 16 {
affine.for %arg5 = 0 to 32 {
affine.for %arg6 = 0 to 32 {
%2 = affine.load %alloc_3[%arg4, %arg5, %arg6] : memref<16x32x32xf32> //the AffineLoadOp
affine.store %2, %alloc_5[%arg4, symbol(%arg5) + 1, symbol(%arg6) + 1] : memref<16x34x34xf32>
}
}
}
%2 = affine.load %alloc_3[%arg4, %arg5, %arg6] : memref<16x32x32xf32> // loadOp.dump();
(d0, d1, d2) -> (d0, d1, d2) // loadOp.getAffineMap().dump(); <= correct
(d0, d1, d2) -> (d0, d1, d2) // loadOp.getMap().dump(); <= correct
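One possibility I can think of (an assumption on my part, not something I have confirmed) is that the textual form of affine.load shows the map results with the operands already substituted in, so a permuted map with permuted operands could print exactly like an identity map with the operands in order. Below is a minimal sketch of that idea in plain C++. It does not use the MLIR API; ToyPermutationMap, renderMap, and renderSubscripts are hypothetical names, and the operand order [%arg5, %arg6, %arg4] is a guess chosen only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy stand-in for a permutation-only affine map: result i is d<perm[i]>.
// Hypothetical type for illustration, not part of MLIR.
struct ToyPermutationMap {
  std::vector<std::size_t> perm;
};

// Renders the map on its own, e.g. "(d0, d1, d2) -> (d2, d0, d1)".
std::string renderMap(const ToyPermutationMap &map) {
  std::string dims, results;
  for (std::size_t i = 0; i < map.perm.size(); ++i) {
    if (i != 0) {
      dims += ", ";
      results += ", ";
    }
    dims += "d" + std::to_string(i);
    results += "d" + std::to_string(map.perm[i]);
  }
  return "(" + dims + ") -> (" + results + ")";
}

// Renders the subscript list with operands substituted into the map results:
// result i is shown as the operand bound to d<perm[i]>.
std::string renderSubscripts(const ToyPermutationMap &map,
                             const std::vector<std::string> &operands) {
  assert(map.perm.size() == operands.size());
  std::string out = "[";
  for (std::size_t i = 0; i < map.perm.size(); ++i) {
    assert(map.perm[i] < operands.size());
    if (i != 0)
      out += ", ";
    out += operands[map.perm[i]];
  }
  return out + "]";
}
```

In this toy model, the map {2, 0, 1} with operands {"%arg5", "%arg6", "%arg4"} and the identity map {0, 1, 2} with operands {"%arg4", "%arg5", "%arg6"} both render the same subscript list, even though renderMap differs. If something similar is happening here, dumping loadOp.getMapOperands() next to the map, or printing the IR in generic form (for example with --mlir-print-op-generic in mlir-opt), might show whether the in-memory op and the re-parsed op really differ.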
I am using llvmorg-18.1.2 llvm-project @ 26a1d66
I am lowering from linalg to affine using createConvertLinalgToAffineLoopsPass(), and in the pipeline, I have a pass that performs loop permutations (using the function permuteLoops from Utils/LoopUtils.h).
I may be missing something, and any help would be appreciated.