dpotop
March 24, 2020, 7:40pm
Hello, I extended mlir-opt with a dialect of mine, for which I wrote
just the lowering pass. I then call:
./main --lower-mydialect --convert-linalg-to-affine-loops --inline --affine-loop-fusion --fusion-maximal --canonicalize --inline examples/inline-bug.mlir
But when I look at the generated code of the inner loop of my code,
I can see:
affine.store %6, %4[%arg1, %arg2] : memref<10x20xf64>
%7 = affine.load %4[%arg1, %arg2] : memref<10x20xf64>
affine.store %7, %1[0, 0] : memref<1x1xf64>
%8 = affine.load %1[0, 0] : memref<1x1xf64>
affine.store %8, %5[%arg2, %arg1] : memref<20x10xf64>
%9 = affine.load %0[%arg0, 0] : memref<10x1xf64>
%10 = affine.load %5[%arg2, %arg1] : memref<20x10xf64>
Obviously, there are store/load sequences where the load is redundant: it reads back the value that was just stored to the same location.
Is this normal?
Dumitru
You may want to try -memref-dataflow-opt as well.
(also if you can, please provide actual reproducers in general)
dpotop:
affine.store %6, %4[%arg1, %arg2] : memref<10x20xf64>
%7 = affine.load %4[%arg1, %arg2] : memref<10x20xf64>
affine.store %7, %1[0, 0] : memref<1x1xf64>
%8 = affine.load %1[0, 0] : memref<1x1xf64>
affine.store %8, %5[%arg2, %arg1] : memref<20x10xf64>
%9 = affine.load %0[%arg0, 0] : memref<10x1xf64>
%10 = affine.load %5[%arg2, %arg1] : memref<20x10xf64>
As @mehdi_amini points out, you need -memref-dataflow-opt to forward the stores to the loads; the fusion pass won't do it for you. I just checked, and -memref-dataflow-opt does get rid of those store/load pairs on your snippet. It also removes intermediate allocations that become dead as a result.
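To make "forwarding the stores to the loads" concrete, here is a small Python toy model of the idea applied to the snippet above. It is only a sketch, not how the MLIR pass is implemented. The tuple encoding of the ops and the function name `forward_stores` are made up for illustration, and it assumes straight-line code in which distinct memrefs never alias. Unlike the real pass, it does not remove stores or allocations that become dead.

```python
# Toy store-to-load forwarding over a straight-line op list.
# Hypothetical encoding, for illustration only:
#   ("store", stored_value, memref, indices)
#   ("load",  result_value, memref, indices)

def forward_stores(ops):
    known = {}     # (memref, indices) -> value most recently stored there
    replaced = {}  # load result -> value that replaces it
    kept = []      # ops that survive
    for kind, value, memref, indices in ops:
        if kind == "store":
            # Use the forwarded value if the stored value came from a removed load.
            value = replaced.get(value, value)
            # Conservative: a store may overwrite any location of this memref,
            # so forget everything known about it before recording the new value.
            known = {k: v for k, v in known.items() if k[0] != memref}
            known[(memref, indices)] = value
            kept.append((kind, value, memref, indices))
        elif (memref, indices) in known:
            # The load re-reads a value we already have: drop it.
            replaced[value] = known[(memref, indices)]
        else:
            kept.append((kind, value, memref, indices))
    return kept, replaced


# The inner-loop ops from the original post.
ops = [
    ("store", "%6", "%4", ("%arg1", "%arg2")),
    ("load", "%7", "%4", ("%arg1", "%arg2")),
    ("store", "%7", "%1", ("0", "0")),
    ("load", "%8", "%1", ("0", "0")),
    ("store", "%8", "%5", ("%arg2", "%arg1")),
    ("load", "%9", "%0", ("%arg0", "0")),
    ("load", "%10", "%5", ("%arg2", "%arg1")),
]

kept, replaced = forward_stores(ops)
for op in kept:
    print(op)
print(replaced)
```

In this toy run, the loads producing %7, %8 and %10 are all replaced by %6, and only the load of %0 survives. The three stores remain here; removing the ones whose memrefs are never read again is the extra dead-allocation cleanup mentioned above.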