[MLIR Affine] Error with incompatible affine passes

Hello. I’m exploring the use of MLIR passes for lowering and optimization, and I hit an assertion failure with a specific pass sequence.

Below is my test file, "temp.mlir", which illustrates the issue:

func.func @test_reduce_sum(%arg0: tensor<13x21x3xf32>) -> tensor<21x3xf32> {
  %0 = "tosa.reduce_sum"(%arg0) {axis = 0 : i64} : (tensor<13x21x3xf32>) -> tensor<1x21x3xf32>
  %1 = "tosa.reshape"(%0) {new_shape = [21, 3]} : (tensor<1x21x3xf32>) -> tensor<21x3xf32>
  return %1 : tensor<21x3xf32>
}
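For reference, the two TOSA ops in this function amount to a keep-dims sum over axis 0 followed by dropping the unit dimension; in NumPy terms (a sketch for illustration, not part of the original report):

```python
import numpy as np

# Equivalent of tosa.reduce_sum {axis = 0} followed by tosa.reshape.
x = np.ones((13, 21, 3), dtype=np.float32)

# tosa.reduce_sum keeps the reduced dimension: (13, 21, 3) -> (1, 21, 3).
reduced = np.sum(x, axis=0, keepdims=True)

# tosa.reshape drops the unit dimension: (1, 21, 3) -> (21, 3).
result = reduced.reshape(21, 3)

print(result.shape)       # (21, 3)
print(float(result[0, 0]))  # 13.0, since 13 ones were summed along axis 0
```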

Steps to reproduce:

 mlir-opt temp.mlir \
   -pass-pipeline='func.func(tosa-to-linalg)' \
   -linalg-bufferize \
   -convert-linalg-to-affine-loops \
   -affine-parallelize \
   -affine-loop-tile

The execution crashes with the following backtrace:

mlir-opt: /data/llvm/mlir/lib/Dialect/Affine/Analysis/Utils.cpp:519: mlir::LogicalResult mlir::MemRefRegion::compute(mlir::Operation *, unsigned int, const mlir::ComputationSliceState *, bool): Assertion `isValidSymbol(symbol)' failed.
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace.
Stack dump:
0.      Program arguments: mlir-opt temp.mlir -pass-pipeline=func.func(tosa-to-linalg) -linalg-bufferize -convert-linalg-to-affine-loops -affine-parallelize -affine-loop-tile -debug
 #0 0x000000000047f08a llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) /data/llvm/llvm/lib/Support/Unix/Signals.inc:569:11
 #1 0x000000000047f23b PrintStackTraceSignalHandler(void*) /data/llvm/llvm/lib/Support/Unix/Signals.inc:636:1
 #2 0x000000000047d8b6 llvm::sys::RunSignalHandlers() /data/llvm/llvm/lib/Support/Signals.cpp:103:5
 #3 0x000000000047f965 SignalHandler(int) /data/llvm/llvm/lib/Support/Unix/Signals.inc:407:1
 #4 0x00007f72a941f980 __restore_rt (/lib/x86_64-linux-gnu/libpthread.so.0+0x12980)
 #5 0x00007f72a830fe87 raise /build/glibc-CVJwZb/glibc-2.27/signal/../sysdeps/unix/sysv/linux/raise.c:51:0
 #6 0x00007f72a83117f1 abort /build/glibc-CVJwZb/glibc-2.27/stdlib/abort.c:81:0
 #7 0x00007f72a83013fa __assert_fail_base /build/glibc-CVJwZb/glibc-2.27/assert/assert.c:89:0
 #8 0x00007f72a8301472 (/lib/x86_64-linux-gnu/libc.so.6+0x30472)
 #9 0x00000000029749f0 mlir::MemRefRegion::compute(mlir::Operation*, unsigned int, mlir::ComputationSliceState const*, bool) /data/llvm/mlir/lib/Dialect/Affine/Analysis/Utils.cpp:521:29
#10 0x0000000002978bd7 getMemoryFootprintBytes(mlir::Block&, llvm::ilist_iterator<llvm::ilist_detail::node_options<mlir::Operation, true, false, void>, false, false>, llvm::ilist_iterator<llvm::ilist_detail::node_options<mlir::Operation, true, false, void>, false, false>, int)::$_5::operator()(mlir::Operation*) const /data/llvm/mlir/lib/Dialect/Affine/Analysis/Utils.cpp:1302:21
#11 0x0000000002978b0d mlir::WalkResult llvm::function_ref<mlir::WalkResult (mlir::Operation*)>::callback_fn<getMemoryFootprintBytes(mlir::Block&, llvm::ilist_iterator<llvm::ilist_detail::node_options<mlir::Operation, true, false, void>, false, false>, llvm::ilist_iterator<llvm::ilist_detail::node_options<mlir::Operation, true, false, void>, false, false>, int)::$_5>(long, mlir::Operation*) /data/llvm/llvm/include/llvm/ADT/STLFunctionalExtras.h:45:12
#12 0x0000000002c94acc llvm::function_ref<mlir::WalkResult (mlir::Operation*)>::operator()(mlir::Operation*) const /data/llvm/llvm/include/llvm/ADT/STLFunctionalExtras.h:68:12
#13 0x0000000002c94650 mlir::detail::walk(mlir::Operation*, llvm::function_ref<mlir::WalkResult (mlir::Operation*)>, mlir::WalkOrder) /data/llvm/mlir/lib/IR/Visitors.cpp:181:12
#14 0x0000000002c945ce mlir::detail::walk(mlir::Operation*, llvm::function_ref<mlir::WalkResult (mlir::Operation*)>, mlir::WalkOrder) /data/llvm/mlir/lib/IR/Visitors.cpp:174:13
#15 0x0000000002c945ce mlir::detail::walk(mlir::Operation*, llvm::function_ref<mlir::WalkResult (mlir::Operation*)>, mlir::WalkOrder) /data/llvm/mlir/lib/IR/Visitors.cpp:174:13
#16 0x0000000002c945ce mlir::detail::walk(mlir::Operation*, llvm::function_ref<mlir::WalkResult (mlir::Operation*)>, mlir::WalkOrder) /data/llvm/mlir/lib/IR/Visitors.cpp:174:13
#17 0x0000000002978aa2 std::enable_if<llvm::is_one_of<mlir::Operation*, mlir::Operation*, mlir::Region*, mlir::Block*>::value, mlir::WalkResult>::type mlir::detail::walk<(mlir::WalkOrder)1, getMemoryFootprintBytes(mlir::Block&, llvm::ilist_iterator<llvm::ilist_detail::node_options<mlir::Operation, true, false, void>, false, false>, llvm::ilist_iterator<llvm::ilist_detail::node_options<mlir::Operation, true, false, void>, false, false>, int)::$_5&, mlir::Operation*, mlir::WalkResult>(mlir::Operation*, getMemoryFootprintBytes(mlir::Block&, llvm::ilist_iterator<llvm::ilist_detail::node_options<mlir::Operation, true, false, void>, false, false>, llvm::ilist_iterator<llvm::ilist_detail::node_options<mlir::Operation, true, false, void>, false, false>, int)::$_5&) /data/llvm/mlir/include/mlir/IR/Visitors.h:170:10
#18 0x0000000002978a08 std::enable_if<std::is_same<mlir::WalkResult, mlir::WalkResult>::value, mlir::WalkResult>::type mlir::Block::walk<(mlir::WalkOrder)1, getMemoryFootprintBytes(mlir::Block&, llvm::ilist_iterator<llvm::ilist_detail::node_options<mlir::Operation, true, false, void>, false, false>, llvm::ilist_iterator<llvm::ilist_detail::node_options<mlir::Operation, true, false, void>, false, false>, int)::$_5, mlir::WalkResult>(llvm::ilist_iterator<llvm::ilist_detail::node_options<mlir::Operation, true, false, void>, false, false>, llvm::ilist_iterator<llvm::ilist_detail::node_options<mlir::Operation, true, false, void>, false, false>, getMemoryFootprintBytes(mlir::Block&, llvm::ilist_iterator<llvm::ilist_detail::node_options<mlir::Operation, true, false, void>, false, false>, llvm::ilist_iterator<llvm::ilist_detail::node_options<mlir::Operation, true, false, void>, false, false>, int)::$_5&&) /data/llvm/mlir/include/mlir/IR/Block.h:311:11
#19 0x000000000297863f getMemoryFootprintBytes(mlir::Block&, llvm::ilist_iterator<llvm::ilist_detail::node_options<mlir::Operation, true, false, void>, false, false>, llvm::ilist_iterator<llvm::ilist_detail::node_options<mlir::Operation, true, false, void>, false, false>, int) /data/llvm/mlir/lib/Dialect/Affine/Analysis/Utils.cpp:1293:23
#20 0x00000000029785b9 mlir::getMemoryFootprintBytes(mlir::AffineForOp, int) /data/llvm/mlir/lib/Dialect/Affine/Analysis/Utils.cpp:1333:10
#21 0x000000000067d023 (anonymous namespace)::LoopTiling::getTileSizes(llvm::ArrayRef<mlir::AffineForOp>, llvm::SmallVectorImpl<unsigned int>*) /data/llvm/mlir/lib/Dialect/Affine/Transforms/LoopTiling.cpp:117:26
#22 0x000000000067c7c7 (anonymous namespace)::LoopTiling::runOnOperation() /data/llvm/mlir/lib/Dialect/Affine/Transforms/LoopTiling.cpp:174:9
#23 0x00000000029bbf7a mlir::detail::OpToOpPassAdaptor::run(mlir::Pass*, mlir::Operation*, mlir::AnalysisManager, bool, unsigned int) /data/llvm/mlir/lib/Pass/Pass.cpp:471:21
#24 0x00000000029bc574 mlir::detail::OpToOpPassAdaptor::runPipeline(mlir::OpPassManager&, mlir::Operation*, mlir::AnalysisManager, bool, unsigned int, mlir::PassInstrumentor*, mlir::PassInstrumentation::PipelineParentInfo const*) /data/llvm/mlir/lib/Pass/Pass.cpp:534:16
#25 0x00000000029c1988 mlir::detail::OpToOpPassAdaptor::runOnOperationAsyncImpl(bool)::$_12::operator()(mlir::detail::OpToOpPassAdaptor::runOnOperationAsyncImpl(bool)::OpPMInfo&) const /data/llvm/mlir/lib/Pass/Pass.cpp:754:36
#26 0x00000000029c15f9 mlir::LogicalResult mlir::failableParallelForEach<__gnu_cxx::__normal_iterator<mlir::detail::OpToOpPassAdaptor::runOnOperationAsyncImpl(bool)::OpPMInfo*, std::vector<mlir::detail::OpToOpPassAdaptor::runOnOperationAsyncImpl(bool)::OpPMInfo, std::allocator<mlir::detail::OpToOpPassAdaptor::runOnOperationAsyncImpl(bool)::OpPMInfo>>>, mlir::detail::OpToOpPassAdaptor::runOnOperationAsyncImpl(bool)::$_12&>(mlir::MLIRContext*, __gnu_cxx::__normal_iterator<mlir::detail::OpToOpPassAdaptor::runOnOperationAsyncImpl(bool)::OpPMInfo*, std::vector<mlir::detail::OpToOpPassAdaptor::runOnOperationAsyncImpl(bool)::OpPMInfo, std::allocator<mlir::detail::OpToOpPassAdaptor::runOnOperationAsyncImpl(bool)::OpPMInfo>>>, __gnu_cxx::__normal_iterator<mlir::detail::OpToOpPassAdaptor::runOnOperationAsyncImpl(bool)::OpPMInfo*, std::vector<mlir::detail::OpToOpPassAdaptor::runOnOperationAsyncImpl(bool)::OpPMInfo, std::allocator<mlir::detail::OpToOpPassAdaptor::runOnOperationAsyncImpl(bool)::OpPMInfo>>>, mlir::detail::OpToOpPassAdaptor::runOnOperationAsyncImpl(bool)::$_12&) /data/llvm/mlir/include/mlir/IR/Threading.h:46:18
#27 0x00000000029bd853 mlir::LogicalResult mlir::failableParallelForEach<std::vector<mlir::detail::OpToOpPassAdaptor::runOnOperationAsyncImpl(bool)::OpPMInfo, std::allocator<mlir::detail::OpToOpPassAdaptor::runOnOperationAsyncImpl(bool)::OpPMInfo>>&, mlir::detail::OpToOpPassAdaptor::runOnOperationAsyncImpl(bool)::$_12&>(mlir::MLIRContext*, std::vector<mlir::detail::OpToOpPassAdaptor::runOnOperationAsyncImpl(bool)::OpPMInfo, std::allocator<mlir::detail::OpToOpPassAdaptor::runOnOperationAsyncImpl(bool)::OpPMInfo>>&, mlir::detail::OpToOpPassAdaptor::runOnOperationAsyncImpl(bool)::$_12&) /data/llvm/mlir/include/mlir/IR/Threading.h:92:10
#28 0x00000000029bd10d mlir::detail::OpToOpPassAdaptor::runOnOperationAsyncImpl(bool) /data/llvm/mlir/lib/Pass/Pass.cpp:764:14
#29 0x00000000029bc227 mlir::detail::OpToOpPassAdaptor::runOnOperation(bool) /data/llvm/mlir/lib/Pass/Pass.cpp:655:5
#30 0x00000000029bbf6b mlir::detail::OpToOpPassAdaptor::run(mlir::Pass*, mlir::Operation*, mlir::AnalysisManager, bool, unsigned int) /data/llvm/mlir/lib/Pass/Pass.cpp:468:5
#31 0x00000000029bc574 mlir::detail::OpToOpPassAdaptor::runPipeline(mlir::OpPassManager&, mlir::Operation*, mlir::AnalysisManager, bool, unsigned int, mlir::PassInstrumentor*, mlir::PassInstrumentation::PipelineParentInfo const*) /data/llvm/mlir/lib/Pass/Pass.cpp:534:16
#32 0x00000000029bde3c mlir::PassManager::runPasses(mlir::Operation*, mlir::AnalysisManager) /data/llvm/mlir/lib/Pass/Pass.cpp:837:10
#33 0x00000000029bdd5c mlir::PassManager::run(mlir::Operation*) /data/llvm/mlir/lib/Pass/Pass.cpp:817:60
#34 0x00000000029b539c performActions(llvm::raw_ostream&, bool, bool, llvm::SourceMgr&, mlir::MLIRContext*, llvm::function_ref<mlir::LogicalResult (mlir::PassManager&)>) /data/llvm/mlir/lib/Tools/mlir-opt/MlirOptMain.cpp:84:17
#35 0x00000000029b50c3 processBuffer(llvm::raw_ostream&, std::unique_ptr<llvm::MemoryBuffer, std::default_delete<llvm::MemoryBuffer>>, bool, bool, bool, bool, llvm::function_ref<mlir::LogicalResult (mlir::PassManager&)>, mlir::DialectRegistry&, llvm::ThreadPool*) /data/llvm/mlir/lib/Tools/mlir-opt/MlirOptMain.cpp:124:12
#36 0x00000000029b4ecf mlir::MlirOptMain(llvm::raw_ostream&, std::unique_ptr<llvm::MemoryBuffer, std::default_delete<llvm::MemoryBuffer>>, llvm::function_ref<mlir::LogicalResult (mlir::PassManager&)>, mlir::DialectRegistry&, bool, bool, bool, bool, bool)::$_0::operator()(std::unique_ptr<llvm::MemoryBuffer, std::default_delete<llvm::MemoryBuffer>>, llvm::raw_ostream&) const /data/llvm/mlir/lib/Tools/mlir-opt/MlirOptMain.cpp:164:12
#37 0x00000000029b4ded mlir::LogicalResult llvm::function_ref<mlir::LogicalResult (std::unique_ptr<llvm::MemoryBuffer, std::default_delete<llvm::MemoryBuffer>>, llvm::raw_ostream&)>::callback_fn<mlir::MlirOptMain(llvm::raw_ostream&, std::unique_ptr<llvm::MemoryBuffer, std::default_delete<llvm::MemoryBuffer>>, llvm::function_ref<mlir::LogicalResult (mlir::PassManager&)>, mlir::DialectRegistry&, bool, bool, bool, bool, bool)::$_0>(long, std::unique_ptr<llvm::MemoryBuffer, std::default_delete<llvm::MemoryBuffer>>, llvm::raw_ostream&) /data/llvm/llvm/include/llvm/ADT/STLFunctionalExtras.h:45:12
#38 0x0000000002ae1789 llvm::function_ref<mlir::LogicalResult (std::unique_ptr<llvm::MemoryBuffer, std::default_delete<llvm::MemoryBuffer>>, llvm::raw_ostream&)>::operator()(std::unique_ptr<llvm::MemoryBuffer, std::default_delete<llvm::MemoryBuffer>>, llvm::raw_ostream&) const /data/llvm/llvm/include/llvm/ADT/STLFunctionalExtras.h:68:12
#39 0x0000000002ae0d65 mlir::splitAndProcessBuffer(std::unique_ptr<llvm::MemoryBuffer, std::default_delete<llvm::MemoryBuffer>>, llvm::function_ref<mlir::LogicalResult (std::unique_ptr<llvm::MemoryBuffer, std::default_delete<llvm::MemoryBuffer>>, llvm::raw_ostream&)>, llvm::raw_ostream&, bool, bool) /data/llvm/mlir/lib/Support/ToolUtilities.cpp:28:12
#40 0x00000000029b4159 mlir::MlirOptMain(llvm::raw_ostream&, std::unique_ptr<llvm::MemoryBuffer, std::default_delete<llvm::MemoryBuffer>>, llvm::function_ref<mlir::LogicalResult (mlir::PassManager&)>, mlir::DialectRegistry&, bool, bool, bool, bool, bool) /data/llvm/mlir/lib/Tools/mlir-opt/MlirOptMain.cpp:169:10
#41 0x00000000029b429a mlir::MlirOptMain(llvm::raw_ostream&, std::unique_ptr<llvm::MemoryBuffer, std::default_delete<llvm::MemoryBuffer>>, mlir::PassPipelineCLParser const&, mlir::DialectRegistry&, bool, bool, bool, bool, bool) /data/llvm/mlir/lib/Tools/mlir-opt/MlirOptMain.cpp:187:10
#42 0x00000000029b4c06 mlir::MlirOptMain(int, char**, llvm::StringRef, mlir::DialectRegistry&, bool) /data/llvm/mlir/lib/Tools/mlir-opt/MlirOptMain.cpp:269:14
#43 0x0000000000411c2e main /data/llvm/mlir/tools/mlir-opt/mlir-opt.cpp:237:7
#44 0x00007f72a82f2c87 __libc_start_main /build/glibc-CVJwZb/glibc-2.27/csu/../csu/libc-start.c:344:0
#45 0x0000000000411aca _start (/data/llvm/mlir/build/bin/mlir-opt+0x411aca)
Aborted (core dumped)

I suspect the combination of the affine passes -affine-parallelize and -affine-loop-tile causes this error.

Thanks,
Bealle

You can run mlir-opt with --mlir-print-ir-after-all or --mlir-print-ir-before-all to find out which of the passes crashes and why.
If you believe there’s a bug in one of them, you can file it on GitHub.
You can also start by looking at the open issues for the affine dialect in the llvm/llvm-project GitHub tracker. Perhaps your bug is already there :slight_smile:
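For example, the debugging run might look like this (a sketch based on the command from the report; --mlir-print-ir-after-all dumps the IR to stderr after each pass, so the last printed module is the input of the crashing pass):

```
mlir-opt temp.mlir \
  -pass-pipeline='func.func(tosa-to-linalg)' \
  -linalg-bufferize \
  -convert-linalg-to-affine-loops \
  -affine-parallelize \
  -affine-loop-tile \
  --mlir-print-ir-after-all 2> ir-dump.log
```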

Thank you. I will try to find out what caused the crash.
It looks like a bug, and I have just submitted this issue to GitHub: [MLIR] Crash with incompatible affine passes.

When I remove the last pass, -affine-loop-tile, the pipeline executes successfully. The lowered MLIR is as follows:

module {
  func.func @test_reduce_sum(%arg0: tensor<13x21x3xf32>) -> tensor<21x3xf32> {
    %cst = arith.constant 0.000000e+00 : f32
    %0 = bufferization.to_memref %arg0 : memref<13x21x3xf32>
    %1 = memref.alloc() {alignment = 128 : i64} : memref<21x3xf32>
    affine.parallel (%arg1) = (0) to (21) {
      affine.parallel (%arg2) = (0) to (3) {
        affine.store %cst, %1[%arg1, %arg2] : memref<21x3xf32>
      }
    }
    %2 = memref.alloc() {alignment = 128 : i64} : memref<21x3xf32>
    memref.copy %1, %2 : memref<21x3xf32> to memref<21x3xf32>
    affine.for %arg1 = 0 to 13 {
      affine.parallel (%arg2) = (0) to (21) {
        affine.parallel (%arg3) = (0) to (3) {
          %4 = affine.load %0[%arg1, %arg2, %arg3] : memref<13x21x3xf32>
          %5 = affine.load %2[%arg2, %arg3] : memref<21x3xf32>
          %6 = arith.addf %4, %5 : f32
          affine.store %6, %2[%arg2, %arg3] : memref<21x3xf32>
        }
      }
    }
    %3 = bufferization.to_tensor %2 : memref<21x3xf32>
    return %3 : tensor<21x3xf32>
  }
}

I wonder why the affine.for was not fully converted to affine.parallel by the -affine-parallelize pass. Why was the outer affine.for kept?

The crash appears if I add -affine-loop-tile to the end of the command. I suspect the -affine-loop-tile pass cannot handle the mixed nest of affine.for and affine.parallel loops.

Interestingly, the problem occurs when lowering certain TOSA ops, e.g., tosa.reduce_sum, tosa.reduce_prod, and tosa.reduce_min. I think these TOSA reduction ops trigger this crash because they share the same affine structure after lowering. Only these ops produce a mixed loop nest after affine parallelization; the others don’t have this problem.

Because it is not a parallel loop. The affine.store %6, %2[%arg2, %arg3] : memref<21x3xf32> writes to the same memory location on different iterations of the outer loop, which would create a race condition should these iterations be executed concurrently. I had written code that detects reductions, but I don’t remember whether it was committed in MLIR, in Polygeist downstream, or at all.
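In Python terms, the lowered loop nest has roughly this shape: the outer loop carries a dependence through the accumulator (each iteration reads and writes the same elements), while the two inner loops touch distinct elements and are safe to parallelize (a sketch for illustration, not from the thread):

```python
import numpy as np

x = np.random.rand(13, 21, 3).astype(np.float32)
acc = np.zeros((21, 3), dtype=np.float32)

# Outer loop: every iteration reads AND writes acc[j, k], so iterations
# must run in order (a reduction); naive parallelization would race.
for i in range(13):
    # Inner loops: distinct (j, k) pairs touch distinct acc elements,
    # so these can safely become affine.parallel.
    for j in range(21):
        for k in range(3):
            acc[j, k] = acc[j, k] + x[i, j, k]

assert np.allclose(acc, x.sum(axis=0))
```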

Affine loop tiling likely predates affine.parallel and therefore cannot handle it. There is very little maintenance of this part of the code base.
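For context, loop tiling rewrites a loop into an inter-tile loop over blocks and an intra-tile loop within each block, visiting exactly the same iterations; conceptually (a plain-Python sketch of classic tiling, not MLIR's implementation):

```python
# Tile a loop of 21 iterations with tile size 8: the tiled version
# visits the same iterations in the same order, grouped into blocks
# of at most 8 (the last block is shorter: 0..7, 8..15, 16..20).
def untiled():
    return [j for j in range(21)]

def tiled(tile=8):
    order = []
    for jj in range(0, 21, tile):                 # inter-tile loop
        for j in range(jj, min(jj + tile, 21)):   # intra-tile loop
            order.append(j)
    return order

assert tiled() == untiled()
```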

Thank you for clearing up my confusion. The coexistence of affine.for and affine.parallel in a loop nest may lead to errors in subsequent optimizations, e.g., loop tiling and loop fusion, because there seems to be no constraint that -affine-loop-tile must run before -affine-parallelize; they are all optimization passes.

BTW, should a crash like this, caused by incompatible passes, be considered a bug?

All of these crashes are to be treated as bugs. As @ftynse mentions, many of these passes were written before things in MLIR were (or could be) generalized, or with assumptions about the mix of ops the pass would see (e.g., tiling would typically be applied before parallelization or conversion to affine.parallel). Many of these issues are also pretty easy to fix or are good beginner tasks. Contributions are welcome. In case you prefer that these are fixed by someone else, please feel free to file an issue on LLVM GitHub issues and assign it to me. Thanks.
