[MLIR] Problems with lowering affine

I ran into some problems when using the -affine-data-copy-generate pass.

Here is my input, temp.mlir:

func.func @test_greater(%arg0: tensor<13x21x1xf32>, %arg1: tensor<13x21x3xf32>) -> tensor<13x21x3xi1> {
  %0 = "tosa.greater"(%arg0, %arg1) : (tensor<13x21x1xf32>, tensor<13x21x3xf32>) -> tensor<13x21x3xi1>
  return %0 : tensor<13x21x3xi1>
}

Execution command:

mlir-opt temp.mlir \
 -pass-pipeline="func.func(tosa-to-linalg)" \
 -linalg-bufferize -convert-linalg-to-affine-loops \
 -affine-loop-coalescing \
 -affine-data-copy-generate=generate-dma=false \
 -lower-affine

Running the command above crashes with the following assertion failure:

mlir-opt: /data/llvm15/mlir/lib/Dialect/Affine/IR/AffineOps.cpp:1789: auto foldLoopBounds(mlir::AffineForOp)::(anonymous class)::operator()(bool) const: Assertion `boundMap.getNumResults() >= 1 && "bound maps should have at least one result"' failed.
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace.
Stack dump:
0.      Program arguments: mlir-opt temp.mlir -pass-pipeline=func.func(tosa-to-linalg) -linalg-bufferize -convert-linalg-to-affine-loops -affine-loop-coalescing -affine-data-copy-generate=generate-dma=false -lower-affine
 #0 0x000000000047f08a llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) /data/llvm15/llvm/lib/Support/Unix/Signals.inc:569:11
 #1 0x000000000047f23b PrintStackTraceSignalHandler(void*) /data/llvm15/llvm/lib/Support/Unix/Signals.inc:636:1
 #2 0x000000000047d8b6 llvm::sys::RunSignalHandlers() /data/llvm15/llvm/lib/Support/Signals.cpp:103:5
 #3 0x000000000047f965 SignalHandler(int) /data/llvm15/llvm/lib/Support/Unix/Signals.inc:407:1
 #4 0x00007f7576dd9980 __restore_rt (/lib/x86_64-linux-gnu/libpthread.so.0+0x12980)
 #5 0x00007f7575cc9e87 raise /build/glibc-CVJwZb/glibc-2.27/signal/../sysdeps/unix/sysv/linux/raise.c:51:0
 #6 0x00007f7575ccb7f1 abort /build/glibc-CVJwZb/glibc-2.27/stdlib/abort.c:81:0
 #7 0x00007f7575cbb3fa __assert_fail_base /build/glibc-CVJwZb/glibc-2.27/assert/assert.c:89:0
 #8 0x00007f7575cbb472 (/lib/x86_64-linux-gnu/libc.so.6+0x30472)
 #9 0x00000000005a4a3e foldLoopBounds(mlir::AffineForOp)::$_24::operator()(bool) const /data/llvm15/mlir/lib/Dialect/Affine/IR/AffineOps.cpp:1790:31
#10 0x0000000000585747 foldLoopBounds(mlir::AffineForOp) /data/llvm15/mlir/lib/Dialect/Affine/IR/AffineOps.cpp:1810:25
#11 0x00000000005855f6 mlir::AffineForOp::fold(llvm::ArrayRef<mlir::Attribute>, llvm::SmallVectorImpl<mlir::OpFoldResult>&) /data/llvm15/mlir/lib/Dialect/Affine/IR/AffineOps.cpp:1976:27
#12 0x00000000005f35fa mlir::LogicalResult mlir::Op<mlir::AffineForOp, mlir::OpTrait::OneRegion, mlir::OpTrait::VariadicResults, mlir::OpTrait::ZeroSuccessors, mlir::OpTrait::VariadicOperands, mlir::OpTrait::SingleBlockImplicitTerminator<mlir::AffineYieldOp>::Impl, mlir::OpTrait::OpInvariants, mlir::OpTrait::AutomaticAllocationScope, mlir::OpTrait::HasRecursiveSideEffects, mlir::LoopLikeOpInterface::Trait, mlir::RegionBranchOpInterface::Trait>::foldHook<mlir::AffineForOp>(mlir::Operation*, llvm::ArrayRef<mlir::Attribute>, llvm::SmallVectorImpl<mlir::OpFoldResult>&) /data/llvm15/mlir/include/mlir/IR/OpDefinition.h:1825:50
#13 0x00000000005f35a1 std::enable_if<!llvm::is_one_of<mlir::OpTrait::OneResult<mlir::AffineForOp>, mlir::OpTrait::OneRegion<mlir::AffineForOp>, mlir::OpTrait::VariadicResults<mlir::AffineForOp>, mlir::OpTrait::ZeroSuccessors<mlir::AffineForOp>, mlir::OpTrait::VariadicOperands<mlir::AffineForOp>, mlir::OpTrait::SingleBlockImplicitTerminator<mlir::AffineYieldOp>::Impl<mlir::AffineForOp>, mlir::OpTrait::OpInvariants<mlir::AffineForOp>, mlir::OpTrait::AutomaticAllocationScope<mlir::AffineForOp>, mlir::OpTrait::HasRecursiveSideEffects<mlir::AffineForOp>, mlir::LoopLikeOpInterface::Trait<mlir::AffineForOp>, mlir::RegionBranchOpInterface::Trait<mlir::AffineForOp>>::value && detect_has_fold<mlir::AffineForOp>::value, llvm::unique_function<mlir::LogicalResult (mlir::Operation*, llvm::ArrayRef<mlir::Attribute>, llvm::SmallVectorImpl<mlir::OpFoldResult>&) const>>::type mlir::Op<mlir::AffineForOp, mlir::OpTrait::OneRegion, mlir::OpTrait::VariadicResults, mlir::OpTrait::ZeroSuccessors, mlir::OpTrait::VariadicOperands, mlir::OpTrait::SingleBlockImplicitTerminator<mlir::AffineYieldOp>::Impl, mlir::OpTrait::OpInvariants, mlir::OpTrait::AutomaticAllocationScope, mlir::OpTrait::HasRecursiveSideEffects, mlir::LoopLikeOpInterface::Trait, mlir::RegionBranchOpInterface::Trait>::getFoldHookFnImpl<mlir::AffineForOp>()::'lambda'(mlir::Operation*, llvm::ArrayRef<mlir::Attribute>, llvm::SmallVectorImpl<mlir::OpFoldResult>&)::operator()(mlir::Operation*, llvm::ArrayRef<mlir::Attribute>, llvm::SmallVectorImpl<mlir::OpFoldResult>&) const /data/llvm15/mlir/include/mlir/IR/OpDefinition.h:1785:14
#14 0x00000000005f3545 mlir::LogicalResult llvm::detail::UniqueFunctionBase<mlir::LogicalResult, mlir::Operation*, llvm::ArrayRef<mlir::Attribute>, llvm::SmallVectorImpl<mlir::OpFoldResult>&>::CallImpl<std::enable_if<!llvm::is_one_of<mlir::OpTrait::OneResult<mlir::AffineForOp>, mlir::OpTrait::OneRegion<mlir::AffineForOp>, mlir::OpTrait::VariadicResults<mlir::AffineForOp>, mlir::OpTrait::ZeroSuccessors<mlir::AffineForOp>, mlir::OpTrait::VariadicOperands<mlir::AffineForOp>, mlir::OpTrait::SingleBlockImplicitTerminator<mlir::AffineYieldOp>::Impl<mlir::AffineForOp>, mlir::OpTrait::OpInvariants<mlir::AffineForOp>, mlir::OpTrait::AutomaticAllocationScope<mlir::AffineForOp>, mlir::OpTrait::HasRecursiveSideEffects<mlir::AffineForOp>, mlir::LoopLikeOpInterface::Trait<mlir::AffineForOp>, mlir::RegionBranchOpInterface::Trait<mlir::AffineForOp>>::value && detect_has_fold<mlir::AffineForOp>::value, llvm::unique_function<mlir::LogicalResult (mlir::Operation*, llvm::ArrayRef<mlir::Attribute>, llvm::SmallVectorImpl<mlir::OpFoldResult>&) const>>::type mlir::Op<mlir::AffineForOp, mlir::OpTrait::OneRegion, mlir::OpTrait::VariadicResults, mlir::OpTrait::ZeroSuccessors, mlir::OpTrait::VariadicOperands, mlir::OpTrait::SingleBlockImplicitTerminator<mlir::AffineYieldOp>::Impl, mlir::OpTrait::OpInvariants, mlir::OpTrait::AutomaticAllocationScope, mlir::OpTrait::HasRecursiveSideEffects, mlir::LoopLikeOpInterface::Trait, mlir::RegionBranchOpInterface::Trait>::getFoldHookFnImpl<mlir::AffineForOp>()::'lambda'(mlir::Operation*, llvm::ArrayRef<mlir::Attribute>, llvm::SmallVectorImpl<mlir::OpFoldResult>&) const>(void*, mlir::Operation*, llvm::ArrayRef<mlir::Attribute>, llvm::SmallVectorImpl<mlir::OpFoldResult>&) /data/llvm15/llvm/include/llvm/ADT/FunctionExtras.h:222:12
#15 0x0000000002c59d3f llvm::unique_function<mlir::LogicalResult (mlir::Operation*, llvm::ArrayRef<mlir::Attribute>, llvm::SmallVectorImpl<mlir::OpFoldResult>&) const>::operator()(mlir::Operation*, llvm::ArrayRef<mlir::Attribute>, llvm::SmallVectorImpl<mlir::OpFoldResult>&) const /data/llvm15/llvm/include/llvm/ADT/FunctionExtras.h:410:12
#16 0x0000000002c592bc mlir::RegisteredOperationName::foldHook(mlir::Operation*, llvm::ArrayRef<mlir::Attribute>, llvm::SmallVectorImpl<mlir::OpFoldResult>&) const /data/llvm15/mlir/include/mlir/IR/OperationSupport.h:334:12
#17 0x0000000002c51a0e mlir::Operation::fold(llvm::ArrayRef<mlir::Attribute>, llvm::SmallVectorImpl<mlir::OpFoldResult>&) /data/llvm15/mlir/lib/IR/Operation.cpp:491:31
#18 0x0000000002b3ee59 mlir::OpBuilder::tryFold(mlir::Operation*, llvm::SmallVectorImpl<mlir::Value>&) /data/llvm15/mlir/lib/IR/Builders.cpp:420:18
#19 0x0000000002a621bd (anonymous namespace)::OperationLegalizer::legalizeWithFold(mlir::Operation*, mlir::ConversionPatternRewriter&) /data/llvm15/mlir/lib/Transforms/Utils/DialectConversion.cpp:1923:23
#20 0x0000000002a61cb2 (anonymous namespace)::OperationLegalizer::legalize(mlir::Operation*, mlir::ConversionPatternRewriter&) /data/llvm15/mlir/lib/Transforms/Utils/DialectConversion.cpp:1885:17
#21 0x0000000002a6129f (anonymous namespace)::OperationConverter::convert(mlir::ConversionPatternRewriter&, mlir::Operation*) /data/llvm15/mlir/lib/Transforms/Utils/DialectConversion.cpp:2407:26
#22 0x0000000002a5a78b (anonymous namespace)::OperationConverter::convertOperations(llvm::ArrayRef<mlir::Operation*>, llvm::function_ref<void (mlir::Diagnostic&)>) /data/llvm15/mlir/lib/Transforms/Utils/DialectConversion.cpp:2456:16
#23 0x0000000002a5a457 mlir::applyPartialConversion(llvm::ArrayRef<mlir::Operation*>, mlir::ConversionTarget&, mlir::FrozenRewritePatternSet const&, llvm::DenseSet<mlir::Operation*, llvm::DenseMapInfo<mlir::Operation*, void>>*) /data/llvm15/mlir/lib/Transforms/Utils/DialectConversion.cpp:3270:22
#24 0x0000000002a5a9d2 mlir::applyPartialConversion(mlir::Operation*, mlir::ConversionTarget&, mlir::FrozenRewritePatternSet const&, llvm::DenseSet<mlir::Operation*, llvm::DenseMapInfo<mlir::Operation*, void>>*) /data/llvm15/mlir/lib/Transforms/Utils/DialectConversion.cpp:3276:10
#25 0x00000000024458b6 (anonymous namespace)::LowerAffinePass::runOnOperation() /data/llvm15/mlir/lib/Conversion/AffineToStandard/AffineToStandard.cpp:555:16
#26 0x00000000029bbf7a mlir::detail::OpToOpPassAdaptor::run(mlir::Pass*, mlir::Operation*, mlir::AnalysisManager, bool, unsigned int) /data/llvm15/mlir/lib/Pass/Pass.cpp:471:21
#27 0x00000000029bc574 mlir::detail::OpToOpPassAdaptor::runPipeline(mlir::OpPassManager&, mlir::Operation*, mlir::AnalysisManager, bool, unsigned int, mlir::PassInstrumentor*, mlir::PassInstrumentation::PipelineParentInfo const*) /data/llvm15/mlir/lib/Pass/Pass.cpp:534:16
#28 0x00000000029bde3c mlir::PassManager::runPasses(mlir::Operation*, mlir::AnalysisManager) /data/llvm15/mlir/lib/Pass/Pass.cpp:837:10
#29 0x00000000029bdd5c mlir::PassManager::run(mlir::Operation*) /data/llvm15/mlir/lib/Pass/Pass.cpp:817:60
#30 0x00000000029b539c performActions(llvm::raw_ostream&, bool, bool, llvm::SourceMgr&, mlir::MLIRContext*, llvm::function_ref<mlir::LogicalResult (mlir::PassManager&)>) /data/llvm15/mlir/lib/Tools/mlir-opt/MlirOptMain.cpp:84:17
#31 0x00000000029b50c3 processBuffer(llvm::raw_ostream&, std::unique_ptr<llvm::MemoryBuffer, std::default_delete<llvm::MemoryBuffer>>, bool, bool, bool, bool, llvm::function_ref<mlir::LogicalResult (mlir::PassManager&)>, mlir::DialectRegistry&, llvm::ThreadPool*) /data/llvm15/mlir/lib/Tools/mlir-opt/MlirOptMain.cpp:124:12
#32 0x00000000029b4ecf mlir::MlirOptMain(llvm::raw_ostream&, std::unique_ptr<llvm::MemoryBuffer, std::default_delete<llvm::MemoryBuffer>>, llvm::function_ref<mlir::LogicalResult (mlir::PassManager&)>, mlir::DialectRegistry&, bool, bool, bool, bool, bool)::$_0::operator()(std::unique_ptr<llvm::MemoryBuffer, std::default_delete<llvm::MemoryBuffer>>, llvm::raw_ostream&) const /data/llvm15/mlir/lib/Tools/mlir-opt/MlirOptMain.cpp:164:12
#33 0x00000000029b4ded mlir::LogicalResult llvm::function_ref<mlir::LogicalResult (std::unique_ptr<llvm::MemoryBuffer, std::default_delete<llvm::MemoryBuffer>>, llvm::raw_ostream&)>::callback_fn<mlir::MlirOptMain(llvm::raw_ostream&, std::unique_ptr<llvm::MemoryBuffer, std::default_delete<llvm::MemoryBuffer>>, llvm::function_ref<mlir::LogicalResult (mlir::PassManager&)>, mlir::DialectRegistry&, bool, bool, bool, bool, bool)::$_0>(long, std::unique_ptr<llvm::MemoryBuffer, std::default_delete<llvm::MemoryBuffer>>, llvm::raw_ostream&) /data/llvm15/llvm/include/llvm/ADT/STLFunctionalExtras.h:45:12
#34 0x0000000002ae1789 llvm::function_ref<mlir::LogicalResult (std::unique_ptr<llvm::MemoryBuffer, std::default_delete<llvm::MemoryBuffer>>, llvm::raw_ostream&)>::operator()(std::unique_ptr<llvm::MemoryBuffer, std::default_delete<llvm::MemoryBuffer>>, llvm::raw_ostream&) const /data/llvm15/llvm/include/llvm/ADT/STLFunctionalExtras.h:68:12
#35 0x0000000002ae0d65 mlir::splitAndProcessBuffer(std::unique_ptr<llvm::MemoryBuffer, std::default_delete<llvm::MemoryBuffer>>, llvm::function_ref<mlir::LogicalResult (std::unique_ptr<llvm::MemoryBuffer, std::default_delete<llvm::MemoryBuffer>>, llvm::raw_ostream&)>, llvm::raw_ostream&, bool, bool) /data/llvm15/mlir/lib/Support/ToolUtilities.cpp:28:12
#36 0x00000000029b4159 mlir::MlirOptMain(llvm::raw_ostream&, std::unique_ptr<llvm::MemoryBuffer, std::default_delete<llvm::MemoryBuffer>>, llvm::function_ref<mlir::LogicalResult (mlir::PassManager&)>, mlir::DialectRegistry&, bool, bool, bool, bool, bool) /data/llvm15/mlir/lib/Tools/mlir-opt/MlirOptMain.cpp:169:10
#37 0x00000000029b429a mlir::MlirOptMain(llvm::raw_ostream&, std::unique_ptr<llvm::MemoryBuffer, std::default_delete<llvm::MemoryBuffer>>, mlir::PassPipelineCLParser const&, mlir::DialectRegistry&, bool, bool, bool, bool, bool) /data/llvm15/mlir/lib/Tools/mlir-opt/MlirOptMain.cpp:187:10
#38 0x00000000029b4c06 mlir::MlirOptMain(int, char**, llvm::StringRef, mlir::DialectRegistry&, bool) /data/llvm15/mlir/lib/Tools/mlir-opt/MlirOptMain.cpp:269:14
#39 0x0000000000411c2e main /data/llvm15/mlir/tools/mlir-opt/mlir-opt.cpp:237:7
#40 0x00007f7575cacc87 __libc_start_main /build/glibc-CVJwZb/glibc-2.27/csu/../csu/libc-start.c:344:0
#41 0x0000000000411aca _start (/data/llvm15/mlir/build/bin/mlir-opt+0x411aca)
Aborted (core dumped)

If I remove -lower-affine and run the following command instead, it completes without any problem:

mlir-opt temp.mlir \
 -pass-pipeline="func.func(tosa-to-linalg)" \
 -linalg-bufferize -convert-linalg-to-affine-loops \
 -affine-loop-coalescing \
 -affine-data-copy-generate=generate-dma=false \
 -o temp1.mlir

However, when I then continue lowering the affine dialect with -lower-affine, it crashes. I would expect any valid IR in the affine dialect to be lowerable, regardless of which optimizations or other transformations have been applied, so this may be a bug.

To verify this, I reduced temp1.mlir to a minimal test case, temp2.mlir:

#map4 = affine_map<() -> ()>
module {
  func.func @test_greater(%arg0: tensor<13x21x1xf32>, %arg1: tensor<13x21x3xf32>) {
    %1 = tensor.collapse_shape %arg0 [[0], [1, 2]] : tensor<13x21x1xf32> into tensor<13x21xf32>
    %2 = bufferization.to_memref %1 : memref<13x21xf32>
    %9 = memref.alloc() : memref<13x21xf32, 1>
    affine.for %arg2 = max #map4() to min #map4() {
      affine.for %arg3 = 0 to 21 {
        %13 = affine.load %2[%arg2, %arg3] : memref<13x21xf32>
        affine.store %13, %9[%arg2, %arg3] : memref<13x21xf32, 1>
      }
    }
    return
  }
}

When I run mlir-opt temp2.mlir -lower-affine, it crashes in the same way as above, so I think I have hit a bug. I have submitted it on GitHub: [MLIR] Crash when using -lower-affine.
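
A quick way to check whether an intermediate file contains the same pattern as temp2.mlir is to search it for affine maps with no results before running -lower-affine. The snippet below is only a sketch: the file name sample.mlir, the extra #map5 line, and the variable zero_result_maps are made up for illustration, and the regular expression matches only the exact spelling `affine_map<() -> ()>` seen in the reduced test case.

```shell
# Write a small sample file (hypothetical name) containing the map from temp2.mlir
# plus one ordinary map (#map5 is invented here for contrast).
tmpdir=$(mktemp -d)
cat > "$tmpdir/sample.mlir" <<'EOF'
#map4 = affine_map<() -> ()>
#map5 = affine_map<(d0) -> (d0)>
EOF

# Count the lines that contain an affine map with no dimensions and no results.
zero_result_maps=$(grep -cE 'affine_map<\(\) -> \(\)>' "$tmpdir/sample.mlir")
echo "zero-result maps: $zero_result_maps"

# Remove the temporary directory again.
rm -rf "$tmpdir"
```

Pointing the same grep at temp1.mlir would show whether the zero-result map is already present in the output of -affine-data-copy-generate.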

Below is another problem I encountered.
To try to make the lowering succeed, I changed the option of the -affine-data-copy-generate pass to -affine-data-copy-generate=generate-dma. With that, the tosa dialect is lowered to the scf dialect and output is produced, but some errors are still reported.

Steps to reproduce:

mlir-opt temp.mlir -pass-pipeline="func.func(tosa-to-linalg)" \
 -linalg-bufferize -convert-linalg-to-affine-loops \
 -affine-loop-coalescing \
 -affine-data-copy-generate="generate-dma" -o temp1.mlir \
| mlir-opt temp1.mlir -lower-affine

The error output is:

temp1.mlir:36:13: error: semi-affine expressions (modulo by non-const) are not supported
      %16 = affine.apply #map4(%arg2)[%7]
            ^
temp1.mlir:37:13: error: semi-affine expressions (division by non-const) are not supported
      %17 = affine.apply #map5(%arg2)[%7]
            ^
temp1.mlir:38:13: error: semi-affine expressions (modulo by non-const) are not supported
      %18 = affine.apply #map4(%17)[%5]
            ^
temp1.mlir:39:13: error: semi-affine expressions (division by non-const) are not supported
      %19 = affine.apply #map5(%17)[%5]
            ^
#map0 = affine_map<(d0)[s0] -> (d0 mod s0)>
#map1 = affine_map<(d0)[s0] -> (d0 floordiv s0)>
module {
  func.func @test_greater(%arg0: tensor<13x21x1xf32>, %arg1: tensor<13x21x3xf32>) -> tensor<13x21x3xi1> {
    %c819 = arith.constant 819 : index
    %c0 = arith.constant 0 : index
    %c819_0 = arith.constant 819 : index
    %c0_1 = arith.constant 0 : index
    %c273 = arith.constant 273 : index
    %c0_2 = arith.constant 0 : index
    %c0_3 = arith.constant 0 : index
    %0 = bufferization.to_memref %arg1 : memref<13x21x3xf32>
    %1 = tensor.collapse_shape %arg0 [[0], [1, 2]] : tensor<13x21x1xf32> into tensor<13x21xf32>
    ...
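
The four diagnostics above all point at affine.apply ops whose maps take a modulus or floor division by a symbol (s0) rather than by a constant. A rough way to list such maps in an IR file is sketched below, assuming they are printed in the `d0 mod s0` / `d0 floordiv s0` form shown in the output. The file name maps.mlir, the constant-modulus #map2 line, and the variable semi_affine_count are placeholders introduced for this example.

```shell
# Sample file (hypothetical name): the two maps from the output above, plus one
# invented map with a constant modulus that should NOT be reported.
tmpdir=$(mktemp -d)
cat > "$tmpdir/maps.mlir" <<'EOF'
#map0 = affine_map<(d0)[s0] -> (d0 mod s0)>
#map1 = affine_map<(d0)[s0] -> (d0 floordiv s0)>
#map2 = affine_map<(d0) -> (d0 mod 4)>
EOF

# Print, with line numbers, maps whose mod/floordiv/ceildiv right-hand side is a symbol.
grep -nE '(mod|floordiv|ceildiv) s[0-9]+' "$tmpdir/maps.mlir"

# Keep the number of matching lines for later checks.
semi_affine_count=$(grep -cE '(mod|floordiv|ceildiv) s[0-9]+' "$tmpdir/maps.mlir")
echo "semi-affine maps: $semi_affine_count"

rm -rf "$tmpdir"
```

This is a text-level heuristic only; it does not parse the IR and would miss maps that are printed inline in a different form.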

How do I use the -affine-data-copy-generate pass correctly? Should I choose its options based on the input IR? I'm also confused by the second problem: why does lowering still succeed despite the errors reported by the verifier?
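
To see whether mlir-opt itself treats a run as failed, it can help to look at the exit status rather than only at the printed IR. The helper below is a generic sketch: report_status is a name invented here (it is not part of MLIR), and temp1_lowered.mlir is a hypothetical output file. The mlir-opt call is guarded so the snippet still runs on a machine where mlir-opt or temp1.mlir is not available.

```shell
# Run a command, print its exit status to stderr, and return that status.
report_status() {
  "$@"
  status=$?
  echo "exit status of '$1': $status" >&2
  return "$status"
}

# Only attempt the lowering step when mlir-opt and the input file are present.
if command -v mlir-opt >/dev/null 2>&1 && [ -f temp1.mlir ]; then
  report_status mlir-opt temp1.mlir -lower-affine -o temp1_lowered.mlir \
    || echo "mlir-opt reported a failure" >&2
fi
```

A zero exit status together with printed diagnostics would suggest the errors did not abort the pass, which is the situation described in the second problem.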

I hope someone can help me with these problems.

Thanks,
Bealle