Help on debugging IntegerSet crash

Hi,

I have some code that implements a custom affine pass. It worked until I rebased it onto a more recent commit. I am still very much a beginner, so I’d be very grateful if someone could point me in the right direction to go from “it doesn’t work” to a proper bug report/fix.

The line where everything breaks is the creation of an IntegerSet:
IntegerSet set = IntegerSet::get(mapDims, mapSyms, constraints, eqFlags);

where mapDims = 1, mapSyms = 0, constraints is an ArrayRef of size 1, and eqFlags is an ArrayRef of size 1. This is the crash backtrace:

Stack dump:
0.      Program arguments: mlir-opt input.mlir "-pass-pipeline=func.func(test-custom-affine-pass)" -o=output.mlir
 #0 0x000055b1d8c88fbd llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (.localalias) /home/llvm-project/llvm/lib/Support/Unix/Signals.inc:573:3
 #1 0x000055b1d8c8717c llvm::sys::RunSignalHandlers() (.localalias) /home/llvm-project/llvm/lib/Support/Signals.cpp:103:20
 #2 0x000055b1d8c87306 SignalHandler(int) /home/llvm-project/llvm/lib/Support/Unix/Signals.inc:407:1
 #3 0x00007f74d9fd0200 __restore_rt (/lib/x86_64-linux-gnu/libpthread.so.0+0x12200)
 #4 0x000055b1d9e8c7b6 LookupBucketFor<mlir::TypeID> /home/llvm-project/llvm/include/llvm/ADT/DenseMap.h:631:22
 #5 0x000055b1d9e8c7b6 count /home/llvm-project/llvm/include/llvm/ADT/DenseMap.h:149:27
 #6 0x000055b1d9e8c7b6 mlir::detail::StorageUniquerImpl::getOrCreate(mlir::TypeID, unsigned int, llvm::function_ref<bool (mlir::StorageUniquer::BaseStorage const*)>, llvm::function_ref<mlir::StorageUniquer::BaseStorage* (mlir::StorageUniquer::StorageAllocator&)>) /home/llvm-project/mlir/lib/Support/StorageUniquer.cpp:281:5
 #7 0x000055b1d9e8c7b6 mlir::StorageUniquer::getParametricStorageTypeImpl(mlir::TypeID, unsigned int, llvm::function_ref<bool (mlir::StorageUniquer::BaseStorage const*)>, llvm::function_ref<mlir::StorageUniquer::BaseStorage* (mlir::StorageUniquer::StorageAllocator&)>) /home/llvm-project/mlir/lib/Support/StorageUniquer.cpp:348:27
 #8 0x000055b1d9f3a8bc mlir::IntegerSet::get(unsigned int, unsigned int, llvm::ArrayRef<mlir::AffineExpr>, llvm::ArrayRef<bool>) /home/llvm-project/mlir/lib/IR/MLIRContext.cpp:1023:1
 #9 0x000055b1d9e85a03 mlir::OpState::getOperation() /home/llvm-project/mlir/include/mlir/IR/OpDefinition.h:91:38
#10 0x000055b1d9e85a03 mlir::conditionLoopExecution(mlir::AffineForOp*, mlir::LoopSchedule&) /home/llvm-project/mlir/lib/Transforms/Scheduling/SchedulingUtils.cpp:76:42
#11 0x000055b1d9d4c9e7 (anonymous namespace)::TestLoopSchedulingPass::runOnOperation() /home/llvm-project/mlir/test/lib/Transforms/TestLoopSchedulingPass.cpp:74:29
#12 0x000055b1d9dfc70a mlir::detail::OpToOpPassAdaptor::run(mlir::Pass*, mlir::Operation*, mlir::AnalysisManager, bool, unsigned int) (.localalias) /home/llvm-project/mlir/lib/Pass/Pass.cpp:470:25
#13 0x000055b1d9dfcad2 mlir::detail::OpToOpPassAdaptor::runPipeline(mlir::OpPassManager&, mlir::Operation*, mlir::AnalysisManager, bool, unsigned int, mlir::PassInstrumentor*, mlir::PassInstrumentation::PipelineParentInfo const*) (.localalias) /home/llvm-project/mlir/lib/Pass/Pass.cpp:534:5
#14 0x000055b1d9dfbb4f std::vector<std::atomic<bool>, std::allocator<std::atomic<bool> > >::operator[](unsigned long) /usr/include/c++/10/bits/stl_vector.h:1046:25
#15 0x000055b1d9dfbb4f operator() /home/llvm-project/mlir/lib/Pass/Pass.cpp:759:22
#16 0x000055b1d9dfbb4f failableParallelForEach<__gnu_cxx::__normal_iterator<mlir::detail::OpToOpPassAdaptor::runOnOperationAsyncImpl(bool)::OpPMInfo*, std::vector<mlir::detail::OpToOpPassAdaptor::runOnOperationAsyncImpl(bool)::OpPMInfo> >, mlir::detail::OpToOpPassAdaptor::runOnOperationAsyncImpl(bool)::<lambda(mlir::detail::OpToOpPassAdaptor::runOnOperationAsyncImpl(bool)::OpPMInfo&)>&> /home/llvm-project/mlir/include/mlir/IR/Threading.h:46:17
#17 0x000055b1d9dfbb4f failableParallelForEach<std::vector<mlir::detail::OpToOpPassAdaptor::runOnOperationAsyncImpl(bool)::OpPMInfo>&, mlir::detail::OpToOpPassAdaptor::runOnOperationAsyncImpl(bool)::<lambda(mlir::detail::OpToOpPassAdaptor::runOnOperationAsyncImpl(bool)::OpPMInfo&)>&> /home/llvm-project/mlir/include/mlir/IR/Threading.h:92:33
#18 0x000055b1d9dfbb4f mlir::detail::OpToOpPassAdaptor::runOnOperationAsyncImpl(bool) (.localalias) /home/llvm-project/mlir/lib/Pass/Pass.cpp:764:13
#19 0x000055b1d9dfc555 llvm::optional_detail::OptionalStorage<mlir::detail::PassExecutionState, false>::value() & /home/llvm-project/llvm/include/llvm/ADT/Optional.h:98:5
#20 0x000055b1d9dfc555 llvm::Optional<mlir::detail::PassExecutionState>::getPointer() /home/llvm-project/llvm/include/llvm/ADT/Optional.h:304:42
#21 0x000055b1d9dfc555 llvm::Optional<mlir::detail::PassExecutionState>::operator->() /home/llvm-project/llvm/include/llvm/ADT/Optional.h:314:38
#22 0x000055b1d9dfc555 mlir::detail::OpToOpPassAdaptor::run(mlir::Pass*, mlir::Operation*, mlir::AnalysisManager, bool, unsigned int) (.localalias) /home/llvm-project/mlir/lib/Pass/Pass.cpp:471:36
#23 0x000055b1d9dfcad2 mlir::detail::OpToOpPassAdaptor::runPipeline(mlir::OpPassManager&, mlir::Operation*, mlir::AnalysisManager, bool, unsigned int, mlir::PassInstrumentor*, mlir::PassInstrumentation::PipelineParentInfo const*) (.localalias) /home/llvm-project/mlir/lib/Pass/Pass.cpp:534:5
#24 0x000055b1d9dfd890 mlir::PassManager::runPasses(mlir::Operation*, mlir::AnalysisManager) /home/llvm-project/mlir/lib/Pass/Pass.cpp:838:71
#25 0x000055b1d9dfd890 mlir::PassManager::run(mlir::Operation*) /home/llvm-project/mlir/lib/Pass/Pass.cpp:817:76
#26 0x000055b1d9db5f75 performActions(llvm::raw_ostream&, bool, bool, llvm::SourceMgr&, mlir::MLIRContext*, llvm::function_ref<mlir::LogicalResult (mlir::PassManager&)>) (.constprop.0) /home/llvm-project/mlir/lib/Tools/mlir-opt/MlirOptMain.cpp:84:3
#27 0x000055b1d9db6362 processBuffer(llvm::raw_ostream&, std::unique_ptr<llvm::MemoryBuffer, std::default_delete<llvm::MemoryBuffer> >, bool, bool, bool, bool, llvm::function_ref<mlir::LogicalResult (mlir::PassManager&)>, mlir::DialectRegistry&, llvm::ThreadPool*) /home/llvm-project/mlir/lib/Tools/mlir-opt/MlirOptMain.cpp:123:68
#28 0x000055b1d9db652b std::unique_ptr<llvm::MemoryBuffer, std::default_delete<llvm::MemoryBuffer> >::~unique_ptr() /usr/include/c++/10/bits/unique_ptr.h:360:12
#29 0x000055b1d9db652b operator() /home/llvm-project/mlir/lib/Tools/mlir-opt/MlirOptMain.cpp:167:36
#30 0x000055b1d9db652b mlir::LogicalResult llvm::function_ref<mlir::LogicalResult (std::unique_ptr<llvm::MemoryBuffer, std::default_delete<llvm::MemoryBuffer> >, llvm::raw_ostream&)>::callback_fn<mlir::MlirOptMain(llvm::raw_ostream&, std::unique_ptr<llvm::MemoryBuffer, std::default_delete<llvm::MemoryBuffer> >, llvm::function_ref<mlir::LogicalResult (mlir::PassManager&)>, mlir::DialectRegistry&, bool, bool, bool, bool, bool)::'lambda'(std::unique_ptr<llvm::MemoryBuffer, std::default_delete<llvm::MemoryBuffer> >, llvm::raw_ostream&)>(long, std::unique_ptr<llvm::MemoryBuffer, std::default_delete<llvm::MemoryBuffer> >, llvm::raw_ostream&) /home/llvm-project/llvm/include/llvm/ADT/STLFunctionalExtras.h:45:52
#31 0x000055b1d9e94e92 std::unique_ptr<llvm::MemoryBuffer, std::default_delete<llvm::MemoryBuffer> >::~unique_ptr() /usr/include/c++/10/bits/unique_ptr.h:360:12
#32 0x000055b1d9e94e92 llvm::function_ref<mlir::LogicalResult (std::unique_ptr<llvm::MemoryBuffer, std::default_delete<llvm::MemoryBuffer> >, llvm::raw_ostream&)>::operator()(std::unique_ptr<llvm::MemoryBuffer, std::default_delete<llvm::MemoryBuffer> >, llvm::raw_ostream&) const /home/llvm-project/llvm/include/llvm/ADT/STLFunctionalExtras.h:68:20
#33 0x000055b1d9e94e92 mlir::splitAndProcessBuffer(std::unique_ptr<llvm::MemoryBuffer, std::default_delete<llvm::MemoryBuffer> >, llvm::function_ref<mlir::LogicalResult (std::unique_ptr<llvm::MemoryBuffer, std::default_delete<llvm::MemoryBuffer> >, llvm::raw_ostream&)>, llvm::raw_ostream&, bool, bool) /home/llvm-project/mlir/lib/Support/ToolUtilities.cpp:28:60
#34 0x000055b1d9db5ba9 std::unique_ptr<llvm::MemoryBuffer, std::default_delete<llvm::MemoryBuffer> >::~unique_ptr() /usr/include/c++/10/bits/unique_ptr.h:360:12
#35 0x000055b1d9db5ba9 mlir::MlirOptMain(llvm::raw_ostream&, std::unique_ptr<llvm::MemoryBuffer, std::default_delete<llvm::MemoryBuffer> >, llvm::function_ref<mlir::LogicalResult (mlir::PassManager&)>, mlir::DialectRegistry&, bool, bool, bool, bool, bool) (.localalias) /home/llvm-project/mlir/lib/Tools/mlir-opt/MlirOptMain.cpp:170:77
#36 0x000055b1d9db687c std::unique_ptr<llvm::MemoryBuffer, std::default_delete<llvm::MemoryBuffer> >::~unique_ptr() /usr/include/c++/10/bits/unique_ptr.h:360:12
#37 0x000055b1d9db687c mlir::MlirOptMain(llvm::raw_ostream&, std::unique_ptr<llvm::MemoryBuffer, std::default_delete<llvm::MemoryBuffer> >, mlir::PassPipelineCLParser const&, mlir::DialectRegistry&, bool, bool, bool, bool, bool) /home/llvm-project/mlir/lib/Tools/mlir-opt/MlirOptMain.cpp:189:73
#38 0x000055b1d9db687c mlir::MlirOptMain(int, char**, llvm::StringRef, mlir::DialectRegistry&, bool) /home/llvm-project/mlir/lib/Tools/mlir-opt/MlirOptMain.cpp:269:13
#39 0x000055b1d8bf2646 std::vector<std::unique_ptr<mlir::DialectExtensionBase, std::default_delete<mlir::DialectExtensionBase> >, std::allocator<std::unique_ptr<mlir::DialectExtensionBase, std::default_delete<mlir::DialectExtensionBase> > > >::~vector() /usr/include/c++/10/bits/stl_vector.h:680:15
#40 0x000055b1d8bf2646 mlir::DialectRegistry::~DialectRegistry() /home/llvm-project/mlir/include/mlir/IR/DialectRegistry.h:108:7
#41 0x000055b1d8bf2646 main /home/llvm-project/mlir/tools/mlir-opt/mlir-opt.cpp:234:19
#42 0x00007f74d9a2a81d __libc_start_main ./csu/../csu/libc-start.c:332:16
#43 0x000055b1d8c6fcca _start (/home/llvm-project/build/bin/mlir-opt+0x3accca)
compile.sh: line 3: 488696 Segmentation fault      mlir-opt "input.mlir" -pass-pipeline="func.func(test-custom-affine-pass)" -o="output.mlir"

I’ll be working on a minimal example to reproduce the issue, but in the meantime maybe someone knows something about recent changes to IntegerSets?

Seems like a TypeID mismatch. Have you tried building LLVM statically (in case you aren’t already)?
See: https://github.com/llvm/llvm-project/issues/52621

(I also had almost the same issue, but on a different platform)

Your issue is very interesting, because I did add some StringAttrs to my code recently! I just had no idea they could be involved here, because in other test cases the code works fine, and there is no clear indication in the stack trace.

For now I have verified that building in debug mode makes the crash disappear. Now I have to figure out what needs to change. Thanks!

StorageUniquers are involved every time you build an operation, a type or an attribute, so it’s not really a StringAttr or IntegerSet problem. It looks more like the different shared libraries disagree on the TypeIDs that are used to identify the same object.

You can see the problem in these lines of the stack trace:

 #4 0x000055b1d9e8c7b6 LookupBucketFor<mlir::TypeID> /home/llvm-project/llvm/include/llvm/ADT/DenseMap.h:631:22
 #5 0x000055b1d9e8c7b6 count /home/llvm-project/llvm/include/llvm/ADT/DenseMap.h:149:27
 #6 0x000055b1d9e8c7b6 mlir::detail::StorageUniquerImpl::getOrCreate(mlir::TypeID, unsigned int, llvm::function_ref<bool (mlir::StorageUniquer::BaseStorage const*)>, llvm::function_ref<mlir::StorageUniquer::BaseStorage* (mlir::StorageUniquer::StorageAllocator&)>) /home/llvm-project/mlir/lib/Support/StorageUniquer.cpp:281:5
 #7 0x000055b1d9e8c7b6 mlir::StorageUniquer::getParametricStorageTypeImpl(mlir::TypeID, unsigned int, llvm::function_ref<bool (mlir::StorageUniquer::BaseStorage const*)>, llvm::function_ref<mlir::StorageUniquer::BaseStorage* (mlir::StorageUniquer::StorageAllocator&)>) /home/llvm-project/mlir/lib/Support/StorageUniquer.cpp:348:27
 #8 0x000055b1d9f3a8bc mlir::IntegerSet::get(unsigned int, unsigned int, llvm::ArrayRef<mlir::AffineExpr>, llvm::ArrayRef<bool>) /home/llvm-project/mlir/lib/IR/MLIRContext.cpp:1023:1

The StorageUniquer is requesting a TypeID that is not being found in the DenseMap.
Try to go for a static build of your project and see if it fixes the problem.
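
To illustrate why a TypeID mismatch would matter, here is a simplified sketch (not MLIR's actual implementation) of identifying a type by the address of a per-type static and using that address as a map key. The names `TypeKey`, `typeKeyOf` and `Registry` are hypothetical and exist only for this example.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical stand-in for a type identifier: an opaque address.
using TypeKey = const void *;

// Each instantiation owns its own static, so each T gets a distinct address.
template <typename T>
TypeKey typeKeyOf() {
  static char anchor;
  return &anchor;
}

// Hypothetical registry keyed by type identity, loosely analogous to a
// uniquer that keeps per-type storage in a map.
class Registry {
public:
  template <typename T>
  void add(std::string name) {
    names_[typeKeyOf<T>()] = std::move(name);
  }

  template <typename T>
  bool contains() const {
    return names_.count(typeKeyOf<T>()) != 0;
  }

  template <typename T>
  const std::string &nameOf() const {
    auto it = names_.find(typeKeyOf<T>());
    assert(it != names_.end() && "type was never registered under this key");
    return it->second;
  }

private:
  std::unordered_map<TypeKey, std::string> names_;
};
```

In this sketch, if two copies of `anchor` existed for the same `T` (for example one per shared library), registration and lookup would use different keys and the lookup would miss. That is the kind of failure being suspected here.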

It was already a static build :confused: I’ll dig around more, maybe I am missing some MLIR_DEFINE_EXPLICIT_TYPE_ID somewhere.

A few things:

  • I thought we’d fixed the shared library/TypeID thing. If this isn’t the case, a repro would help (@River707 for visibility)
  • It shouldn’t be possible to mess up the type IDs such that a static build misbehaves like this.
  • Can you enable/repro with asserts? I imagine there should be an assert triggering.

Apologies, I am really not an expert. I tried to come up with a minimal example here:
TestTypeIDissue.cpp (1.3 KB)

My build configuration is:
cmake -G Ninja ../llvm -DCMAKE_BUILD_TYPE=RelWithDebInfo -DLLVM_ENABLE_PROJECTS=mlir -DLLVM_BUILD_EXAMPLES=ON -DLLVM_TARGETS_TO_BUILD="X86;NVPTX;AMDGPU" -DLLVM_ENABLE_ASSERTIONS=ON -DBUILD_SHARED_LIBS=OFF

  MLIRContext *context;

  AffineExpr setExpression = getAffineBinaryOpExpr(
    AffineExprKind::Add, getAffineDimExpr(0, context),
      getAffineConstantExpr(3, context));

Not sure if this is what you are doing in your code, but the context there isn’t valid. It’s currently an uninitialized pointer. If you want a valid context you’ll want to change that to something like:

MLIRContext *context = &getContext();

Building with asan (address sanitizer) would likely catch this and diagnose it for you.

Yeah, that is not what I am doing. In my code I take the context from the operation I am modifying. I just did not know how to create a fake one for the example.

Is your code open? Can you push this to a GitHub branch somewhere?

I pushed the code to the experimental/loop_pipelining branch of SODALite / soda-opt on GitLab. It is a work in progress, but there should be everything to reproduce the issue.

The test that breaks is test/Transforms/schedule-loop-lb.mlir (experimental/loop_pipelining branch of SODALite / soda-opt on GitLab), and the details about how I am building llvm are in the build_tools folder. I am a bit behind with respect to the main repo, on this commit. (How close are we to a “stable” MLIR version?)

Could you please dump out the affine expressions in constraints right before IntegerSet::get?

setExpression.dump() gives me -s0 + 7 (which is the correct one, this is what I am looking for). But trying to print constraints itself causes a crash. Am I initializing it wrong?

setExpression = getAffineBinaryOpExpr(
        AffineExprKind::Add,
        getAffineConstantExpr(difference, forOp->getContext()), invert);
ArrayRef<AffineExpr> constraints{setExpression};

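ArrayRef has no backing storage: it is a non-owning view, and here it ends up pointing at a temporary that is gone by the time IntegerSet::get reads it. Please use SmallVector<AffineExpr, 1>.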
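
A minimal sketch of the owning-container pattern, written without the LLVM headers so it compiles on its own. `View` is a hypothetical stand-in for `llvm::ArrayRef`, `std::vector` plays the role of `SmallVector`, and `sum` and `sumOfOwned` are made-up helpers; none of this is the code from the pass.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical minimal non-owning view, standing in for llvm::ArrayRef<T>.
// It stores only a pointer and a length, never the elements themselves.
template <typename T>
class View {
public:
  View(const T *data, std::size_t size) : data_(data), size_(size) {}

  // Borrow from a container: the container must outlive the view.
  View(const std::vector<T> &v) : data_(v.data()), size_(v.size()) {}

  std::size_t size() const { return size_; }

  const T &operator[](std::size_t i) const {
    assert(i < size_ && "index out of range");
    return data_[i];
  }

private:
  const T *data_;
  std::size_t size_;
};

// Takes a view by value, the way IntegerSet::get takes ArrayRef parameters.
inline long sum(View<long> values) {
  long total = 0;
  for (std::size_t i = 0; i < values.size(); ++i)
    total += values[i];
  return total;
}

// Safe pattern: an owning container holds the elements and the view borrows.
inline long sumOfOwned() {
  std::vector<long> owned{7, -1}; // plays the role of SmallVector<AffineExpr, 1>
  return sum(owned);              // 'owned' stays alive for the whole call
}
```

The point of the sketch is the lifetime rule: keep the elements in a named owning container and let it convert to a view only at the call site.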

Thank you! I fixed it and everything works as it should now. I’ll double check because I think there are other places in the code where ArrayRef was used. (Sorry for the delay.)