What is the right way to start debugging analysis of large IR?

Hello,

As the title says, what is a good way to debug errors that only show up when analyzing a large IR file?

I’m asking because I have finished developing my analysis, a pass that converts stack allocations to heap allocations. While testing it on a range of micro and macro test applications, I got it more or less working on everything from small programs (a simple epoll example) up to medium-sized applications such as ProFTPD (built with wllvm).

However, on a larger application such as Apache HTTPD, I get the following error after the analysis completes.

Finished analysis
opt: /home/llvm-project-13/llvm/lib/Bitcode/Writer/ValueEnumerator.cpp:524: unsigned int llvm::ValueEnumerator::getValueID(const llvm::Value*) const: Assertion `I != ValueMap.end() && "Value not in slotcalculator!"' failed.
PLEASE submit a bug report to https://bugs.llvm.org/ and include the crash backtrace.
Stack dump:
0.	Program arguments: /home/llvm-project-13/llvm-build/bin/opt -load /home/build/lib/libstacktoheap.so -load-pass-plugin /home/build/lib/libstacktoheap.so -passes=stacktoheap /home/inputs/httpd.bc -o /home/results/httpd/httpd.bc
 #0 0x00005591abaa381f PrintStackTraceSignalHandler(void*) Signals.cpp:0:0
 #1 0x00005591abaa0fbe SignalHandler(int) Signals.cpp:0:0
 #2 0x00007f3009104730 __restore_rt (/lib/x86_64-linux-gnu/libpthread.so.0+0x12730)
 #3 0x00007f3008beb8eb raise /build/glibc-6iIyft/glibc-2.28/signal/../sysdeps/unix/sysv/linux/raise.c:51:1
 #4 0x00007f3008bd6535 abort /build/glibc-6iIyft/glibc-2.28/stdlib/abort.c:81:7
 #5 0x00007f3008bd640f _nl_load_domain /build/glibc-6iIyft/glibc-2.28/intl/loadmsgcat.c:1177:9
 #6 0x00007f3008be41a2 (/lib/x86_64-linux-gnu/libc.so.6+0x301a2)
 #7 0x00005591aab9e131 llvm::ValueEnumerator::getValueID(llvm::Value const*) const (/home/llvm-project-13/llvm-build/bin/opt+0x121f131)
 #8 0x00005591aab87038 (anonymous namespace)::ModuleBitcodeWriter::writeInstruction(llvm::Instruction const&, unsigned int, llvm::SmallVectorImpl<unsigned int>&) BitcodeWriter.cpp:0:0
 #9 0x00005591aab93819 (anonymous namespace)::ModuleBitcodeWriter::write() BitcodeWriter.cpp:0:0
#10 0x00005591aab959fa llvm::BitcodeWriter::writeModule(llvm::Module const&, bool, llvm::ModuleSummaryIndex const*, bool, std::array<unsigned int, 5ul>*) (/home/llvm-project-13/llvm-build/bin/opt+0x12169fa)
#11 0x00005591aab95bbd llvm::WriteBitcodeToFile(llvm::Module const&, llvm::raw_ostream&, bool, llvm::ModuleSummaryIndex const*, bool, std::array<unsigned int, 5ul>*) (/home/llvm-project-13/llvm-build/bin/opt+0x1216bbd)
#12 0x00005591aab799f8 llvm::BitcodeWriterPass::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) (/home/llvm-project-13/llvm-build/bin/opt+0x11fa9f8)
#13 0x00005591aa00d0ed llvm::detail::PassModel<llvm::Module, llvm::BitcodeWriterPass, llvm::PreservedAnalyses, llvm::AnalysisManager<llvm::Module> >::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) (/home/llvm-project-13/llvm-build/bin/opt+0x68e0ed)
#14 0x00005591ab21c947 llvm::PassManager<llvm::Module, llvm::AnalysisManager<llvm::Module> >::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) (/home/llvm-project-13/llvm-build/bin/opt+0x189d947)
#15 0x00005591aa0176cc llvm::runPassPipeline(llvm::StringRef, llvm::Module&, llvm::TargetMachine*, llvm::TargetLibraryInfoImpl*, llvm::ToolOutputFile*, llvm::ToolOutputFile*, llvm::ToolOutputFile*, llvm::StringRef, llvm::ArrayRef<llvm::StringRef>, llvm::opt_tool::OutputKind, llvm::opt_tool::VerifierKind, bool, bool, bool, bool, bool) (/home/llvm-project-13/llvm-build/bin/opt+0x6986cc)
#16 0x00005591a9f9cc1f main (/home/llvm-project-13/llvm-build/bin/opt+0x61dc1f)
#17 0x00007f3008bd809b __libc_start_main /build/glibc-6iIyft/glibc-2.28/csu/../csu/libc-start.c:342:3
#18 0x00005591aa00b0fa _start (/home/llvm-project-13/llvm-build/bin/opt+0x68c0fa)
tag.sh: line 28: 51567 Aborted                 $LLVM_DIR/bin/opt -load $LIB_DIR/libstacktoheap.so -load-pass-plugin $LIB_DIR/libstacktoheap.so -passes=stacktoheap $INPUT_FILE_LOC -o $RESULTS_FILE_LOC
clang-13: error: no such file or directory: '/home/results/httpd/httpd.bc'
clang-13: error: no input files
clang-13: error: no such file or directory: '/home/results/httpd/httpd.bc'
/home/llvm-project-13/llvm-build/bin/llvm-dis: error: No such file or directory

For what it is worth, I suspect this has something to do with rewriting a single bitcode file that contains the whole program. I don’t think wllvm itself is at fault; rather, performing the rewrite on one combined bitcode file may break dependencies that were set up when wllvm linked everything together.

There are a few hints in the error log. For example, the failure occurs in ModuleBitcodeWriter::writeInstruction, which means the analysis has finished instrumenting instructions but the new IR cannot be written out.

My best guess is that, although the analysis completes successfully, it must have done something wrong while inserting or removing instructions. Still, because the error is raised after the analysis has finished, I’m a bit lost on how to start debugging it.

I would appreciate any tips on how to get started with in-depth debugging of an LLVM analysis in cases like this.

llvm-reduce is generally the tool to use to cut large IR test cases down to small ones.
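For reference, llvm-reduce is driven by an "interestingness test": a script that receives a candidate IR file as its argument and exits with 0 only when the candidate still shows the problem. Below is a minimal sketch in Python, assuming the opt binary, plugin path, and pass name from the crash log above. The script name interesting.py, the /dev/null output, and the choice to match on the assertion text are illustrative assumptions, not something llvm-reduce prescribes.

```python
#!/usr/bin/env python3
"""Sketch of an llvm-reduce interestingness test (paths copied from the crash log)."""
import subprocess
import sys

# Assumed locations, taken from the "Program arguments" line of the stack dump.
OPT = "/home/llvm-project-13/llvm-build/bin/opt"
PLUGIN = "/home/build/lib/libstacktoheap.so"
# The assertion text we want the reduced test case to keep reproducing.
MARKER = "Value not in slotcalculator!"


def build_command(ir_path):
    """Mirror the failing opt invocation; the bitcode output is discarded (assumption)."""
    return [OPT, "-load", PLUGIN, "-load-pass-plugin", PLUGIN,
            "-passes=stacktoheap", ir_path, "-o", "/dev/null"]


def is_interesting(stderr_text):
    """A candidate is interesting only if it triggers the same assertion."""
    return MARKER in stderr_text


def main(argv):
    try:
        proc = subprocess.run(build_command(argv[1]), capture_output=True,
                              text=True, errors="replace")
    except OSError:
        return 1  # opt could not be started: treat the candidate as uninteresting
    return 0 if is_interesting(proc.stderr) else 1


# Only act when a candidate file was passed on the command line.
if __name__ == "__main__" and len(sys.argv) == 2:
    sys.exit(main(sys.argv))
```

After making the script executable, it could be used roughly as `llvm-reduce --test=./interesting.py httpd.ll`, where httpd.ll is a hypothetical textual copy of the input (for example, produced with llvm-dis). Matching on the assertion text rather than on any non-zero exit code keeps the reduction from drifting toward an unrelated crash. On a module of this size the reduction may take a long time.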

Thank you for your reply. I tried llvm-reduce (and watched the 2019 LLVM Developers’ Meeting talk by D. Ferrer, “LLVM-Reduce for testcase reduction”, on YouTube), but it does not seem to be the best tool for me at the moment.

After some additional searching, I found suggestions to use the --verify-each option during the optimization phase, which gave me this output:

Finished analysis
LLVM ERROR: Broken module found, compilation aborted!
PLEASE submit a bug report to https://bugs.llvm.org/ and include the crash backtrace.
Stack dump:

So now I get output saying that there is indeed a broken module, which is why compilation is aborted.

Therefore, my question is: is there any way to figure out which module is the broken one? I have not found a way to make this error log more verbose than it is right now. I also wonder whether I need to use the LLVM Verifier library directly to make this work, but I wanted to ask for some insight first, if possible.

Thank you in advance again for any suggestions.
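One possible way to narrow down where the broken instruction lives, sketched here under some assumptions: if the transformed module can still be written as textual IR (for example by adding -S to the opt command line), operands that refer to a value the printer cannot number are typically printed as `<badref>`. Text output is not guaranteed to succeed where bitcode writing fails, and the file name in the usage note is hypothetical. The helper below scans such a textual dump and reports the enclosing function for every `<badref>` it finds.

```python
import re
import sys

# Start of a function definition in textual LLVM IR, e.g. "define i32 @main(...) {".
DEFINE_RE = re.compile(r'^define\b.*?@("[^"]+"|[-\w.$]+)\s*\(')
BADREF = "<badref>"


def find_badrefs(ir_text):
    """Return (function name, line number, stripped line) for each line with <badref>."""
    hits = []
    current = None  # name of the function whose body we are inside, if any
    for lineno, line in enumerate(ir_text.splitlines(), start=1):
        match = DEFINE_RE.match(line)
        if match:
            current = match.group(1).strip('"')
        elif line.startswith("}"):
            current = None  # a closing brace in column 0 ends the function body
        if BADREF in line:
            hits.append((current, lineno, line.strip()))
    return hits


# Command-line use: python find_badrefs.py <textual IR file>
if __name__ == "__main__" and len(sys.argv) == 2:
    with open(sys.argv[1], errors="replace") as f:
        for func, lineno, line in find_badrefs(f.read()):
            print(f"{func}: line {lineno}: {line}")
```

Run against a hypothetical httpd.ll dump, each reported function is a candidate for a place where the pass erased or moved an instruction that still has users. That function can then be inspected, or extracted on its own, instead of the whole module.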