MIR YAML deserialisation failure


I am trying to isolate an assertion failure in the if-converter (on PPC) and I generated a textual debug log with:

LLVM_ARGS=-print-before-all -print-module-scope -filter-print-funcs=japi1__require_7687

After splicing out the MIR immediately before the if-converter pass,
I would like to run llc -march=ppc64le -run-pass=if-converter input.mir so that I can start minimising the MIR.

This step fails for me with:

error: YAML:188:20: Unrecognized character while tokenizing.
Function Live Ins: %x4

error: YAML:188:1: Map value must not be empty
Function Live Ins: %x4

Should I expect this to work, or is some part of my workflow wrong?

I put the full log and just the extracted MIR file online:
https://drive.google.com/open?id=1Br0s9Qvr4tzPv8nqbnV_nWezpEH5Ci7B and would appreciate any guidance on whether I should file this as a deserialiser bug.


Hello Valentin,

To generate a MIR test case I think the process is to first create an IR file by passing '-S -emit-llvm' to clang, then you can feed that file into llc and use -stop-before to get the MIR just before the if-converter pass, e.g.: llc -stop-before=if-converter -simplify-mir -o test.mir test.ll.

Also, there is a MIR language reference: https://llvm.org/docs/MIRLangRef.html, which documents some of the limitations, as well as tips for further simplifying the generated MIR if need be.
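End to end, the suggested workflow might look like the sketch below. This is only an illustration: the file names, the -O level, and the target flags are placeholders and would need to match however the original module was compiled.

```shell
# Sketch of the suggested workflow; file names and flags are placeholders.
clang -O2 -S -emit-llvm -o test.ll test.c   # 1. get textual IR from clang
llc -stop-before=if-converter -simplify-mir \
    -o test.mir test.ll                     # 2. dump MIR just before the pass
llc -run-pass=if-converter -o - test.mir    # 3. re-run only that pass
```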




In terms of limitations, as Sean pointed out, an important one is that .mir files don't carry MachineFunctionInfo, which may result in failures on accesses to global variables because of the use of register X2: the verifier considers it an undefined register.

Also, it’s probably easier to reduce test cases using bugpoint starting from an IR test case. With the code you provided, I get a different crash than what you apparently get. Here are the test case and the backtrace.

Test case: https://pastebin.com/fxjRtJD0

Backtrace: https://pastebin.com/FsXMvbGK


Thank you both!

I was running into the issue that bugpoint was reducing my test case into other failures, and I hadn't yet managed to find the right point in the Julia pass pipeline at which to insert the module so as to reproduce the issue reliably from llc; that's why I started looking at using the MIR.

I will go back to looking at the pass pipeline and the IR and get a reproducer that way!


I’m not sure if this helps, but here it is in case it does.

I typically use bugpoint in a way that keeps the actual failure I'm after. For example, with the test case you've pasted, I was looking for a specific assert, so I used bugpoint this way:

$ cat reduceme.sh
#!/bin/bash
# Exit non-zero ("interesting" to bugpoint) only when the specific
# assertion message appears in the compiler output.
llc -filetype=obj "$1" 2>&1 | grep 'Cannot lower calls with arbitrary operand bundles'
if [[ $? -eq 0 ]]; then
  exit 1
fi
exit 0

$ bugpoint -compile-custom -compile-command=$PWD/reduceme.sh input.ll

That will ensure that bugpoint retains the specific desired failure behaviour as it is reducing the test case.
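The pattern generalises: between bugs, only the grep pattern changes. A hypothetical helper (is_interesting is my name for it, not a bugpoint concept) that mirrors the script's exit-code convention, reading the compiler output on stdin:

```shell
# Hypothetical helper mirroring reduceme.sh's convention: exit status 1
# ("interesting" for bugpoint -compile-custom) when the wanted message
# appears in the compiler output read from stdin, 0 otherwise.
is_interesting() {
  if grep -q "$1"; then
    return 1   # the assertion we are hunting for is present
  fi
  return 0     # some other failure, or no failure at all
}

# Example wiring: llc -filetype=obj "$1" 2>&1 | is_interesting 'Cannot lower calls'
```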

Hope that helps,


P.S. It's not that I'm specifically interested in that assert - just that that's how the original test case was failing, so that's what I wanted to preserve as an illustration.

In our fork of LLVM we often need to reduce a crash test case for a
specific assertion. After writing lots of "only give me this specific
assertion" scripts like the one above, I decided to write a script that
automates all of this for me. It's called creduce_crash_test.py, but it
will actually use bugpoint -compile-custom if it detects that the crash
is not a clang frontend crash.
If you point the script at a clang crash reproducer .sh file, it will
infer the assertion message/llvm_unreachable/infinite loop (if the run
takes longer than x seconds it is treated as an infinite loop) and then
try to produce a reduced test case for that issue. If you point it at
a non-.sh file, it will parse that file for RUN: lines and try to reduce
the crashes it finds there. It currently handles RUN: lines containing
llc or %clang/%clang_cc1 and our custom lit substitutions.
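The "infer the assertion message" step could be sketched roughly like this (purely hypothetical: the real creduce_crash_test.py logic is more involved, and the log line below is made up):

```shell
# Hypothetical sketch: pull the assertion text out of a crash log so a
# reduction script can later grep for exactly that message.
crash_log='llc: foo.cpp:42: void run(): Assertion `isValid()` failed.'
# Capture whatever sits between "Assertion " and " failed".
msg=$(printf '%s\n' "$crash_log" | sed -n 's/.*Assertion \(.*\) failed.*/\1/p')
echo "$msg"    # prints: `isValid()`
```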

If people are interested in this I'm very happy to try and submit this upstream.

I for one think this sounds like a very useful tool for LLVM developers. I’ll try it out with an actual investigation.

Seems to me like this could be a valuable addition to the utils/ directory. From my experience with tools such as this, it is the existence of clear documentation that makes or breaks the tool. If it is possible, I think it would be good to start with a document describing the tool’s capabilities and instructions for use along with the code itself. This would allow people with very limited Python knowledge (such as myself) to evaluate the tool as well.

Let's see if others in the community are interested in this tool joining the LLVM arsenal, but I would certainly like to thank you for sharing it and offering to contribute it.