Helper script for analyzing MIR dumps

Some time ago I started writing downstream scripts to help with examining MIR dumps from llc. Typically, when analyzing a regression, you do two runs with ‘llc -print-after-all’, one before the regression and one after. You end up with two text files of around 100k lines each, and opening these in e.g. vimdiff or any other diff tool tends to be rather painful.

I am not aware of any tools that help alleviate this, although the topic was recently discussed in the following thread

So instead I thought one could write a simple Python library that creates an in-memory representation of each dump file, and then put a fragment abstraction on top of that, representing passes, functions and basic blocks. If one then defines operations like comparison, view and diff on these fragments, one has the basis for creating a useful tool that fits one's workflow.
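To make the idea concrete, here is a minimal sketch of such a fragment layer. It is not the code from the patch: the `Fragment` class, `parse_dump`, `diff` and the three regular expressions are hypothetical names of my own, and the banner shapes they match ("*** IR Dump After ... ***", "# Machine code for function ...:", "bb.N...:") are assumptions about what llc prints, so adjust them to your actual output.

```python
import difflib
import re
from dataclasses import dataclass, field

# Assumed line shapes in the dump; these are guesses, not a specification.
PASS_RE = re.compile(r"^[#;]?\s*\*\*\* IR Dump After (.+?) \*\*\*:?\s*$")
FUNC_RE = re.compile(r"^# Machine code for function ([^:\s]+):")
BLOCK_RE = re.compile(r"^(bb\.\d+[^\s:(]*)")


@dataclass
class Fragment:
    kind: str  # "dump", "pass", "function" or "block"
    name: str
    lines: list = field(default_factory=list)     # lines owned directly
    children: list = field(default_factory=list)  # nested fragments

    def text(self):
        """View: own lines followed by the lines of all children."""
        out = list(self.lines)
        for child in self.children:
            out.extend(child.text())
        return out

    def find(self, kind, name):
        """Return the first nested fragment with this kind and name."""
        for child in self.children:
            if child.kind == kind and child.name == name:
                return child
            hit = child.find(kind, name)
            if hit is not None:
                return hit
        return None

    def same_as(self, other):
        """Comparison: two fragments are equal if their text is equal."""
        return self.text() == other.text()


def parse_dump(text):
    """Split a dump into pass -> function -> block fragments."""
    root = Fragment("dump", "<root>")
    cur_pass = cur_func = cur_block = None
    for line in text.splitlines():
        m = PASS_RE.match(line)
        if m:
            cur_pass = Fragment("pass", m.group(1), [line])
            root.children.append(cur_pass)
            cur_func = cur_block = None
            continue
        if cur_pass is None:
            root.lines.append(line)  # preamble before the first banner
            continue
        m = FUNC_RE.match(line)
        if m:
            cur_func = Fragment("function", m.group(1), [line])
            cur_pass.children.append(cur_func)
            cur_block = None
            continue
        m = BLOCK_RE.match(line) if cur_func is not None else None
        if m:
            cur_block = Fragment("block", m.group(1), [line])
            cur_func.children.append(cur_block)
            continue
        # Any other line belongs to the innermost open fragment.
        owner = cur_block or cur_func or cur_pass
        owner.lines.append(line)
    return root


def diff(a, b):
    """Diff: unified diff between the text of two fragments."""
    return list(difflib.unified_diff(
        a.text(), b.text(), fromfile=a.name, tofile=b.name, lineterm=""))


if __name__ == "__main__":
    # Made-up dump text with a hypothetical pass name, for illustration only.
    before = (
        "# *** IR Dump After Foo Pass (foo) ***:\n"
        "# Machine code for function f: IsSSA\n"
        "bb.0.entry:\n"
        "  RET %0\n"
    )
    after = before.replace("RET %0", "RET $x0")
    a = parse_dump(before).find("function", "f")
    b = parse_dump(after).find("function", "f")
    print("\n".join(diff(a, b)))
```

Note that `find` returns only the first match, so a pass that runs several times would need an index or a list-returning variant. From the REPL one would typically parse both dumps once and then narrow down with `find` before diffing.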

Now of course everyone has a different opinion on what type of tool is useful for them, and I guess that is the main reason that no such tool exists (to my knowledge). With this in mind, limiting the work to just a library that one can either use in Python REPL mode or write some simple Swiss-army-knife-like tool around might be the best bet.

Anyway, I found the following useful: D158825 [WIP] Helper script for analyzing mir dumps (library and small tool). If anyone agrees, feel free to add yourself as a reviewer and help finish the last 10%.

FWIW, Compiler Explorer has the “LLVM Opt Pipeline” tool, which does a pretty good job of this. Taking the example from the helper script patch: Compiler Explorer

-print-after-all doesn’t actually produce MIR. It produces a debug format that looks almost the same but is not quite MIR.

I also think we should just improve the debug infrastructure rather than hacking around whatever happens to be printed today. We should have the ability to dump actual MIR to files in the first place, and not just spam everything to stderr.