Dump IR after each pass into a separate file

Hi all,

In my previous projects, we always had the ability to dump the IR after each applied optimization to a separate file.

That way, you can look into the folder for the method you care about and compare IR files to figure out, for example, which optimization removes a given instruction. Or use grep to find where an instruction first appears. Or use vimdiff to see how a specific optimization changes the IR. There are many ways to use these separate files.
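The grep workflow described above can also be scripted. The following is a minimal sketch, assuming the per-pass dumps live in one directory and that their file names sort in pass order (for example 000-..., 001-...). The function name first_dump_containing and the directory layout are illustrative assumptions, not part of LLVM.

```python
import os


def first_dump_containing(dump_dir, needle):
    """Return the name of the first dump file (in sorted order) whose text
    contains `needle`, or None if no file does.

    Assumes file names sort in pass order, e.g. 000-..., 001-...
    """
    for name in sorted(os.listdir(dump_dir)):
        path = os.path.join(dump_dir, name)
        if not os.path.isfile(path):
            continue  # ignore subdirectories
        with open(path, errors="replace") as f:
            if needle in f.read():
                return name
    return None
```

Searching for the absence of an instruction instead (to find the pass that removed it) is the same loop with the condition inverted.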

However, many compilers lack this ability, and LLVM is one of them. They write every IR dump to a single output stream (stderr, in LLVM's case).
So my question is: how do you debug optimizations? Do you just scroll through one big dump file, setting bookmarks and so on? Or is there a tool that presents the dump in a more readable way?

Most of the code around this sort of thing is in StandardInstrumentations.cpp. See -print-changed, which only prints when there are changes to the IR. There’s a variant -print-changed=dot-cfg which dumps the IR as a graph to pdf files. This could be extended to dump textual IR, or really whatever you want.

I use the script below to post-process the dump. It's not pretty, but it has worked for me so far.
Be aware that non-module passes will result in files containing only partial modules.

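#!/usr/bin/env python3
# Split a combined IR dump into one file per "IR Dump After" banner.
# Usage: <script> <dump-file> [number-of-leading-dumps-to-skip]
import os
import sys

mod_dir = 'modules'
opening_banner = '*** IR Dump After'

os.makedirs(mod_dir, exist_ok=True)
with open(sys.argv[1]) as src:
    lines = src.readlines()

skip = int(sys.argv[2]) if len(sys.argv) > 2 else 0
mod_no = 0
out = None  # current output file; None until the first kept banner

with open(os.path.join(mod_dir, 'legend'), 'w') as legend:
    for line in lines:
        # Some LLVM versions prefix the banner with a comment marker.
        if line.lstrip('#; ').startswith(opening_banner):
            if out:
                out.close()
                out = None
            if skip > 0:
                skip -= 1  # drop this dump entirely
                continue
            out = open('%s/m%i.ll' % (mod_dir, mod_no), 'w')
            legend.write('m%i.ll : %s\n' % (mod_no, line.strip()))
            mod_no += 1
            continue
        # Text before the first kept banner has no file and is ignored.
        if out:
            out.write(line)

if out:
    out.close()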

I’d be much happier if we had an option to dump the IR/MIR into separate files. The current -print-before/-print-after output is particularly nasty for MIR, since it uses the less structured debug printing syntax and doesn’t really go through the MIR printer.

Here are my two cents: -print-module-scope dumps the whole module instead of only function bodies, and -filter-print-funcs restricts the dumps to the functions you name.
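To show how these options combine, here is a small sketch that assembles an opt command line. The helper name build_opt_command and its parameters are hypothetical; the flags themselves (-passes, -print-after-all, -print-module-scope, -filter-print-funcs, -disable-output) are existing opt options, though their exact behavior may vary between LLVM versions.

```python
def build_opt_command(input_file, passes, funcs=(), module_scope=False):
    """Build an argv list for `opt` that prints the IR after every pass.

    funcs:        function names to restrict the dumps to (optional).
    module_scope: if True, dump the whole module rather than only the
                  function a pass ran on.
    """
    cmd = ["opt", "-passes=" + passes, "-print-after-all"]
    if module_scope:
        cmd.append("-print-module-scope")
    if funcs:
        # -filter-print-funcs takes a comma-separated list of names.
        cmd.append("-filter-print-funcs=" + ",".join(funcs))
    # -disable-output suppresses the final module; only the dumps remain.
    cmd += ["-disable-output", input_file]
    return cmd
```

The resulting list can be handed to subprocess.run. The dumps are typically emitted on stderr, so that is the stream to capture or redirect.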

Generally, I agree that per-optimization dumps can sometimes be more useful. But I also remember cases where I had to open a lot of files just to discover where a change happened. That scenario is more convenient with LLVM’s current approach.

Try llvm/utils/chunk-print-before-all.py

Thanks, I didn’t know about that. I have also written my own:

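#!/usr/bin/awk -f

# Split IR dumps into separate files, one per "*** IR Dump ... ***" banner.
# Requires GNU awk (gawk) for gensub().

/\*\*\* IR Dump/ {
    if (f) close(f)    # keep only one output file open at a time
    f = gensub(/.*\*\*\* IR Dump (.*) \*\*\*.*/, "\\1", "g")
    f = gensub(/[^A-Za-z0-9]/, "-", "g", f)
    f = gensub(/--*/, "-", "g", f)
    f = gensub(/-$/, "", "g", f)
    f = sprintf("%03d-%.200s", n, f)
    print f            # report the file name on stdout
    ++n
}

{
    if (f) print >> f
}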

I wrote up some basics a while back; see public-notes/llvm-debugging-basics.rst (master branch) in the preames/public-notes repository on GitHub.

For your and anybody else’s awareness, there’s the -print-module-scope command line option which changes this behavior. It has occasionally saved me a lot of headaches…

As for your first point of using Compiler Explorer, note that they now have an LLVM pipeline view. It’s incredibly useful, and I have used it a lot.

On the left side, you’ll see all passes that change the IR, and you can select each pass and see how it transforms the IR on the right.
