llvm-objdump: print instruction histogram

Hello,

I am interested in contributing an option to llvm-objdump that prints a histogram of all (static) instructions present in the object, so that you could run something like this and get the following output:

$ llvm-objdump -d --mnemonic-hist test.o
..
Mnemonic histogram:
         ldr:   120 (27.3973%)
         mov:    96 (21.9178%)
         blx:    56 (12.7854%)
         bl:     31 (7.07763%)
         str:    26 (5.93607%)
         add:    18 (4.10959%)
         b:      12 (2.73973%)
         sub:    10 (2.28311%)
         cmp:     9 (2.05479%)
         ...

The question is where this should live:

  • The example output above could also be produced with a one-liner on the command line. That is not very straightforward or easy to read, but it is certainly doable.
  • A one-liner would no longer work once additional instruction information, such as encoding width, needs to be printed. A script would then be required, which we could e.g. contribute to llvm/utils.
  • Neither of these two options would beat the convenience of having a built-in option.
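
For comparison, the script route could look roughly like the sketch below. It is only an illustration, not what the patch does: it assumes the disassembly comes from `llvm-objdump -d --no-show-raw-insn`, that each instruction line has the shape "<hex address>: <mnemonic> <operands>", and the function names are made up for this example.

```python
import re
import subprocess
from collections import Counter

# Assumed line shape: "<hex address>:<whitespace><mnemonic><whitespace><operands>".
# Symbol lines such as "00000000 <main>:" do not match, because the colon
# does not directly follow the address.
INSN_RE = re.compile(r"^\s*[0-9a-fA-F]+:\s+(\S+)")


def mnemonic_histogram(disassembly: str) -> Counter:
    """Count how often each mnemonic appears in disassembly text."""
    counts = Counter()
    for line in disassembly.splitlines():
        match = INSN_RE.match(line)
        if match:
            counts[match.group(1)] += 1
    return counts


def format_histogram(counts: Counter) -> str:
    """Render the counts, most frequent first, with a percentage of the total."""
    total = sum(counts.values())
    rows = []
    for mnemonic, n in counts.most_common():
        rows.append(f"{mnemonic:>12}: {n:5d} ({100.0 * n / total:.4f}%)")
    return "\n".join(rows)


def run_objdump(path: str) -> str:
    """Disassemble `path`; assumes llvm-objdump is on PATH."""
    result = subprocess.run(
        ["llvm-objdump", "-d", "--no-show-raw-insn", path],
        capture_output=True, text=True, check=True,
    )
    return result.stdout
```

Usage would be something like `print(format_histogram(mnemonic_histogram(run_objdump("test.o"))))`. Printing extra per-instruction information such as encoding width would mean parsing the raw instruction bytes as well, which this sketch deliberately leaves out.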

If there are any opinions on this, please add them here or on the review: D125008 [llvm-objdump] Print Mnemonic Histogram

Cheers.

MLIR has a lot of Python bindings. Wouldn’t Python bindings help with writing a lot of analysis tools without messing with objdump?

I think that’s the question: whether this is messing with objdump or not. The change is very small and self-contained, and in my opinion it probably doesn’t get in the way of anything. The big plus would be the convenience of just using a built-in option. But I see that the same could be achieved with a script driving objdump. I don’t know much about Python bindings, but I could look into that.

This is cool!

This seems similar to the instruction mix remarks built into the AsmPrinter. It would be interesting to compare the results from objdump with the compiler-generated remarks.