GDB pretty printers for LLVM ADTs

In r271357 I committed a GDB pretty printer script for the following types:

  • ArrayRef
    $1 = llvm::ArrayRef of length 3 = {1, 0, 7}

  • StringRef
    $2 = "foo\000bar"
    $3 = "fo"

  • SmallString
    $4 = "foo\000bar"

  • SmallVector(Impl)
    $5 = llvm::SmallVector of length 3, capacity 3 = {1, 0, 7}

All of these visualizers are pretty simple and robust, and should work when debugging a core dump (i.e. a non-live process in which no code can be executed).

Note that the StringRef visualizer now properly terminates the string at the specified length, so the output can include embedded null characters and can also end before a null character is reached. The same applies to SmallString.
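As a rough illustration of the length handling (a sketch under stated assumptions, not necessarily what the committed script does): a printer can pass the stored length to GDB instead of letting it scan for a terminating null. The member names Data and Length are assumptions about StringRef's layout, and val is expected to be a gdb.Value.

```python
class StringRefPrinter:
    """Sketch of a length-aware string printer; `val` should be a gdb.Value."""

    def __init__(self, val):
        self.val = val

    def to_string(self):
        # Assumed member names: 'Data' (char pointer) and 'Length' (size).
        length = int(self.val['Length'])
        # Giving an explicit length stops GDB from searching for a NUL,
        # so embedded NULs are kept and unterminated buffers are not overrun.
        return self.val['Data'].string(length=length)

    def display_hint(self):
        # Ask GDB to format the result as a quoted string.
        return 'string'
```

With a seven-byte buffer holding foo, a null, then bar, a Length of 7 yields the whole buffer and a Length of 2 yields just fo, which matches the $2 and $3 examples above.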

Syntax and styling for ArrayRef and SmallVector are based on the print styling of std::vector. If people feel the smallness should be shown for any of the Small* types, we could include that. If the element type of a sequence is important, we could include that too, but since std::vector doesn’t show the element type I took that as a good model to follow, and as we improve the pretty printers for more types the elements should become self-explanatory.
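To make the format concrete, here is a small sketch of how a std::vector-style summary and element list could be assembled. The helper names are hypothetical; a real printer would compute the length and capacity from the container's fields and return these from its to_string and children methods.

```python
def sequence_summary(type_name, length, capacity=None):
    """Build the 'X of length N[, capacity M]' text shown before the elements."""
    text = '%s of length %d' % (type_name, length)
    if capacity is not None:
        text += ', capacity %d' % capacity
    return text


def indexed_children(elements):
    """Pair each element with an index label, as GDB's children() expects."""
    return [('[%d]' % i, elem) for i, elem in enumerate(elements)]
```

Adding the "smallness" or the element type would only mean extending the summary text, so either could be bolted on later without touching how elements are listed.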

Patches and feedback welcome to add more visualizers, improve existing ones, etc.

How do you access/use these pretty printers?

Currently, the simplest thing to do is:

source /path/to/your/llvm/src/utils/gdb-scripts/prettyprinters.py

You can put this in your ~/.gdbinit if you want to make it more convenient.
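For anyone curious what sourcing such a script amounts to, here is a minimal sketch of registration by type name. This is not the committed script: the patterns, the placeholder printer class, and the helper names are all hypothetical. The gdb module only exists inside GDB's embedded Python, hence the guarded import.

```python
import re

try:
    import gdb  # present only inside GDB's embedded Python
except ImportError:
    gdb = None  # lets the matching logic be exercised outside GDB


class SummaryPrinter:
    """Placeholder printer; a real one would read the value's fields."""

    def __init__(self, label, val):
        self.label = label
        self.val = val

    def to_string(self):
        return self.label


# Hypothetical patterns, one per family of template instantiations.
PATTERNS = [
    ("llvm::ArrayRef", re.compile(r"^llvm::ArrayRef<.*>$")),
    ("llvm::SmallVector", re.compile(r"^llvm::SmallVector(Impl)?<.*>$")),
    ("llvm::SmallString", re.compile(r"^llvm::SmallString<.*>$")),
    ("llvm::StringRef", re.compile(r"^llvm::StringRef$")),
]


def match_label(type_name):
    """Return the label of the first pattern matching type_name, else None."""
    for label, pattern in PATTERNS:
        if pattern.match(type_name):
            return label
    return None


def lookup(val):
    """Lookup function in the shape GDB expects: a printer object or None."""
    type_name = str(val.type.unqualified().strip_typedefs())
    label = match_label(type_name)
    return SummaryPrinter(label, val) if label else None


if gdb is not None:
    # GDB consults each function in this list when it needs to print a value.
    gdb.pretty_printers.append(lookup)
```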

Possible Usability Improvements

Just sourcing the script in .gdbinit doesn’t scale if you have multiple LLVM source trees or debug builds that might be out of sync with one another. The ABI of the ADTs probably doesn’t change often enough for this to be very annoying, but Clang’s ASTs, LLVM IR nodes, etc. might change often enough that users in this situation would hit the problem. Having to go in and add another line whenever we add other scripts (Clang pretty printers, LLD pretty printers, etc.) would also be unfortunate.

.debug_gdb_scripts provides a way to load these scripts on demand, so we wouldn’t need to update each of our local .gdbinit files every time a new set of pretty printers was added (probably one set per project: one for Clang, one for LLVM, one for LLD, etc.).

It does still require a .gdbinit change (or a manual command) to adjust the auto-load path so that the scripts are allowed to load automatically.

Depending on how it’s used, it doesn’t entirely fix the “what if I have multiple different LLVM things I’m debugging with incompatible pretty printers” problem. Using just the script-name version doesn’t help, because the paths the scripts are loaded from wouldn’t distinguish one source tree from another.

But if we use the inline script version, the right version is embedded in the binary and available every time. This gets a bit ugly: writing Python inside inline asm isn’t really comfortable (even with C++ raw string literals to get past one level of escaping), but we could use the .incbin assembler directive to include a Python source file straight into the inline asm in the right place. The only problem is that Clang’s .incbin support is buggy in a way that makes this not work (disabling the integrated assembler does demonstrate this working successfully).
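To show the difference between the two kinds of entry, here is a sketch of the bytes each one contributes to the section, based on my reading of the GDB manual's description of the layout. The function names are hypothetical and the prefix values should be checked against the manual before relying on them.

```python
# Prefix bytes as I read them in the GDB manual; treat these as assumptions.
PYTHON_FILE = 1  # entry names a script to be found via the auto-load machinery
PYTHON_TEXT = 4  # entry carries the script source itself


def file_entry(script_name):
    """Entry that only names a script: prefix byte, name, trailing NUL."""
    return bytes([PYTHON_FILE]) + script_name.encode("ascii") + b"\0"


def inline_entry(unique_name, source):
    """Entry that embeds the script: prefix, name line, source, trailing NUL."""
    if not unique_name or any(ch.isspace() for ch in unique_name):
        raise ValueError("script name must be non-empty with no whitespace")
    body = source.encode("utf-8")
    if b"\0" in body:
        raise ValueError("script text cannot contain a NUL byte")
    return bytes([PYTHON_TEXT]) + unique_name.encode("ascii") + b"\n" + body + b"\0"
```

The file entry is tiny but leaves GDB to find the script on disk, which is where the multiple-source-tree problem comes from. The inline entry carries the whole script, so the binary and its printers cannot get out of sync.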

Appendix of .debug_gdb_scripts examples, so I don’t have to keep these somewhere else for reference. Obviously these aren’t terribly portable, but on platforms where they don’t work (so long as we #define them away) users are no worse off: people can still source the Python script directly as described above.

  1. basic usage:

Pretty printers were something that I was really missing when debugging. Thanks David!