Print expanded debug metadata in IR for debugging LLVM

I’m wondering if there’s appetite for a flag that tells LLVM to print a brief version of debug info metadata nodes inline when printing IR.

The two cases that stand out to me are:

a) printing DILocation line, col, file, and inlinedAt fields inline after !dbg attachments,
b) printing DILocalVariable variable names inline in debug intrinsics.

This would make it easier to understand debug info in IR test cases. It would be especially useful for looking at -print-after-all traces because the metadata number for a given node may change after an optimisation pass, so following specific debug intrinsics can become quite a laborious task.

I am specifically proposing that this is a write-only option that prints illegal IR, i.e. output that is not meant to be parsed back in. Though I suppose we could consider using comments if it’s not too difficult to accommodate option (b) above?

Here’s a source code example and output sketch (note: inlining f0 into f1 and f1 into f2).

__attribute__((always_inline))
static void f0() { int a = 0; }

__attribute__((always_inline))
static void f1() {
  int a = 1;
  f0();
}

void f2() {
  int a = 2;
  f1();
}

Instead of what we have now:
clang -O0 -g -emit-llvm -S test.cpp -o -

...
define dso_local void @_Z2f2v() #0 !dbg !8 {
entry:
  %a.i1 = alloca i32, align 4
  %a.i = alloca i32, align 4
  %a = alloca i32, align 4
  call void @llvm.dbg.declare(metadata ptr %a, metadata !12, metadata !DIExpression()), !dbg !14
  store i32 2, ptr %a, align 4, !dbg !14
  call void @llvm.dbg.declare(metadata ptr %a.i, metadata !15, metadata !DIExpression()), !dbg !17
  store i32 1, ptr %a.i, align 4, !dbg !17
  call void @llvm.dbg.declare(metadata ptr %a.i1, metadata !19, metadata !DIExpression()), !dbg !21
  store i32 0, ptr %a.i1, align 4, !dbg !21
  ret void, !dbg !23
}
...

We’d get something like this:
clang -O0 -g -emit-llvm -S test.cpp -o - -mllvm -expand-dbg-metadata

...
define dso_local void @_Z2f2v() #0 !dbg !8 {
entry:
  %a.i1 = alloca i32, align 4
  %a.i = alloca i32, align 4
  %a = alloca i32, align 4
  call void @llvm.dbg.declare(metadata ptr %a, metadata !12 ('a'), metadata !DIExpression()), !dbg !14 (11: 7, test.cpp)
  store i32 2, ptr %a, align 4, !dbg !14 (11: 7, test.cpp)
  call void @llvm.dbg.declare(metadata ptr %a.i, metadata !15 ('a'), metadata !DIExpression()), !dbg !17 (6: 7, test.cpp) ->  (12: 3, test.cpp)
  store i32 1, ptr %a.i, align 4, !dbg !17 (6: 7, test.cpp) ->  (12: 3, test.cpp)
  call void @llvm.dbg.declare(metadata ptr %a.i1, metadata !19 ('a'), metadata !DIExpression()), !dbg !21 (2: 24, test.cpp) ->  (7: 3, test.cpp) ->  (12: 3, test.cpp)
  store i32 0, ptr %a.i1, align 4, !dbg !21 (2: 24, test.cpp) ->  (7: 3, test.cpp) ->  (12: 3, test.cpp)
  ret void, !dbg !23 (13: 1, test.cpp)
}
...
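
For comparison, the comment-based alternative mentioned above can be approximated outside LLVM by post-processing textual IR. The following is only a rough sketch under stated assumptions, not the proposed flag or the local patch: the function names are hypothetical, the sample metadata nodes (`!14`, `!17`, `!18` and their scopes) are made up for illustration because the original listing elides them, and file names are left out because resolving them means walking the scope chain to a DIFile.

```python
import re

# Matches definitions such as:
#   !17 = !DILocation(line: 6, column: 7, scope: !16, inlinedAt: !18)
LOC_DEF = re.compile(r"^(![0-9]+) = (?:distinct )?!DILocation\((.*)\)\s*$")
# Matches a !dbg attachment that ends an instruction line.
DBG_USE = re.compile(r"!dbg (![0-9]+)\s*$")


def parse_locations(ir_text):
    """Map each DILocation id to (line, column, inlinedAt id or None)."""
    locs = {}
    for line in ir_text.splitlines():
        m = LOC_DEF.match(line)
        if not m:
            continue
        fields = dict(f.split(": ", 1) for f in m.group(2).split(", "))
        locs[m.group(1)] = (
            int(fields.get("line", 0)),
            int(fields.get("column", 0)),
            fields.get("inlinedAt"),
        )
    return locs


def describe(loc_id, locs):
    """Render a location and its inlinedAt chain, innermost first."""
    parts = []
    seen = set()  # guards against malformed, cyclic input
    while loc_id in locs and loc_id not in seen:
        seen.add(loc_id)
        line, col, inlined_at = locs[loc_id]
        parts.append(f"{line}:{col}")
        loc_id = inlined_at
    return " -> ".join(parts)


def annotate(ir_text):
    """Append a ';' comment with the resolved location to !dbg users."""
    locs = parse_locations(ir_text)
    out = []
    for line in ir_text.splitlines():
        m = DBG_USE.search(line)
        if m and m.group(1) in locs:
            line = f"{line} ; {describe(m.group(1), locs)}"
        out.append(line)
    return "\n".join(out)


# Illustrative input only; these metadata nodes are assumptions.
SAMPLE = """\
  store i32 2, ptr %a, align 4, !dbg !14
  store i32 1, ptr %a.i, align 4, !dbg !17
!14 = !DILocation(line: 11, column: 7, scope: !8)
!17 = !DILocation(line: 6, column: 7, scope: !16, inlinedAt: !18)
!18 = distinct !DILocation(line: 12, column: 3, scope: !8)
"""

if __name__ == "__main__":
    print(annotate(SAMPLE))
```

Because the annotation follows a `;`, the result should still parse as IR. The obvious drawback of a text filter is that it depends on the exact spelling of the printed metadata, which an in-tree printer would not.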

I’ve had a patch hanging around locally for a long time that essentially does this, and I find it really useful.

Would anyone else find this useful? And does anyone have any strong opinions on this proposed output format (including whether printing illegal IR is OK, even behind a flag)?

Thanks,
Orlando

I like this idea!

It also makes life easier when dumping instructions inside the debugger.

I’m more inclined toward the comment option. I’m not a big fan of super long lines in text files or terminals. Can you elaborate a little more on the difficulty of printing the variable name in a comment? (My guess is that you need to accommodate both the line number and the variable name digest in a single comment line.)
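
To make the single-comment question concrete, here is a hedged sketch of a text filter that puts the variable name into a trailing comment and joins it onto a comment that is already there (for example, a line/column note). It is an illustration only: `annotate_variables` is a hypothetical name, the sample metadata nodes are assumptions, and the regular expressions only cover the simple `llvm.dbg.declare` / `llvm.dbg.value` spellings shown in this thread.

```python
import re

# Matches definitions such as:
#   !12 = !DILocalVariable(name: "a", scope: !8, file: !1, line: 11, type: !13)
VAR_DEF = re.compile(r'^(![0-9]+) = !DILocalVariable\(name: "([^"]*)"')
# Captures the variable operand of a simple dbg.declare / dbg.value call.
VAR_USE = re.compile(
    r"@llvm\.dbg\.(?:declare|value)\(metadata [^,]+, metadata (![0-9]+),"
)


def annotate_variables(ir_text):
    """Append the variable name to debug intrinsic lines as a ';' comment."""
    lines = ir_text.splitlines()

    # First pass: collect variable names by metadata id.
    names = {}
    for line in lines:
        m = VAR_DEF.match(line)
        if m:
            names[m.group(1)] = m.group(2)

    # Second pass: annotate the intrinsic calls that use those ids.
    out = []
    for line in lines:
        m = VAR_USE.search(line)
        if m and m.group(1) in names:
            note = f'var "{names[m.group(1)]}"'
            # Heuristic: extend an existing trailing comment if there is one.
            sep = ", " if " ; " in line else " ; "
            line = f"{line}{sep}{note}"
        out.append(line)
    return "\n".join(out)


# Illustrative input only; these metadata nodes are assumptions.
SAMPLE = """\
  call void @llvm.dbg.declare(metadata ptr %a, metadata !12, metadata !DIExpression()), !dbg !14
  call void @llvm.dbg.declare(metadata ptr %a.i, metadata !15, metadata !DIExpression()), !dbg !17 ; 6:7 -> 12:3
!12 = !DILocalVariable(name: "a", scope: !8, file: !1, line: 11, type: !13)
!15 = !DILocalVariable(name: "a", scope: !16, file: !1, line: 6, type: !13)
"""

if __name__ == "__main__":
    print(annotate_variables(SAMPLE))
```

In this form both pieces of information fit in one comment, at the cost of exactly the long lines mentioned above.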

Seems useful to me to have the option to add it as comments while printing IR. I have out-of-tree code that can be used with the AssemblyAnnotationWriter or addAsmPrinterHandler to annotate the IR, as it is being printed, with line number information and such.

I think something like this could be very useful, but I wonder if everyone can agree on what nodes to inline and if that should maybe be configurable. I feel pretty strongly about never printing illegal IR, so if you could either literally inline N levels of nodes or — as others suggested — print comments, that might be better.

It would definitely be useful! :smile: For my own debug info work locally, I run a filter pass which inlines these things and removes a lot of the noise:

define void @example(i32 %n, i32 %size, i32* %data) {
entry:
  %n.addr = alloca i32
  %size.addr = alloca i32
  %data.addr = alloca i32*
  %i = alloca i32
  %comp = alloca i32
  store i32 %n, i32* %n.addr
  @dbg.declare(i32* %n.addr, "n" l1), l1 c23
  store i32 %size, i32* %size.addr
  @dbg.declare(i32* %size.addr, "size" l1), l1 c35
  store i32* %data, i32** %data.addr
  @dbg.declare(i32** %data.addr, "data" l1), l1 c51
  @dbg.declare(i32* %i, "i" l2), l2 c7
  store i32 0, i32* %i, l2 c7
  br label %while.cond, l4 c3

The version I’m using does remove lots of attributes that would be needed in other cases, so it’s probably not as useful to others directly… but in any case, I definitely support the general idea!

I find it’s far easier to understand what’s happening with debug info when the variable names and such are shown inline.

Thanks everyone! It sounds like it would be useful to others then, and it’s nice to hear that others have their own out-of-tree approaches to this too. Summarising the comments, it looks like:

a) We should not print illegal IR. e.g. print the information in comments or actually inline the nodes,
b) We might be able to use AssemblyAnnotationWriter for this task.

By the sounds of it my local hack probably won’t cut it. I probably won’t be able to take a look at implementing a more robust solution right away, but I will add it to my TODO list for when I get a chance. If anyone else gets there before I do I’d be very happy to help review it!


I haven’t looked into the comment approach yet; I was just mentioning it offhand as a potential option.


I didn’t know about AssemblyAnnotationWriter - that sounds like a very sensible approach!