[RFC] Moving llvm.dbg.value out of the instruction stream

One of the major drawbacks of our current variable tracking debug info design is that it puts lots of llvm.dbg.value instructions in the instruction stream. After the debug info BoF at the 2018 dev meeting (last week), I started to seriously consider a design where we track variable locations on the side of the instruction stream. People seemed to agree this idea was good enough to at least explore, so I put together some exploratory data and ideas.

Our current design tracks variable values with llvm.dbg.value instructions, which live in the instruction stream. This is crummy, because passes have to know to filter them. Today we have filtering iterators, but most passes have not been audited to use them, and we regularly have bugs where enabling debug info changes what the optimizer does. If we move variable tracking out of the instruction stream, we can fix this entire class of bugs by design.

Second, the current design slows down basic block iteration. My intuition says that, after inlining, we have tons of dbg.value instructions clogging up our basic blocks, slowing down our standard optimizer pipeline. Linked list iteration isn’t fast on modern architectures, so if dbg.values are an appreciable source of elements, moving them out of the list will speed up compilation and make -g less of a compile-time bear.

One thing this change will not address is the kind of use-before-def bugs we get today where dbg info unaware passes sink or hoist SSA instructions across the dbg.value that uses them. It has been suggested that we should attach the dbg.value directly to the instruction producing the value, but I think this is a mistake, because it only handles the common case, and not the special case where the value is produced far from the point of the assignment. Even if we have the special case attachment for value-producing instructions, we need new APIs to help passes decide how to update the debug info when sinking or hoisting across blocks.
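As a small illustration of that special case (a sketch with made-up names, using today's intrinsic): the value can be produced in one block while the source-level assignment happens in another, so attaching the dbg.value to the producing instruction would record the assignment in the wrong place:

  entry:
    %x = call i32 @foo()
    br i1 %cond, label %then, label %exit

  then:
    ; the source-level assignment "v = x;" occurs here, far from where %x was produced
    call void @llvm.dbg.value(metadata i32 %x, metadata !123, metadata !DIExpression()), !dbg !456
    br label %exit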

Data & Motivation

When working on CodeGen I was also pondering why we have DBG_VALUEs in the schedule of our functions, when they are really about the schedule in our source code which isn’t necessarily the same… I also reached the conclusion that I’d rather have a side table with a schedule for the debug variables (whatever that would mean in practice).

I really like the idea presented here to attach the debug_values to the instructions producing the values!

On the CodeGen side I would expect things like copy coalescing, spilling/reloading, scheduling, and live range splitting to become easier, as we would no longer need to rewrite instructions somewhere else in the schedule but just copy debug-var annotations around as we create the copies, spills, and reloads, or merge annotations in the case of coalescing. It would also allow us to remove code scattered all over the place that ignores DBG_VALUEs.
From a software engineering point of view this seems great to me, as it shifts some complexity from code scattered around CodeGen into complexity in LiveDebugVariables, which is a specialized pass…

Some comments:

One of the major drawbacks of our current variable tracking debug info design is that it puts lots of llvm.dbg.value instructions in the instruction stream. After the debug info BoF at the 2018 dev meeting (last week), I started to seriously consider a design where we track variable locations on the side of the instruction stream. People seemed to agree this idea was good enough to at least explore, so I put together some exploratory data and ideas.

Our current design tracks variable values with llvm.dbg.value instructions, which live in the instruction stream. This is crummy, because passes have to know to filter them. Today we have filtering iterators, but most passes have not been audited to use them, and we regularly have bugs where enabling debug info changes what the optimizer does. If we move variable tracking out of the instruction stream, we can fix this entire class of bugs by design.

Second, the current design slows down basic block iteration. My intuition says that, after inlining, we have tons of dbg.value instructions clogging up our basic blocks, slowing down our standard optimizer pipeline. Linked list iteration isn’t fast on modern architectures, so if dbg.values are an appreciable source of elements, moving them out of the list will speed up compilation and make -g less of a compile-time bear.

One thing this change will not address is the kind of use-before-def bugs we get today where dbg info unaware passes sink or hoist SSA instructions across the dbg.value that uses them. It has been suggested that we should attach the dbg.value directly to the instruction producing the value, but I think this is a mistake, because it only handles the common case, and not the special case where the value is produced far from the point of the assignment. Even if we have the special case attachment for value-producing instructions, we need new APIs to help passes decide how to update the debug info when sinking or hoisting across blocks.

Data & Motivation

I used my compile-time test case of the day for this: SemaChecking.cpp. I re-compiled opt with the attached patch, which gets the total number of Instructions after CGP, and the number of dbg intrinsics. I compiled SemaChecking.cpp to bitcode with clang -cc1 -debug-info-kind=limited -gcodeview -mllvm -instcombine-lower-dbg-declare=0 to match what Chromium uses. I think these numbers are general, though. Here are the numbers I got from my patch:

160402 codegenprepare - Total number of llvm.dbg.* instructions
296080 codegenprepare - Total number of instructions

So, 54% of all instructions are dbg.(value|declare) instructions today! Moving them out of the main BB ilist would speed up BB iteration by 2x. That’s not necessarily where we spend our time. I happen to already know that this test case does a lot of BB scanning (PR38829), so I know it will help this test case, but it may not matter if we make local dominance not scan.

In any case, here is the time to compile to bitcode for this example:
-O2 -g: 89.6s
-O2: 53.6s
-O2 -gmlt: 52.6s

-gmlt should be slower than no debug info, so clearly my system is noisy, but I think the impact of dbg.value is still clearly observable.

The next thing I noticed is that dbg.values tend to appear in packs. My theory is that most of this is from multiple levels of inlining of helpers that forward parameters. Here is a histogram of the length of dbg intrinsic runs:

dbg intrinsic run length | freq
 1 | 13086
 2 |  6587
 3 |  3927
 4 |  1605
 5 |  1643
 6 |  1414
 7 |  2165
 8 |  1708
 9 |   226
10 |    36
11 |    26
12 |    14
13 |  1750
14 |    70
15 |   224
16 |     3
17 |   154
18 |     5
19 |     1
20 |     2
22 |   209
23 |   688
24 |   705
28 |     4
29 |     1
36 |     1
40 |     2
71 |     1

My understanding of the histogram above is that 52% of dbg.values are in runs of length 8 or greater. You can see some spikes here at 13 and 22-24, which I think supports my theory about inlining above.

This example is just a standard -O2 large TU in clang. I can only imagine that the dbg.value runs get longer in ThinLTO, where the inliner runs more.

Design

dbg.value is meant to associate an SSA value with a label. My basic idea is that it’s better to attach a label to the instruction it precedes rather than creating a separate instruction. However, a label should stay behind if the instruction it precedes moves. For example, Instruction::moveAfter/moveBefore should implicitly reattach their variable tracking info to the next instruction. While working on https://reviews.llvm.org/D51664, I realized it would be pretty easy to use the ilist callbacks currently used for symbol table updating to implement this.
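To make that concrete, here is a rough sketch using the !dbgvalue notation proposed in the IR-syntax section below (the names are made up; the point is only the reattachment behavior). Before a pass sinks the add:

  %v = call i32 @foo()
  !dbgvalue i32 %v, variable !123, loc !456   ; attached to the add below
  %a = add i32 %v, 1
  %b = mul i32 %v, 2

If the pass then sinks the add below the mul with Instruction::moveAfter, the label does not travel with it; it is implicitly reattached to the mul, which now occupies the position the label describes:

  %v = call i32 @foo()
  !dbgvalue i32 %v, variable !123, loc !456   ; now attached to the mul
  %b = mul i32 %v, 2
  %a = add i32 %v, 1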

I think making the attachments describe the values of variables before the instruction is good, because in the general case every block has at least one instruction, which typically doesn’t produce a value: the terminator. The only terminator that produces a value is invoke, and we can attach any variable tracking info for it to the first (non-phi?) instruction in the normal successor.
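For example (again a sketch in the proposed notation), a value produced by an invoke would have its annotation attached to the first instruction of the normal destination:

  %v = invoke i32 @foo()
          to label %normal unwind label %lpad

  normal:
    ; the annotation rides on the add, since the invoke itself is a terminator
    !dbgvalue i32 %v, variable !123, loc !456
    %u = add i32 %v, 1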

I think, overall, this is just an internal representation shift. We may want to change our bitcode and assembly language to better align with our internal representation, but it should be functionally equivalent.

IR syntax

This is kind of important, since it affects how developers think about these and keep them up to date. Consider the following C source:

int foo(void);
int bar(void) {
  int v = foo();
  v = foo();
  return v;
}

I’m proposing moving from something like this:

%v1 = call i32 @foo()
call void @llvm.dbg.value(metadata i32 %v1, metadata !123, metadata !DIExpression) !dbg !456
%v2 = call i32 @foo()
call void @llvm.dbg.value(metadata i32 %v2, metadata !123, metadata !DIExpression) !dbg !456
ret i32 %v2

To something like this:

%v1 = call i32 @foo()
!dbgvalue i32 %v1, variable !123, loc !456 [, expr …]
%v2 = call i32 @foo()
!dbgvalue i32 %v2, variable !123, loc !456 [, expr …]
ret i32 %v2

I really like this!

‘expr’ would be an optional DIExpression, expressed more compactly if possible, perhaps using our inline !DIExpression parsing for now.
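For instance, an assignment whose value lives behind a pointer might then be written as (a sketch only; the exact keyword spelling is open):

  !dbgvalue i32* %p, variable !123, loc !456, expr !DIExpression(DW_OP_deref)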

This also avoids the confusion that today the dbgloc attached to a dbg.value is not actually used to generate line table entries, it’s only for tracking distinct variables created from different inlined call sites.

Data structures

What we’re really trying to represent is two separate sequences. For simplicity and minimal change, I initially propose that we create a secondary linked list of some new DbgValue data structures. They would relate kind of like this:

inst1 → inst2 → inst3 → inst4
  \       \       \       \
  dv1  →  dv2  →  dv3  →  dv4

If we were to delete inst2, dv2 would be reattached to the following instruction:

inst1 → inst3 → inst4
  \      |  \        \
  dv1 → dv2 → dv3 → dv4

This example is confusing me. Why wouldn’t we just delete inst2? The debug variable schedule side table wouldn’t need to be touched and would stay “dv1->dv2->dv3->dv4”. We just happen to not have an instruction around any longer that produces a value for dv2 which is exactly what I would expect after deleting inst2…

I would expect that we need operations like:

  • Merging multiple debug_var annotations, so that in the end we can have debug_var annotations for multiple variables on a single instruction. Things like CSE, copy coalescing, and tail merging would need this (a rough sketch follows after this list).

  • Copying/duplicating debug_var annotations to other instructions. This would be used by inlining, loop unrolling, spilling, reloading, live range splitting…
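A rough illustration of the merge case, shown at the IR level in the proposed syntax (the MIR situation would be analogous; the names are made up). Before GVN, each add carries an annotation for a different variable:

  %a = add i32 %x, %y
  !dbgvalue i32 %a, variable !10, loc !456   ; "i", attached to the second add
  %b = add i32 %x, %y
  !dbgvalue i32 %b, variable !11, loc !456   ; "j", attached to the ret
  ret i32 %b

When GVN replaces %b with %a and deletes the second add, its annotation is reattached to the next instruction, so the ret ends up carrying annotations for both variables:

  %a = add i32 %x, %y
  !dbgvalue i32 %a, variable !10, loc !456
  !dbgvalue i32 %a, variable !11, loc !456
  ret i32 %a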

- Matthias

Thanks for writing this up! I think this is definitely worth exploring.

<...>

Design
-------

dbg.value is meant to associate an SSA value with a label. My basic idea is that it's better to attach a label to the instruction it precedes rather than creating a separate instruction. However, a label should stay behind if the instruction it precedes moves. For example, Instruction::moveAfter/moveBefore should implicitly reattach their variable tracking info to the next instruction. While working on https://reviews.llvm.org/D51664 ([IR] Lazily number instructions for local dominance queries), I realized it would be pretty easy to use the ilist callbacks currently used for symbol table updating to implement this.

I think making the attachments describe the values of variables before the instruction is good, because in the general case every block has at least one instruction, which typically doesn't produce a value: the terminator. The only terminator that produces a value is invoke, and we can attach any variable tracking info for it to the first (non-phi?) instruction in the normal successor.

I think, overall, this is just an internal representation shift. We may want to change our bitcode and assembly language to better align with our internal representation, but it should be functionally equivalent.

IR syntax
----------

This is kind of important, since it affects how developers think about these and keep them up to date. Consider the following C source:

  int foo(void);
  int bar(void) {
    int v = foo();
    v = foo();
    return v;
  }

I'm proposing moving from something like this:

  %v1 = call i32 @foo()
  call void @llvm.dbg.value(metadata i32 %v1, metadata !123, metadata !DIExpression) !dbg !456
  %v2 = call i32 @foo()
  call void @llvm.dbg.value(metadata i32 %v2, metadata !123, metadata !DIExpression) !dbg !456
  ret i32 %v2

To something like this:

  %v1 = call i32 @foo()
  !dbgvalue i32 %v1, variable !123, loc !456 [, expr ...]
  %v2 = call i32 @foo()
  !dbgvalue i32 %v2, variable !123, loc !456 [, expr ...]
  ret i32 %v2

At first glance, this example seems to contradict the definition in the Design section. If a !dbgvalue describes the debug info *before* the instruction, I would have expected the syntax to be like

  %v1 = call i32 @foo(), !dbg !1
  %v2 = call i32 @foo(), !dbg !2, !dbgvalue i32 %v1, variable !123, loc !456 [, expr ...]

So my question is: which instruction is the !dbgvalue attached to, and should we make this more explicit in the syntax?

'expr' would be an optional DIExpression, expressed more compactly if possible, perhaps using our inline !DIExpression parsing for now.

This also avoids the confusion that today the dbgloc attached to a dbg.value is not actually used to generate line table entries, it's only for tracking distinct variables created from different inlined call sites.

We could go one step further and only specify an optional inlinedAt field instead of a full location.
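E.g., something along these lines (purely a sketch of the idea, not settled syntax):

  !dbgvalue i32 %v1, variable !123                   ; not inlined
  !dbgvalue i32 %v2, variable !123, inlinedAt !789   ; from an inlined call site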

Alternatively, we could start paying attention to the !dbg location of dbg.values. When instructions are moved around by transformations, it is often ambiguous where a dbg.value belongs. If we tied a dbg.value more closely to a specific DILocation, some of that ambiguity might go away, but I couldn't come up with a consistent model that uses this approach so far.

One last thing to consider for both syntax and internal representation: it would be nice if the model naturally allowed for a future extension (PR39141, "Extend llvm.dbg.value to take more than one LLVM SSA value") where one dbg.value's DIExpression can refer to more than one SSA value.
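A hypothetical shape for such an extension, with an invented operand-referencing operator shown only to sketch the idea:

  !dbgvalue (i32 %a, i32 %b), variable !123, loc !456,
      expr !DIExpression(DW_OP_LLVM_arg, 0, DW_OP_LLVM_arg, 1, DW_OP_plus)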

-- adrian

When working on CodeGen I was also pondering why we have DBG_VALUEs in the schedule of our functions, when they are really about the schedule in our source code which isn’t necessarily the same… I also reached the conclusion that I’d rather have a side table with a schedule for the debug variables (whatever that would mean in practice).

I really like the idea presented here to attach the debug_values to the instructions producing the values!

Note that this is not necessarily the case, even if the examples made it look like it.

We still have to support constant dbg.values that need to be at a specific point in the instruction stream and don't refer to any SSA value.
For example, we can still have code like this:

  int x = foo();
  int y = 42;
  bar();
  y = x;
  baz();

that might lower into IR like

  %x = call i32 @foo()
  call void @llvm.dbg.value(i32 %x, DIVariable("x"), ...)
  call void @llvm.dbg.value(i32 42, DIVariable("x"), ...)
  call void @bar()
  call void @llvm.dbg.value(%x, DIVariable("y"), ...) ; %x is not materialized here, and we also can't hoist this.
  call void @baz()

-- adrian

[resending with a bugfix s/x/y/ in my example]

We still have to support constant dbg.values that need to be at a specific point in the instruction stream and don’t refer to any SSA value.
For example, we can still have code like this:

int x = foo();
int y = 42;
bar();
y = x;
baz();

that might lower into IR like

%x = call i32 @foo()
call void @llvm.dbg.value(i32 %x, DIVariable(“x”), …)
call void @llvm.dbg.value(i32 42, DIVariable(“y”), …)
call void @bar()
call void @llvm.dbg.value(i32 %x, DIVariable(“y”), …) ; %x is not materialized here, and we also can’t hoist this.
call void @baz()

– adrian

One of the major drawbacks of our current variable tracking debug info design is that it puts lots of llvm.dbg.value instructions in the instruction stream. After the debug info BoF at the 2018 dev meeting (last week), I started to seriously consider a design where we track variable locations on the side of the instruction stream. People seemed to agree this idea was good enough to at least explore, so I put together some exploratory data and ideas.

Our current design tracks variable values with llvm.dbg.value instructions, which live in the instruction stream. This is crummy, because passes have to know to filter them. Today we have filtering iterators, but most passes have not been audited to use them, and we regularly have bugs where enabling debug info changes what the optimizer does. If we move variable tracking out of the instruction stream, we can fix this entire class of bugs by design.

Second, the current design slows down basic block iteration. My intuition says that, after inlining, we have tons of dbg.value instructions clogging up our basic blocks, slowing down our standard optimizer pipeline. Linked list iteration isn’t fast on modern architectures, so if dbg.values are an appreciable source of elements, moving them out of the list will speed up compilation and make -g less of a compile-time bear.

One thing this change will not address is the kind of use-before-def bugs we get today where dbg info unaware passes sink or hoist SSA instructions across the dbg.value that uses them. It has been suggested that we should attach the dbg.value directly to the instruction producing the value, but I think this is a mistake, because it only handles the common case, and not the special case where the value is produced far from the point of the assignment. Even if we have the special case attachment for value-producing instructions, we need new APIs to help passes decide how to update the debug info when sinking or hoisting across blocks.

Data & Motivation

I used my compile-time test case of the day for this: SemaChecking.cpp. I re-compiled opt with the attached patch, which gets the total number of Instructions after CGP, and the number of dbg intrinsics. I compiled SemaChecking.cpp to bitcode with clang -cc1 -debug-info-kind=limited -gcodeview -mllvm -instcombine-lower-dbg-declare=0 to match what Chromium uses. I think these numbers are general, though. Here are the numbers I got from my patch:

160402 codegenprepare - Total number of llvm.dbg.* instructions
296080 codegenprepare - Total number of instructions

So, 54% of all instructions are dbg.(value|declare) instructions today! Moving them out of the main BB ilist would speed up BB iteration by 2x. That’s not necessarily where we spend our time. I happen to already know that this test case does a lot of BB scanning (PR38829), so I know it will help this test case, but it may not matter if we make local dominance not scan.

In any case, here is the time to compile to bitcode for this example:
-O2 -g: 89.6s
-O2: 53.6s
-O2 -gmlt: 52.6s

Not urgent, but before someone dives into this, it might be worth doing more detailed measurements (removing some of the noise, etc.) - such as disabling the debug info emission (and/or just running the O3 pipeline in opt specifically, without debug info generation in Clang or emission in the LLVM backend) - which would give a more accurate measurement of the cost of these intrinsics in the kind of ways you’re suggesting. (Conveniently, this is a bit easier to measure (because you can compare with/without, as you have here) than the bitcast stuff - which is hard to measure without actually removing them.)

-gmlt should be slower than no debug info, so clearly my system is noisy, but I think the impact of dbg.value is still clearly observable.

The next thing I noticed is that dbg.values tend to appear in packs. My theory is that most of this is from multiple levels of inlining of helpers that forward parameters. Here is a histogram of the length of dbg intrinsic runs:

dbg intrinsic run length | freq
 1 | 13086
 2 |  6587
 3 |  3927
 4 |  1605
 5 |  1643
 6 |  1414
 7 |  2165
 8 |  1708
 9 |   226
10 |    36
11 |    26
12 |    14
13 |  1750
14 |    70
15 |   224
16 |     3
17 |   154
18 |     5
19 |     1
20 |     2
22 |   209
23 |   688
24 |   705
28 |     4
29 |     1
36 |     1
40 |     2
71 |     1

My understanding of the histogram above is that 52% of dbg.values are in runs of length 8 or greater. You can see some spikes here at 13 and 22-24, which I think supports my theory about inlining above.

This example is just a standard -O2 large TU in clang. I can only imagine that the dbg.value runs get longer in ThinLTO, where the inliner runs more.

Design

dbg.value is meant to associate an SSA value with a label. My basic idea is that it’s better to attach a label to the instruction it precedes rather than creating a separate instruction. However, a label should stay behind if the instruction it precedes moves. For example, Instruction::moveAfter/moveBefore should implicitly reattach their variable tracking info to the next instruction. While working on https://reviews.llvm.org/D51664, I realized it would be pretty easy to use the ilist callbacks currently used for symbol table updating to implement this.

I think making the attachments describe the values of variables before the instruction is good, because in the general case every block has at least one instruction, which typically doesn’t produce a value: the terminator. The only terminator that produces a value is invoke, and we can attach any variable tracking info for it to the first (non-phi?) instruction in the normal successor.

I think, overall, this is just an internal representation shift. We may want to change our bitcode and assembly language to better align with our internal representation, but it should be functionally equivalent.

IR syntax

This is kind of important, since it affects how developers think about these and keep them up to date. Consider the following C source:

int foo(void);
int bar(void) {
  int v = foo();
  v = foo();
  return v;
}

I’m proposing moving from something like this:

%v1 = call i32 @foo()
call void @llvm.dbg.value(metadata i32 %v1, metadata !123, metadata !DIExpression) !dbg !456
%v2 = call i32 @foo()
call void @llvm.dbg.value(metadata i32 %v2, metadata !123, metadata !DIExpression) !dbg !456
ret i32 %v2

To something like this:

%v1 = call i32 @foo()
!dbgvalue i32 %v1, variable !123, loc !456 [, expr …]
%v2 = call i32 @foo()
!dbgvalue i32 %v2, variable !123, loc !456 [, expr …]
ret i32 %v2

‘expr’ would be an optional DIExpression, expressed more compactly if possible, perhaps using our inline !DIExpression parsing for now.

This also avoids the confusion that today the dbgloc attached to a dbg.value is not actually used to generate line table entries, it’s only for tracking distinct variables created from different inlined call sites.

Data structures

What we’re really trying to represent is two separate sequences. For simplicity and minimal change, I initially propose that we create a secondary linked list of some new DbgValue data structures. They would relate kind of like this:

inst1 → inst2 → inst3 → inst4
  \       \       \       \
  dv1  →  dv2  →  dv3  →  dv4

If we were to delete inst2, dv2 would be reattached to the following instruction:

inst1 → inst3 → inst4
  \      |  \        \
  dv1 → dv2 → dv3 → dv4

What’s interesting is that once dv2 and dv3 are grouped together, it’s actually a bug if we ever generate code between them. I can imagine more compact representations than a linked list, but we want to be able to efficiently join two sequences of dbgvalues when deleting or moving a single instruction, and linked lists achieve that. The list would be owned by the BasicBlock, since if a whole block is unreachable, there’s nowhere to put the value tracking info.
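Spelled out in the proposed syntax (made-up values; dv1 through dv4 correspond to the diagram above), the grouped state after deleting inst2 looks like this, and no instruction may ever be inserted between the two middle annotations:

  !dbgvalue i32 %x, variable !1, loc !9   ; dv1, attached to the add (inst1)
  %a = add i32 %x, 1
  !dbgvalue i32 %a, variable !2, loc !9   ; dv2, left behind by the deleted inst2
  !dbgvalue i32 %a, variable !3, loc !9   ; dv3, attached to the mul (inst3)
  %b = mul i32 %a, 2
  !dbgvalue i32 %b, variable !4, loc !9   ; dv4, attached to the terminator (inst4)
  ret i32 %b

Because dv2 and dv3 live only in the side list hanging off the mul, a newly inserted instruction lands either before both of them or after the mul, so the invariant holds by construction.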

Actually doing it

LLVM currently has several unfinished migrations going on and a fair amount of technical debt. It’s not clear that this project is the top priority for me or anyone else at this moment, so even if people like this idea, don’t assume it will be implemented soon. I think similar efficiency reasoning applies to the pointee-type removal API migration that David Blaikie started.

Yeah, pretty much - though it’s nice that we can measure the likely impact of this ahead of time, unlike with the bitcast work. Easier to assess the importance, etc.

(well, I guess it’s a bit different - the thing we can measure here isn’t such a concern for pointer bitcasts - but then there’s also the concern that covers both (that it hinders (rather than just slowing down) transformations because they may not correctly ignore them))

Casts are very similar to dbg.values in this way.

However, I wanted to write it all down before letting it fade from memory. Apologies for the length, I didn’t have time to make a shorter RFC. =P

For sure - worth having it down - thanks for that!

At first glance, this example seems to contradict the definition in the Design section. If a !dbgvalue describes the debug info before the instruction, I would have expected the syntax to be like

%v1 = call i32 @foo(), !dbg !1
%v2 = call i32 @foo(), !dbg !2, !dbgvalue i32 %v1, variable !123, loc !456 [, expr …]

So my question is: which instruction is the !dbgvalue attached to, and should we make this more explicit in the syntax?

Well, the syntax was meant to show that the value tracking starts immediately before the instruction it is attached to. It’s an internal implementation detail whether we attach the tracking to the instruction before or after the location where the variable assignment appears; either way, the label will appear in between them. I think, for internal implementation reasons, it’s best to attach the value tracking to the next instruction after the value appears, so the dbg value notionally happens before the instruction it’s attached to. This makes sense because every block has at least one terminator, even if it is just unreachable.
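In other words (same sketch notation as above):

  %v = call i32 @foo()        ; the value is produced here
  !dbgvalue i32 %v, variable !123, loc !456
  %u = add i32 %v, 1          ; the annotation is owned by this instruction and
                              ; notionally takes effect just before it executes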

Maybe I am misunderstanding the proposal, but I would imagine this to be modeled similar to this:

%x = call i32 @foo() !debug-variable X0, !debug-variable Y1
call @bar()
call @foo()

now we have two tables for the X and Y debug variables:

X0 (line xxx-yyy)

Y0 - Constant 42 (line xxx-yyy)
Y1 (line yyy-zzz)

- Matthias

Here are the numbers I got from my patch:

160402 codegenprepare - Total number of llvm.dbg.* instructions
296080 codegenprepare - Total number of instructions

So, 54% of all instructions are dbg.(value|declare) instructions today! Moving them out of the main BB ilist would speed up BB iteration by 2x. That’s not necessarily where we spend our time. I happen to already know that this test case does a lot of BB scanning (PR38829), so I know it will help this test case, but it may not matter if we make local dominance not scan.

Whoa.

dbg.value is meant to associate an SSA value with a label. My basic idea is that it’s better to attach a label to the instruction it precedes rather than creating a separate instruction. However, a label should stay behind if the instruction it precedes moves. For example, Instruction::moveAfter/moveBefore should implicitly reattach their variable tracking info to the next instruction. While working on https://reviews.llvm.org/D51664, I realized it would be pretty easy to use the ilist callbacks currently used for symbol table updating to implement this.

To point out something you already know :-), moveFirst and friends are just helper functions for more general ilist manipulation mechanisms that move chunks of instructions.

-Chris

Here are the numbers I got from my patch:

160402 codegenprepare - Total number of llvm.dbg.* instructions
296080 codegenprepare - Total number of instructions

So, 54% of all instructions are dbg.(value|declare) instructions today! Moving them out of the main BB ilist would speed up BB iteration by 2x. That’s not necessarily where we spend our time. I happen to already know that this test case does a lot of BB scanning (PR38829), so I know it will help this test case, but it may not matter if we make local dominance not scan.

Whoa.

One other random thought: how many of these llvm.dbg.* are directly adjacent to each other? It would be very simple to extend llvm.dbg.value/declare to be a list of declarations. If many of these are next to each other (or are close enough, e.g. simple casts between them) then you could keep the per-instruction design and get the vast majority of the win. This would be super incremental.
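For example, a batched form might look roughly like this (hypothetical intrinsic name and shape, shown only to sketch the idea):

  ; one call describing three variables at once
  call void @llvm.dbg.value.list(metadata i32 %a, metadata !10, metadata !DIExpression(),
                                 metadata i32 %b, metadata !11, metadata !DIExpression(),
                                 metadata i32 %c, metadata !12, metadata !DIExpression())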

-Chris

Yep, the stat I came up with to quantify that was that 52% of dbg.values come in runs of length 8 or greater for this test case. The “variadic” dbg.value idea is good, and it’s come up at past BoFs and socials. I think it captures 90% of the efficiency gains in this proposal, but doesn’t eliminate the bug class of “-g inserts dbg.values that change -O2 codegen”.

One other random thought: how many of these llvm.dbg.* are directly adjacent to each other? It would be very simple to extend llvm.dbg.value/declare to be a list of declarations. If many of these are next to each other (or are close enough, e.g. simple casts between them) then you could keep the per-instruction design and get the vast majority of the win. This would be super incremental.

Yep, the stat I came up with to quantify that was that 52% of dbg.values come in runs of length 8 or greater for this test case. The “variadic” dbg.value idea is good, and it’s come up at past BoFs and socials. I think it captures 90% of the efficiency gains in this proposal,

Makes sense, it would be great if someone could prototype it and see how much it helps in practice. It seems like it would be a huge (and low hanging!) win.

but doesn’t eliminate the bug class of “-g inserts dbg.values that change -O2 codegen”.

Unless you’re planning to change all of the debug intrinsics, I don’t think your proposal addresses this either.

-Chris

The variadic llvm.dbg.value idea would basically eliminate the compile-time-performance motivation to solve it the right way. But if those same lists would ultimately be attached to real instructions, or labels, it’s a step in the right direction.

(I’ve been grousing about debug-info instructions for years. llvm.dbg.declare can be eliminated by adding one parameter to alloca. It’s lovely to see someone who understands IR way better than I do, trying to solve this.)
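For instance (hypothetical attachment name, purely to illustrate the shape of the idea), the variable reference could ride on the alloca itself:

  %x = alloca i32, align 4, !dbg.variable !123
  ; rather than a separate: call void @llvm.dbg.declare(metadata i32* %x, metadata !123, metadata !DIExpression())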

–paulr

Hi Reid,

when you were looking at “runs” of dbg.value instructions, were you looking just for a series of calls, or at the operands? I seem to remember seeing long runs with identical operands, which of course would be completely redundant. IIRC this was especially bad when it came to a global init function inlining boodles of the same ctor for a long string of globals with the same type. LLVM tends not to do that but application code might well do that.

Thanks,

–paulr