[RFC] Assignment tracking: A better way of specifying variable locations in IR

Forgive me if the answer to this is obvious, but how does this relate to llvm.dbg.addr?

Hi @dmlloyd, that’s a good question. The llvm.dbg.assign intrinsic is similar to llvm.dbg.addr in that both specify a memory location for a variable and both are control-flow dependent. However, unlike llvm.dbg.addr, llvm.dbg.assign uses DIAssignID metadata attachments to link to (point at) the instructions that perform the assignment the intrinsic describes. This gives us more information about the relationship between the unoptimised and optimised program, which we use in the dataflow.

More abstractly, llvm.dbg.addr specifies that the variable is live in memory at that point in the IR, whereas llvm.dbg.assign specifies there was an assignment at that point. Does that help?
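
To make the link concrete, here’s a rough sketch in the spirit of the example from earlier in the RFC (the variable and metadata names are illustrative):

store i32 1, i32* %a, !DIAssignID !1
call void @llvm.dbg.assign(metadata i32 1, !myvar, !DIExpression(), metadata !1, metadata i32* %a)

Here !1 is a distinct !DIAssignID() node: the store carries it as an attachment and the llvm.dbg.assign refers to it through its link operand, which is what ties the two together.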

Thanks Orlando! Your example

call @llvm.dbg.assign(metadata i32 1, !myvar, !DIExpression(), metadata !1, metadata i32* %a)

answers both of my questions. It wasn’t clear to me that llvm.dbg.assign would accept constant values as the value parameter. Since it does, that also explains why it may not be a good idea to attach them to other instructions: there may not be an instruction to attach to, such as in the constant or the undef case.

Out of curiosity, would LLVM IR allow us to use a MetadataFromValue(StoreInst) as the first parameter? Maybe by using an assembler syntax like:

%token = store i32 0, i32* %var
call void @llvm.dbg.assign(metadata store_token %token, ...)

?

Ah I see, it’s starting to click. :slight_smile: Thanks!

Out of curiosity, would LLVM IR allow us to use a MetadataFromValue(StoreInst) as the first parameter? Maybe by using an assembler syntax like:

%token = store i32 0, i32* %var
call void @llvm.dbg.assign(metadata store_token %token, ...)

I did consider attempting to use tokens but decided to use metadata for the prototype since it was less work and more flexible (I wasn’t sure exactly how everything was going to work).

One of my original reasons for not using tokens is that llvm.dbg.assign intrinsics reference more than just store instructions: alloca, store, mem* intrinsics. I’m still not sure whether to consider user calls to the mem* functions as assignments - if we do, then llvm.dbg.assign intrinsics will also need to reference those calls (and those functions return a value). However looking back, I don’t think this stops us using tokens. Memory intrinsics don’t return a value so we could use tokens for those, and allocas (and user mem* function calls) could just be referenced via their value.
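
For illustration, a tagged memcpy intrinsic might look something like this (names are illustrative, and exactly what to use for the value operand of the dbg.assign here is an open detail - I’ve used undef):

call void @llvm.memcpy.p0i8.p0i8.i64(i8* %dst, i8* %src, i64 16, i1 false), !DIAssignID !1
call void @llvm.dbg.assign(metadata i1 undef, !myvar, !DIExpression(), metadata !1, metadata i8* %dst)

Note that the memcpy intrinsic call itself has type void.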

In any case, using the metadata approach instead of tokens has these two benefits:

  1. Using metadata requires fewer code changes (I think). Stores and mem* intrinsics are not currently replaced using RAUW, so in order to keep @llvm.dbg.assigns up to date we’d need to add RAUW calls in many places.

  2. Tokens introduce some admin overhead because an llvm.dbg.assign can refer to multiple instructions (like in the LICM example), which means we need to manage a list of references (possibly using DIArgList).

It may be worth dealing with these issues in order to use tokens for the real implementation. I think using tokens would make the IR more readable, and referencing instructions in that way feels more familiar. I’m not sure that those reasons alone are enough though. What do you think?

First, I don’t think it makes sense to involve token types in this. Token types have weird restrictions on IR movement. Store instructions claim to have type void, and we don’t want to change that:
https://github.com/llvm/llvm-project/blob/6e23cd2bf073ab5eb5a103b935af54046d007c52/llvm/lib/IR/Instructions.cpp#L1546

memcpy intrinsic calls are also values of type void.

We could potentially allow metadata to reference instructions of type void. That could remove the need for metadata IDs on the side. But, we would need to invent textual IR syntax for this, since void values do not receive names or numbers in an LLVM symbol table.

Regarding RAUW, my understanding is that metadata uses are not replaced by regular RAUW operations anyway. We should ask ourselves, what happens when someone calls eraseFromParent on the store to delete it? Can we live with that behavior? If so, maybe we can do it. I think ValueAsMetadata objects are effectively weak pointers, so maybe this could work.

Changing the textual IR of LLVM is pretty disruptive, despite our disclaimer of its stability. Here are some options:

  1. Just number all void instructions with the usual % sigil anyway. Drastically changes output. Potentially less human readable.
  2. Number only void values with live uses from metadata. Less disruptive, but somewhat non-local.
  3. Use some new sigil, put in the RHS, or something like that.

The more I think about it, the more 2 seems reasonable. An instruction must have a number if ValueAsMetadata exists for it. Also, I think these in-memory data structures are closer to what we want: we really just want to refer to the assigning instruction with as little indirection as possible.
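
To make option 2 concrete, a purely hypothetical rendering (this is not syntax LLVM accepts today, and the spelling of the metadata reference to a void value is invented) might be:

; %5 only gets a number because a ValueAsMetadata use of it exists
%5 = store i32 0, i32* %var
call void @llvm.dbg.assign(metadata i32 0, !myvar, !DIExpression(), metadata void %5, metadata i32* %var)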

That makes sense, avoiding tokens sounds sensible. Option 2 certainly looks like a good avenue to explore if we go down this route.

What I was trying to get at regarding RAUW is that metadata attachments on instructions are already preserved by the compiler in various circumstances so little work is required to preserve the link between the llvm.dbg.assign and the store(s). Whereas if llvm.dbg.assign referenced stores directly (numbered void values, tokens, etc.), we’d need to do additional work in the compiler to preserve the relationship between the instruction and the llvm.dbg.assign.

e.g., when a new store is created to replace an existing store, it’s likely that metadata is already being copied to the new store, so the llvm.dbg.assign is linked to it automatically. If the llvm.dbg.assign references the store via an explicit Use then we’d need to add a call to RAUW to preserve the relationship (and we’d have to do this all over the compiler).
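
For instance (illustrative only), if a pass replaces a store and copies metadata across - e.g. via Instruction::copyMetadata - the link survives for free:

; Original store:
store i16 %v, i16* %p, !DIAssignID !1
; Replacement store created by some pass. Because the !DIAssignID attachment was
; copied over, the llvm.dbg.assign that refers to !1 still finds it - no RAUW needed.
store i32 %widened, i32* %p2, !DIAssignID !1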

Maybe it isn’t a big problem? I just think it’s worth noting that the (existing) metadata attachment approach seems to be less invasive in terms of changes to the compiler.

It’s also easier to refer to multiple instructions from one llvm.dbg.assign because multiple instructions can have the same metadata attachment.

What benefits do we get by modelling the relationship via an explicit Use (numbered void values, tokens, etc) rather than the current metadata attachment approach?


Regarding RAUW, my understanding is that metadata uses are not replaced by regular RAUW operations anyway.

AFAIK metadata uses are replaced by RAUW by default (code).

What I was trying to get at regarding RAUW is that metadata attachments on instructions are already preserved by the compiler in various circumstances so little work is required to preserve the link between the llvm.dbg.assign and the store(s).

Right, I see, the transforms to keep in mind here are duplications, such as inlining, unswitching, unrolling, etc. Most of these probably only remap non-void values. That’s pretty compelling.

What benefits do we get by modelling the relationship via an explicit Use (numbered void values, tokens, etc) rather than the current metadata attachment approach?

I think there is a simplicity benefit to avoiding indirection when possible. Consider how LLVM IR SSA uses direct pointers instead of some indirection through a side table.

The other benefit is a minor efficiency gain: we do not need another heap allocation for every tracked source-level assignment.

We also often overlook the fact that metadata allocations live forever and are not garbage collected, whereas IR instructions have clear ownership. That’s already its own problem though, and it may deserve its own solution.


In the end, I think you should go with your plan. Adding metadata uses of void values may be premature optimization. It seems better to make forward progress first and optimize later.

But, we really do need to come back and optimize the memory usage of the compiler’s debug info representation. We’ve done a lot of work on it, but I think there’s still room for improvement.

I think what I’m saying is that, we should dig into this more in the future, and try to come up with some principles to help us decide when debug info is inaccurate (allows the user to see a program state not possible in the source) but still acceptable.

I just wanted to let you know that I haven’t forgotten this. I’m in the process of gathering data to help us answer this question more empirically. Unfortunately I haven’t finished putting everything together and I’m going to be out of office next week.

In the mean time, if anyone has any thoughts on the question “What variable values should - or can - we promise to be accurate?”, I’d love to hear them.

Thanks for your patience,
Orlando

Hi @rnk,

Adding metadata uses of void values may be premature optimization. It seems better to make forward progress first and optimize later.

SGTM - I hadn’t considered the performance benefits we’d get by using that alternative approach. Assignment tracking definitely needs looking at from a performance perspective, which I’ve not had the chance to do yet.

To go back to your earlier point:

What kind of assignment would clang produce for these assignments? What local variable are we updating, what goes on the left hand side? DSE fires here today, and the user can observe stale values in the debugger.

I think it’s reasonable to not try to present accurate information in either of the cases you and I have highlighted:

struct Bar { int b; };
Bar *wash(Bar *p) { return p; }
void foo() {
  Bar o;
  wash(&o)->b = 1; // can DSE
  wash(&o)->b = 2;
}

__attribute__((__optnone__)) void step() {}
void useAddress(int *n);
 
void fun() {
  int local = 5;
  int* local_ptr = &local;
 
  step();
 
  *local_ptr = 6; // causes local = 5 to be eliminated
 
  step();
 
  useAddress(&local);
}

Both of these examples have indirect stores to an alloca which backs a variable. These stores don’t get tagged (they have no DIAssignID metadata, and no dbg.assign records their positions). I’m currently of the opinion that the back end should treat these stores as assignments to that local (i.e. we should use the memory location for the variable there).
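
For illustration, the relevant IR for my example looks roughly like this before DSE (heavily simplified; names are illustrative):

%local = alloca i32, !DIAssignID !1
call void @llvm.dbg.assign(metadata i32 undef, !local_var, !DIExpression(), metadata !1, metadata i32* %local)
store i32 5, i32* %local, !DIAssignID !2   ; tracked: linked dbg.assign below
call void @llvm.dbg.assign(metadata i32 5, !local_var, !DIExpression(), metadata !2, metadata i32* %local)
; ...
store i32 6, i32* %ptr   ; indirect store to local's storage: untagged, no linked dbg.assign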

The result is that the locations generated for your example look exactly the same with and without the prototype (DSE is visible). For my example, we track the initial store and acknowledge the second (DSE not visible) - this is an improvement over turning off LowerDbgDeclare (DSE visible), and results in the same values being available (from different locations) as with LowerDbgDeclare turned on.

I’d like to clarify what values we promise to be accurate, and I think our promise should focus on assignments where the LHS is a local variable or a field of one.

I agree and think that promise sounds reasonable. The current design, including what I mentioned above, may cause inaccurate locations when an indirect assignment to a local:

  1. is eliminated - the store wasn’t tracked; there’s no dbg.assign to mark that an assignment happened.
  2. is partially eliminated - the store wasn’t tracked so we aren’t able to “undef” the eliminated part.
  3. is moved - the assignment will be visible from where it occurs in the optimized program (no attempt to match to the unoptimized program position, since the assignment hasn’t been tracked).

I believe the promise gracefully covers these cases, and I cannot currently think of any further situations that result in inaccurate locations.

It’s less clear to me how to incorporate the fact that inlined assignments are tracked, but only from the time of inlining. I.e. some specific indirect assignments are tracked. At the moment I feel that tracking inlined stores like this is useful, but I think we can decide on that later when we’re closer to a shippable implementation and have more experimental data.

As an aside, I tried comparing how GCC (12) handles variables in memory to clang with and without assignment tracking, but for the few examples I looked at it appears clang is more aggressive with DSE than GCC so a direct comparison doesn’t tell us much.

I’m now going to start attempting to solve the remaining design limitations.

Hi all,

TL;DR The outstanding design issues have been addressed by bringing us closer to existing dbg.declare behaviours, and I think it might be useful to start uploading patches soon.

Resolved design issues

I’ve come up with a new “rule” that solves a couple of issues. It is “unless we have more information available in the form of dbg.assigns, stores are the source of truth for variable assignments”. This means that, in the absence of linked dbg.assign intrinsics, a store to an alloca that backs a variable is considered to be an assignment to that variable, and thus the memory location is chosen from there. That sounds intuitive, but in reality it’s a pragmatic compromise. It could cause some assignments to be seen “too early” while stepping, if the store has been hoisted for example. To be clear, AFAICT this is not a regression from existing behaviour, and we believe it’s the best thing to do when dbg.assign intrinsics are missing. Ignoring the store could lead to incorrect variable locations, and terminating an existing location there would be overly conservative.

In my previous message I had already come to the conclusion that this was a good thing to do in the analysis pass. I now think that it should be a general rule for the entire pipeline. Ultimately, this brings us closer to dbg.declare behaviour as a safe fallback. When we use dbg.declares, stores are the only source of truth for variable assignments throughout compilation.

Example

Here is a concrete example where this rule helps in the middle end:

int useValue(int);
void useAddress(int *p) { useValue(*p); }
int fun(int a, int b) {
  int classic;
  if (a)
    classic = a;
  else
    classic = b;
  useAddress(&classic);
  return classic;
}
1. InstCombine sinks/merges the stores in if.then and if.else into the common successor.
2. SimplifyCFG eliminates the "empty" blocks which each contain a `dbg.assign` linked to the now sunk store.
3. useAddress is inlined; `classic` is no longer escaped.
4. sroa/mem2reg runs and eliminates `classic`'s alloca.

Before: The intrinsics describing the assignment get deleted because they’re in otherwise empty blocks that are removed, resulting in no variable locations for classic. If classic were initialised in its declaration then we’d only have a variable location for that assignment and it would incorrectly cover the entire function.
With the new rule: The intrinsics are deleted but mem2reg can interpret the store and insert a dbg.value (/unlinked dbg.assign) describing the assignment. This is possible because the backing alloca has an associated dbg.assign containing the variable information.
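
For illustration, the IR around the sunk store looks roughly like this by the time mem2reg runs (heavily simplified; names are illustrative):

%classic = alloca i32, !DIAssignID !1
call void @llvm.dbg.assign(metadata i32 undef, !classic_var, !DIExpression(), metadata !1, metadata i32* %classic)
; ...
%merge = phi i32 [ %a, %if.then ], [ %b, %if.else ]
store i32 %merge, i32* %classic   ; sunk store; its linked dbg.assigns were deleted with the empty blocks

With the rule, mem2reg interprets that store as an assignment to classic (using the variable info from the alloca’s dbg.assign) and, when it removes the alloca, inserts something like:

call void @llvm.dbg.value(metadata i32 %merge, !classic_var, !DIExpression())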

Moving forward

I’m working on de-prototyping my existing implementation and the work is not complete. However, I think it could be useful to start uploading patches with this feature hidden behind a flag so that we can start integrating and testing ASAP, and so that the finished product isn’t one massive code review. Once the core of the implementation has landed I’ll be able to continue development upstream.

I imagine the work can be split up into roughly these parts:

1. Documentation
2. Core - Add the intrinsic and metadata, the attachment-to-instruction tracking, DIBuilder changes, etc.
3. Front end - Emit the new intrinsics.
4. Middle end - Make changes to passes to correctly handle the new metadata.
5. Analysis pass - Add the analysis that interprets the metadata to create variable location information for CodeGen.
6. Integrate the analysis results with CodeGen (initially just SelectionDAG, I imagine).

Feedback and opinions most welcome.

Thanks,
Orlando


Saying a result is visible “too early” implies that we’d rather pretend the assignment hadn’t happened yet. In my view, the debug info should describe the machine code as it is, to maximize its truthfulness, and let the debugger and/or user contend with the messiness of the mapping from object to source as best they can. This is a broader discussion, though, and more appropriate to have in its own thread.

In short, I think looking at stores to/through an alloca is exactly the right thing to do.

I remember looking at how dbg.declare works once, and thinking, if we added one metadata parameter to alloca we could eliminate dbg.declare. In which case, those stores would be all we’d have, and all we’d need. (This message brought to you by EDICT, Eliminate Debug Instructions in Code, Totally. :slight_smile:)
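
Purely hypothetically, and with invented syntax for the attachment, that might look like:

; Today:
%x = alloca i32
call void @llvm.dbg.declare(metadata i32* %x, !x_var, !DIExpression())

; With the variable info attached directly to the alloca instead (hypothetical):
%x = alloca i32, !dbg.variable !x_var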

What happens if a variable changes location over its lifetime within a function? (This can theoretically happen with Java for example where the lifetime of a local variable might not correspond to anything that a normal human would call a “lexical scope”…)

Hi @pogo59,

Saying a result is visible “too early” implies that we’d rather pretend the assignment hadn’t happened yet.

That is how I feel currently - that view is mostly dependent on the stepping behaviour around the hoisted assignment. I’d definitely be interested in discussing this further though; as ever, I don’t think there’s a clear “correct” approach - perhaps striving for “least confusing” would be a good goal.

In my view, the debug info should describe the machine code as it is, to maximize its truthfulness, and let the debugger and/or user contend with the messiness of the mapping from object to source as best they can. This is a broader discussion, though, and more appropriate to have in its own thread.

SGTM to discuss separately or later - the behaviour is something we should be able to easily change in the “analysis pass”.

In short, I think looking at stores to/through an alloca is exactly the right thing to do.

Great!

Eliminate Debug Instructions in Code

One day :slight_smile:


Hi @dmlloyd,

What happens if a variable changes location over its lifetime within a function? (This can theoretically happen with Java for example where the lifetime of a local variable might not correspond to anything that a normal human would call a “lexical scope”…)

To be sure that I understand the question, are you asking how (/if) we would handle variables whose memory locations / backing storage change over their lifetime?

This is certainly not something I’ve tested, and I’d bet that it would trip up the prototype in several places. I don’t think there’s anything in our “design” that would prevent this from working properly though, with a bit of extra care and bookkeeping. Do you have any small source or IR (or preferably both) examples you can share?


Thanks both for your comments.

Hi all, I’ve added a patch set starting at ⚙ D132220 [Assignment Tracking][1/*] Add initial docs for Assignment Tracking which adds the “front end-ish and core llvm” parts of Assignment Tracking - the middle end and back end parts will follow soon.

I would appreciate any reviews and thoughts on the patches. There are still a few rough edges but I’m hoping to get feedback on the design and implementation from the upstream community, so that we can work through any issues together.

[Assignment Tracking][1/*] Add initial docs for Assignment Tracking
[Assignment Tracking][2/*] Add flags to enable Assignment Tracking
[Assignment Tracking][3/*] Add DIAssignID metadata boilerplate
[Assignment Tracking][4/*] Add llvm.dbg.assign intrinsic boilerplate
[Assignment Tracking][5/*] Add core infrastructure for instruction reference
[Assignment Tracking][6/*] Add trackAssignments function
[Assignment Tracking][7/*] Add assignment tracking functionality to clang

Patches 5 and 6 are the more interesting ones to look at.

Kind regards,
Orlando


Oh, and the final patch (⚙ D132226 [Assignment Tracking][7/*] Add assignment tracking functionality to clang) contains a C++ to IR test which may be useful to look at to get an idea of what this looks like in practice.

sitrep: patches numbered 8 through to 24 in the same stack are the middle-end changes required for assignment tracking.

There are a few patches with trivial changes in that range, e.g. D133311.

The following are probably the more interesting ones to look at:

⚙ D133291 [Assignment Tracking][8/*] Add DIAssignID merging utilities,
⚙ D133292 [Assignment Tracking][9/*] Don't drop DIAssignID in dropUnknownNonDebugMetadata,
⚙ D133295 [Assignment Tracking][12/*] Account for assignment tracking in mem2reg,
⚙ D133296 [Assignment Tracking][13/*] Account for assignment tracking in SROA,
⚙ D133310 [Assignment Tracking][15/*] Account for assignment tracking in simplifycfg,
⚙ D133318 [Assignment Tracking][21/*] Account for assignment tracking in inliner,

As mentioned in ⚙ D133321 [Assignment Tracking][24/*] Always RemoveRedundantDbgInstrs in instcombine in assignment tracking builds, when building CTMark’s tramp3d-v4 (with -emit-llvm and RelWithDebInfo flags) with this patch stack the peak memory consumption increases by just under 1.5% when assignment tracking is enabled, and the run time difference (of the compiler) is negligible.


The next and final patch set will be lowering assignment tracking metadata to standard MIR debug instructions.

I’ve omitted a few patches that are redundant if we can solve the issue I mention here: Auto-undef debug uses of a deleted value

sitrep: the final patches to make Things Work are now on phabricator. The ones that are not just NFC / refactoring in preparation are:

https://reviews.llvm.org/D136320

https://reviews.llvm.org/D136321

https://reviews.llvm.org/D136325

https://reviews.llvm.org/D136331

https://reviews.llvm.org/D136335


Next steps

I plan to start landing the NFC and refactoring patches, followed by slowly landing the accepted assignment tracking patches (assignment tracking is disabled by default and must explicitly be enabled using -Xclang -fexperimental-assignment-tracking).

There’s still work to do (see TODO list in D132220) and likely some ironing out of bugs.

@OCHyams, thanks for all your work on this so far! :slightly_smiling_face: Are those final patches ready for review? There aren’t any reviewers set, so I wasn’t sure…

Hi @jryans, yes, they are ready for review - thanks for checking. I wasn’t sure who to add to review these - any and all comments would be very welcome!