Hi all,
TL;DR: The outstanding design issues have been addressed by bringing us closer to existing `dbg.declare` behaviour, and I think it might be useful to start uploading patches soon.
Resolved design issues
I’ve come up with a new “rule” that solves a couple of issues: unless we have more information available in the form of `dbg.assign`s, stores are the source of truth for variable assignments. This means that, in the absence of linked `dbg.assign` intrinsics, a store to an `alloca` that backs a variable is considered an assignment to that variable, and the memory location is chosen from that point. That sounds intuitive, but in reality it’s a pragmatic compromise: it could cause some assignments to be seen “too early” while stepping, for example if the store has been hoisted. To be clear, AFAICT this is not a regression from existing behaviour, and we believe it’s the best thing to do when `dbg.assign` intrinsics are missing: ignoring the store could lead to incorrect variable locations, while terminating an existing location there would be overly conservative.
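As a minimal sketch of the situation the rule covers (names are made up and the IR is only a fragment):

```llvm
  %x.addr = alloca i32
  ; %x.addr is known to back the source variable "x" via its debug metadata,
  ; but this store has no dbg.assign intrinsic linked to it.
  store i32 %v, i32* %x.addr
  ; New rule: treat the store itself as the assignment, so from this point the
  ; variable "x" is given the memory location %x.addr.
```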
In my previous message I had already come to the conclusion that this was a good thing to do in the analysis pass; I now think it should be a general rule for the entire pipeline. Ultimately, this brings us closer to `dbg.declare` behaviour as a safe fallback: when we use `dbg.declare`s, stores are the only source of truth for variable assignments throughout compilation.
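For comparison, this is roughly what the `dbg.declare` fallback looks like today (using the `classic` variable from the example in the next section; metadata numbers are made up): a single intrinsic associates the variable with its `alloca`, and every store to that storage is implicitly an assignment.

```llvm
  %classic = alloca i32
  call void @llvm.dbg.declare(metadata i32* %classic, metadata !10, metadata !DIExpression()), !dbg !13
  ; ...
if.then:
  store i32 %a, i32* %classic      ; implicitly "classic = a" as far as debug info is concerned
  ; ...
if.else:
  store i32 %b, i32* %classic      ; implicitly "classic = b"
```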
Example
Here is a concrete example where this rule helps in the middle end:
```c
int useValue(int);
void useAddress(int *p) { useValue(*p); }

int fun(int a, int b) {
  int classic;
  if (a)
    classic = a;
  else
    classic = b;
  useAddress(&classic);
  return classic;
}
```
1. InstCombine sinks/merges the stores in `if.then` and `if.else` into the common successor.
2. SimplifyCFG eliminates the "empty" blocks, each of which contains a `dbg.assign` linked to the now-sunk store.
3. `useAddress` is inlined; `classic` no longer escapes.
4. SROA/mem2reg runs and eliminates `classic`'s `alloca` (the IR just before this step is sketched below).
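To illustrate, just before step 4 the IR for `classic` might look roughly like this. The `dbg.assign` operand order, the metadata numbers, and the `select` are all schematic rather than a final design:

```llvm
entry:
  %classic = alloca i32
  ; The dbg.assign associated with the alloca still carries the variable info
  ; for "classic" (operand encoding here is schematic).
  call void @llvm.dbg.assign(metadata i1 undef, metadata !10, metadata !DIExpression(), metadata !14, metadata i32* %classic, metadata !DIExpression()), !dbg !13
  ; ...
  ; The two stores have been merged into one. The dbg.assigns that described
  ; "classic = a" and "classic = b" lived in the deleted blocks, so this store
  ; no longer has any linked dbg.assign.
  %merge = select i1 %tobool, i32 %a, i32 %b
  store i32 %merge, i32* %classic
```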
Before: The intrinsics describing the assignments get deleted because they’re in otherwise empty blocks that are removed, resulting in no variable locations for `classic`. If `classic` were initialised in its declaration then we’d only have a variable location for that assignment, and it would incorrectly cover the entire function.
With the new rule: The intrinsics are still deleted, but mem2reg can interpret the store and insert a `dbg.value` (or unlinked `dbg.assign`) describing the assignment. This is possible because the backing `alloca` has an associated `dbg.assign` containing the variable information.
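So after mem2reg the values of `classic` end up described something like this (again just a sketch; the register names and metadata numbers are made up):

```llvm
  %classic.0 = select i1 %tobool, i32 %a, i32 %b
  ; Inserted by mem2reg in place of the removed store; the variable info is
  ; taken from the dbg.assign that was associated with the now-deleted alloca.
  call void @llvm.dbg.value(metadata i32 %classic.0, metadata !10, metadata !DIExpression()), !dbg !13
  %call = call i32 @useValue(i32 %classic.0)   ; inlined body of useAddress
  ret i32 %classic.0
```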
Moving forward
I’m working on de-prototyping my existing implementation and the work is not complete. However, I think it could be useful to start uploading patches with this feature hidden behind a flag so that we can start integrating and testing ASAP, and so that the finished product isn’t one massive code review. Once the core of the implementation has landed I’ll be able to continue development upstream.
I imagine the work can be split up into roughly these parts:
1. Documentation
2. Core - Add the intrinsic and metadata, the attachment-to-instruction tracking, DIBuilder changes, etc.
3. Front end - Emit the new intrinsics.
4. Middle end - Make changes to passes to correctly handle the new metadata.
5. Analysis pass - Add the analysis that interprets the metadata to create variable location information for CodeGen.
6. CodeGen - Integrate the analysis results (initially just SelectionDAG, I imagine).
Feedback and opinions most welcome.
Thanks,
Orlando