This RFC proposes an approach to fixing the emission of function-local entities and to implementing support for static variables and types scoped in a lexical (bracketed) block. I made several attempts to address the problem before arriving at the proposed solution, so I have included the history of that discussion in the RFC to keep all the relevant information in one place.
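For readers less familiar with the terminology, here is a small piece of ordinary C++ that contains both kinds of entity mentioned above. The names `next_id` and `Counter` are made up for this illustration and do not come from the patches. Both the type and the static variable are declared inside the braces of the `if` body, so in DWARF terms they would naturally belong under the lexical block for that body rather than directly under the subprogram.

```cpp
// Illustrative only: next_id and Counter are hypothetical names.
int next_id(bool fresh) {
  if (fresh) {
    // A type declared inside a lexical (bracketed) block.
    struct Counter {
      int value;
    };
    // A static local in the same block. It is a global variable in terms
    // of LLVM IR, but it is only visible between these braces.
    static Counter counter = {0};
    return ++counter.value;
  }
  // Neither Counter nor counter is visible here.
  return -1;
}
```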
1. This changes debug metadata: all local entities are now tracked by the retainedNodes field. Previously they were in different fields of DICompileUnit. DICompileUnit can no longer track any local entity in its fields, with the exception of retainedTypes, which has to persist because local scopes can be fully optimized out. This change allows removing several FIXMEs in the code. It also unifies and simplifies retrieving information about local entities scoped in a particular
2. DWARF emission for all (not just function-local) global variables, types and imported declarations is moved from DwarfDebug::beginModule() to DwarfDebug::endModule(). For function-local entities we have to do this to place them into the proper subprogram tree (abstract or concrete): only by DwarfDebug::endModule() is it resolved whether a particular subprogram has an abstract sub-tree. Handling of non-function-local (i.e. global) entities is moved to endModule() mostly for the sake of unification (to handle all the entities in a single place). Global imported entities, however, also depend on all subprograms having been created, since they can refer to a function.
Besides unification and enabling emission of function-local debug entities, (2) addresses a design issue of the current approach: passes that run after DwarfDebug::beginModule() could change the IR, making the emitted debug info incorrect (for instance, this was the case for NVPTX generic-to-nvvm lowering).
(2) by itself is an NFCI-like change; it affects only the order of emitted debug entities.
3. To emit function-local entities, the implementation walks DISubprogram's retained nodes and creates a mapping between a lexical scope and the function-local entities it contains. Later, while visiting scopes, this mapping is used to collect the function-local nodes each CU must emit. Physically they are SmallVectors of nodes in every CU object. When emitting in DwarfDebug::endModule(), the same algorithm is used for both function-local and global entities, and the previously collected lists are used as an additional source of entities to emit.
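The two steps described above (group retained nodes by their lexical scope, then collect the nodes of each visited scope into a per-CU list) can be sketched with a toy model. This is not the code from the patches: `ScopeNode`, `LocalEntity`, `groupByScope` and `collectForScope` are hypothetical names, and standard containers stand in for LLVM's SmallVector and metadata classes.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-ins for debug metadata nodes; not LLVM classes.
struct ScopeNode {
  std::string name;
  const ScopeNode *parent;
};
struct LocalEntity {
  std::string name;
  const ScopeNode *scope; // the lexical scope that declares the entity
};

using ScopeMap = std::map<const ScopeNode *, std::vector<const LocalEntity *>>;

// Step 1: group a subprogram's retained nodes by the scope that owns them.
ScopeMap groupByScope(const std::vector<LocalEntity> &retainedNodes) {
  ScopeMap byScope;
  for (const LocalEntity &entity : retainedNodes)
    byScope[entity.scope].push_back(&entity);
  return byScope;
}

// Step 2: while visiting a scope, append its entities to the list that the
// owning compile unit will emit later. Scopes with no entities add nothing.
void collectForScope(const ScopeMap &byScope, const ScopeNode *scope,
                     std::vector<const LocalEntity *> &cuList) {
  ScopeMap::const_iterator it = byScope.find(scope);
  if (it == byScope.end())
    return;
  cuList.insert(cuList.end(), it->second.begin(), it->second.end());
}
```

In this model a scope that is never visited (for example, because it was optimized out) contributes nothing to the per-CU list, which is one reason the mapping is keyed by scope rather than stored as a flat list.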
DwarfCompileUnit::getOrCreateContext() can now handle DILexicalBlock scopes to emit entities scoped in a lexical block:
- it chooses an abstract subprogram or lexical block DIE if available (assuming all abstract entities get created by the time we emit function-local entities),
- it chooses an existing concrete out-of-line subprogram or lexical block DIE, if available, when there is no abstract counterpart,
- it falls back to the closest existing parent DIE if there is no DIE corresponding to the given local scope.
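The selection order in the list above can be modelled roughly as follows. This is a sketch under stated assumptions, not the actual getOrCreateContext() logic: `LexScope`, `ToyDIE` and `ContextResolver` are hypothetical types, the same abstract-before-concrete preference is assumed when walking up to parent scopes, and falling back to a compile unit DIE when no parent has a DIE is also an assumption.

```cpp
#include <map>

// Hypothetical stand-ins; these are not the LLVM classes.
struct LexScope {
  const LexScope *parent;
};
struct ToyDIE {
  int id;
};

struct ContextResolver {
  std::map<const LexScope *, ToyDIE *> abstractDIEs; // abstract subprogram/block DIEs
  std::map<const LexScope *, ToyDIE *> concreteDIEs; // concrete out-of-line DIEs
  ToyDIE *cuDIE = nullptr; // assumed last resort: the compile unit DIE

  // Pick the parent DIE for an entity declared in scope `s`.
  ToyDIE *resolve(const LexScope *s) const {
    for (; s; s = s->parent) {
      // 1. Prefer the abstract DIE for this scope if one exists.
      std::map<const LexScope *, ToyDIE *>::const_iterator a = abstractDIEs.find(s);
      if (a != abstractDIEs.end())
        return a->second;
      // 2. Otherwise use an existing concrete DIE for this scope.
      std::map<const LexScope *, ToyDIE *>::const_iterator c = concreteDIEs.find(s);
      if (c != concreteDIEs.end())
        return c->second;
      // 3. No DIE for this scope: try the closest parent scope next.
    }
    return cuDIE;
  }
};
```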
I have put 7 patches up for review, one of which is review-only and should be squashed with the subsequent one before commit. The patches are:
1. retainedNodes to extend them for the purpose of tracking other function-local declarations.
2. Review only. Move imported entities emission to DwarfDebug::endModule(). I had to extract it from the next patch because it significantly affects tests (more than 6k lines), which would make the next patch hard to review. It is not possible to commit them separately, though, because this patch is not complete on its own and causes issues with tests.
3. Does most of the work of changing debug metadata and fixes function-local imported entities emission. Fixes .
4. Move types emission to DwarfDebug::endModule().
5. Implement support for local types scoped in a lexical block.
6. Move global variables emission to DwarfDebug::endModule().
7. Implement support for local static variables (globals in terms of LLVM IR) scoped in a lexical block. Fixes [.
Patches 2, 4 and 6 are split from patches 3, 5 and 7 to separate test changes caused by debug info reordering from test changes caused by functional changes.
Adding ‘retainedNodes’ to the lexical blocks enclosing the entities (not just to a subprogram) was discussed in  as an improvement for the patch set. It would make the implementation simpler and cleaner and would allow skipping or deleting scopes and entities that were optimized away. The first show stopper I encountered for this idea is a local type (I haven’t thoroughly investigated whether there are others). If the enclosing lexical scope is optimized out, the information about the type may be lost despite there being a reference to it, or it may be emitted incorrectly (in the wrong parent entity, in the wrong CU, etc). My attempts to work around that issue ended up with a 23% compilation time increase on CTMark; the alternative is to pay with a larger memory footprint, which doesn’t look good to me either.
The work on the patches started about a year ago, but previous attempts didn’t account for all the issues and introduced problems with split-dwarf . Finally, before coming up with the current set of patches I posted . Those patches implemented essentially the same approach I describe in this RFC, but they were organized differently and were harder to review. The current patch set is more thoroughly worked out and better organized, and it addresses the previous concerns.
This RFC is actually a call to action. Comments about the design are appreciated, but if it looks ok, I encourage the reader to assist me by reviewing the patches.
D113741 [RFC][DwarfDebug][AsmPrinter] Support emitting function-local declaration for a lexical block
D125693 [DebugInfo] Support types, imports and static locals declared in a lexical block (3/5)
Debug info for imported declarations sometimes reference empty DIEs · Issue #51501 · llvm/llvm-project
Debug info not generated correctly for function static variables · Issue #19612 · llvm/llvm-project
Orphaned DWARF for static local of inlined func · Issue #29985 · llvm/llvm-project
D109703 [DebugInfo] Fix scope for local static variables