This RFC proposes an approach to fix the emission of function-local entities and to implement support for static variables and types scoped in a lexical (bracketed) block. I made several attempts to address the problem [0][1] before arriving at the proposed solution, so I have included the history of that discussion in the RFC to keep all the information relevant to the proposal in one place.
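For readers less familiar with the problem, here is a small C++ function containing the kinds of declarations this RFC is about: an imported declaration, a type and a static variable that are all scoped in a lexical block rather than directly in the function. It is only an illustration; the names (units, next_id, Seed, resets) are made up and do not come from the patches or their tests.

```cpp
namespace units {
inline int scale(int x) { return x * 100; }
} // namespace units

int next_id(bool reset) {
  if (reset) {
    // Everything declared here belongs to this lexical (bracketed) block,
    // not to the function scope as a whole.
    using units::scale;          // imported declaration local to the block
    struct Seed { int value; };  // type local to the block
    static int resets = 0;       // static local (a global in LLVM IR terms)
    ++resets;
    Seed s{scale(resets)};
    return s.value;
  }
  static int counter = 0;        // static local at function scope
  return ++counter;
}
```

In the debug info, one would expect the block-scoped declarations to be described as children of the lexical block they are declared in, while counter belongs to the function itself.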
The approach
1. This changes debug metadata: all local entities are now tracked by DISubprogram’s retainedNodes field. Previously they were tracked in different fields of DICompileUnit. DICompileUnit can no longer track any local entity in its fields, with the exception of retainedTypes, which has to persist because local scopes can be fully optimized out. This change allows removing several FIXMEs in the code. It also unifies and simplifies retrieving information about the local entities scoped in a particular DISubprogram.

2. DWARF emission for all (not just function-local) global variables, types and imported declarations is moved from DwarfDebug::beginModule() to DwarfDebug::endModule(). For function-local entities we have to do this to place them into the proper subprogram tree (abstract or concrete): only by DwarfDebug::endModule() is it resolved whether a particular subprogram has an abstract sub-tree. Handling of non-function-local (i.e. global) entities is moved to endModule() mostly for the sake of unification (to handle all the entities in a single place). Global imported entities, however, also depend on all subprograms being created, since they can refer to a function.
   Besides unification and enabling function-local debug entities emission, (2) addresses a design issue of the current approach: passes that run after DwarfDebug::beginModule() could change the IR, making the emitted debug info incorrect (for instance, this was the case for NVPTX generic-to-nvvm lowering).
   By itself, (2) is an NFCI-like change; it affects only the order of emitted debug entities.

3. To emit function-local entities, DwarfDebug::endFunctionImpl() parses DISubprogram’s retained nodes and creates a mapping between a lexical scope and the function-local entities it contains. Later, while visiting scopes, this mapping is used to collect the function-local nodes each CU must emit. Physically they are SmallVectors of nodes in every CU object. When emitting in DwarfDebug::endModule(), the same algorithm is used for both function-local and global entities, and the previously collected lists are used as an additional source of entities to emit.

4. DwarfCompileUnit::getOrCreateContext() can now handle DILexicalBlock scopes to emit entities scoped in a lexical block:
   - it chooses an abstract subprogram or lexical block DIE if available (assuming all abstract entities get created by the time we emit function-local entities),
   - it chooses an existing concrete out-of-line subprogram or lexical block DIE if there is no abstract counterpart and if available,
   - it falls back to the closest existing parent DIE if there is no DIE corresponding to the given local scope.
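The scope-to-entities mapping and the parent selection order described above can be sketched with a toy model. This is not the code from the patches and it does not use any LLVM classes: Scope, LocalEntity, Die, ScopeDies, groupByScope and chooseParentDie are hypothetical names invented for the example. It also assumes that the fallback to a parent scope applies the same abstract-before-concrete preference, which the description above does not spell out.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-ins for illustration only; these are not LLVM classes.
struct Scope {
  std::string name;
  const Scope *parent; // enclosing scope, nullptr for the subprogram itself
};

struct LocalEntity {
  std::string name;
  const Scope *scope; // the lexical scope the entity is declared in
};

struct Die {
  std::string label;
};

struct ScopeDies {
  std::map<const Scope *, Die> abstractDies; // abstract subprogram/block DIEs
  std::map<const Scope *, Die> concreteDies; // concrete out-of-line DIEs
};

using EntitiesByScope = std::map<const Scope *, std::vector<std::string>>;

// Group a subprogram's retained nodes by the scope that declares them.
EntitiesByScope groupByScope(const std::vector<LocalEntity> &retainedNodes) {
  EntitiesByScope result;
  for (const LocalEntity &entity : retainedNodes)
    result[entity.scope].push_back(entity.name);
  return result;
}

// Pick the parent DIE for an entity declared in `scope`: prefer an abstract
// DIE, then a concrete one, then retry with the enclosing scope.
// Assumption: parents are searched with the same preference order.
const Die *chooseParentDie(const Scope *scope, const ScopeDies &dies) {
  assert(scope && "a local entity is always declared in some scope");
  for (const Scope *s = scope; s != nullptr; s = s->parent) {
    auto abstractIt = dies.abstractDies.find(s);
    if (abstractIt != dies.abstractDies.end())
      return &abstractIt->second;
    auto concreteIt = dies.concreteDies.find(s);
    if (concreteIt != dies.concreteDies.end())
      return &concreteIt->second;
  }
  return nullptr; // no DIE exists for the scope or any of its parents
}
```

With a function scope, an outer block and an inner block where only the function and the outer block have DIEs, an entity declared in the inner block ends up under the outer block's DIE, since that is its closest existing parent.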
The patchset
I have put 7 patches up for review, one of which is review-only and should be squashed with the subsequent one before commit. The patches are:

1. https://reviews.llvm.org/D143984
   NFC. Simplify DISubprogram’s retainedNodes to extend them for the purpose of tracking other function-local declarations.

2. https://reviews.llvm.org/D143985
   Review only. Move imported entities emission to DwarfDebug::endModule(). I had to extract it from the next patch because it significantly affects tests (more than 6k lines), which would make the next patch hard to review. The two cannot be committed separately, though, because this patch is not complete on its own and causes issues with tests.

3. https://reviews.llvm.org/D144004
   Does most of the work of changing debug metadata and fixes function-local imported entities emission. Fixes [2].

4. https://reviews.llvm.org/D144005
   Move types emission to DwarfDebug::endModule().

5. https://reviews.llvm.org/D144006
   Implement support for local types scoped in a lexical block.

6. https://reviews.llvm.org/D144007
   Move global variables emission to DwarfDebug::endModule().

7. https://reviews.llvm.org/D144008
   Implement support for local static variables (globals in terms of LLVM IR) scoped in a lexical block. Fixes [3][4].
Patches 2, 4 and 6 are split from patches 3, 5 and 7 so that test changes caused by debug info reordering are kept separate from test changes caused by functional changes.
Discussion
Adding ‘retainedNodes’ to the lexical blocks that enclose the entities (not just to a subprogram) was discussed in [1] as an improvement for the patch set. It would make the implementation simpler and cleaner and would allow skipping/deleting scopes/entities that were optimized away. The first showstopper I encountered for this idea is a local type (I haven’t thoroughly investigated whether there are others). If the enclosing lexical scope is optimized out, the information about the type may be lost despite there being a reference to it, or it may be emitted incorrectly (in the wrong parent entity, in the wrong CU, etc.). My attempts to work around that issue ended up with a 23% compilation time increase on CTMark; the alternative price is a larger memory footprint, which doesn’t look good to me either.
Historical perspective
The work on the patches started about a year ago, but previous attempts didn’t account for all the issues and introduced problems with split-dwarf [0][5]. Finally, before coming up with the current set of patches, I posted [1]. Those patches in essence implemented the same approach I describe in this RFC, but they were organized differently and were harder to review. The current patch set is a more elaborate and better organized piece of work that addresses the previous concerns.
Conclusion
This RFC is actually a call for action. Comments on the design are appreciated, but if it looks OK, I encourage readers to help by reviewing the patches.
[0] D113741 [RFC][DwarfDebug][AsmPrinter] Support emitting function-local declaration for a lexical block
[1] D125693 [DebugInfo] Support types, imports and static locals declared in a lexical block (3/5)
[2] Debug info for imported declarations sometimes reference empty DIEs · Issue #51501 · llvm/llvm-project · GitHub
[3] Debug info not generated correctly for function static variables · Issue #19612 · llvm/llvm-project · GitHub
[4] Orphaned DWARF for static local of inlined func · Issue #29985 · llvm/llvm-project · GitHub
[5] D109703 [DebugInfo] Fix scope for local static variables