At ByteDance, DWP is widely used, and we have encountered many cases of DWP overflow. During debug info merging, the portion that overflows is often non-deterministic. However, ideally we would like to discard only the parts that are not currently needed by the user.
For example, when debugging a crashing program, if the required debug information remains intact in the DWP, debugging proceeds normally. Conversely, if that specific information happens to be the portion discarded due to overflow, the current debugging session may become unusable.
Proposal
Inline debug information is often a major contributor to debug info bloat. We propose adding a Clang option, e.g. -gminimize-inline-scopes=<path>, which would reduce DW_TAG_inlined_subroutine contents under the specified path, similar in spirit to -fsplit-dwarf-inlining. Under that path, only skeletal information (without parameters, variables, etc.) would be retained. This would explicitly shrink debug information that is “currently unneeded” without affecting symbolization, thereby helping to mitigate the non-determinism caused by DWP overflow discards.
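To make the intended effect concrete, here is a toy model of the pruning, not the actual Clang implementation. It assumes that "skeletal" means dropping DW_TAG_formal_parameter and DW_TAG_variable children while keeping nested scopes, and that the path is matched against the file declaring the inlined callee. The `DIE` class, the `minimize` helper, and the example paths are all hypothetical.

```python
from dataclasses import dataclass, field
from pathlib import PurePosixPath
from typing import List

# Child tags treated as non-skeletal in this sketch (an assumption, not a spec).
DROPPED_TAGS = {"DW_TAG_formal_parameter", "DW_TAG_variable"}


@dataclass
class DIE:
    """Toy stand-in for a debug info entry; not a real DWARF library type."""
    tag: str
    name: str = ""
    decl_file: str = ""  # file declaring the callee (only used for inlined scopes)
    children: List["DIE"] = field(default_factory=list)


def under(path: str, prefix: str) -> bool:
    """True if `path` equals `prefix` or lies somewhere below it."""
    p, q = PurePosixPath(path), PurePosixPath(prefix)
    return p == q or q in p.parents


def minimize(die: DIE, prefix: str, strip: bool = False) -> DIE:
    """Return a copy of `die` with inlined scopes under `prefix` reduced to skeletons."""
    if die.tag == "DW_TAG_inlined_subroutine":
        # Each inlined scope is judged by its own callee's declaration file.
        strip = under(die.decl_file, prefix)
    kept = [
        minimize(child, prefix, strip)
        for child in die.children
        if not (strip and child.tag in DROPPED_TAGS)
    ]
    return DIE(die.tag, die.name, die.decl_file, kept)


# Hypothetical input: main() inlines a libstdc++ function, which inlines a user helper.
tree = DIE("DW_TAG_subprogram", "main", "/src/app/main.cpp", [
    DIE("DW_TAG_variable", "v"),
    DIE("DW_TAG_inlined_subroutine", "push_back", "/usr/include/c++/bits/stl_vector.h", [
        DIE("DW_TAG_formal_parameter", "this"),
        DIE("DW_TAG_formal_parameter", "__x"),
        DIE("DW_TAG_inlined_subroutine", "helper", "/src/app/util.h", [
            DIE("DW_TAG_variable", "tmp"),
        ]),
    ]),
])
slim = minimize(tree, "/usr/include/c++")
```

In this model the inlined `push_back` scope loses its two parameters but keeps the nested inlined `helper` scope, so the inline call chain needed for symbolization is still present, and scopes outside the given path are left untouched.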
Effect
We evaluated this on an internal project. When applying the option to the libstdc++ path, the size of .debug_info.dwo was reduced by about 15%.
It is /unreliable/ whether the specific thing that overflowed is the thing you need when debugging, but the behavior of the tools isn't literally non-deterministic, right? (That is, the bits that end up in the dwp are identical for each execution of the dwp-generating tool given the same order of inputs, etc.?)
Beyond that - what sort of overflow are you having - .debug_info section is overflowing 4GB?
I’d really be inclined to encourage/push towards more compact but lossless changes to debug info, but I acknowledge that may not be possible.
Are you already using type units? Have you considered using/implementing the DWARF64 encoding for the .debug_cu_index proposed in DWARF6? See the DWARF issue ".debug_{c,t}u_index missing/incomplete DWARF64 support". (I understand that growing the debug info has other problems/isn't ideal.)
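For context on the 4GB question: in DWARF v5, the .debug_cu_index and .debug_tu_index tables store each unit's section offset and size as 32-bit values, so a contribution that starts past 4 GiB cannot be described. The sketch below only models that field width. The function name and the sizes are made up for illustration, and it does not reflect how any particular dwp tool reacts to the overflow.

```python
UINT32_MAX = 0xFFFF_FFFF  # largest value a 32-bit index field can hold
GIB = 1 << 30


def first_unrepresentable(sizes):
    """Return the index of the first contribution whose start offset or size
    does not fit in 32 bits, or None if every contribution fits.

    `sizes` lists per-unit contribution sizes in bytes, in merge order.
    """
    offset = 0
    for i, size in enumerate(sizes):
        if offset > UINT32_MAX or size > UINT32_MAX:
            return i
        offset += size  # the next unit starts where this one ends
    return None


# Same three hypothetical units, merged in two different orders.
big_first = first_unrepresentable([3 * GIB, 2 * GIB, 1 * GIB])
small_first = first_unrepresentable([1 * GIB, 2 * GIB, 3 * GIB])
```

With the large unit first, the third unit would start at 5 GiB and cannot be indexed. With the small unit first, every start offset stays below 4 GiB. Under this model the result is fully determined by the input order, yet which unit is affected has nothing to do with which one a given debugging session needs.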
A workaround for cu_index overflow was added to trunk LLVM and LLDB a while back. It worked well at Meta.
The next one that hit was .debug_str_offsets, though.