[RFC] [DebugInfo] Selectively generate template parameters in the skeleton CU

TL;DR: Only generate template parameters in skeleton units for template functions/types whose names have been simplified. Please let me know if you see any issues with this approach.

Background

In an internal project using -fsplit-dwarf-inlining and -fdebug-info-for-profiling, we observed that skeleton units contained a large number of template parameters. These parameters exist so that symbolizers can reconstruct names simplified by -gsimple-template-names, which was not enabled in this project, so here they were entirely redundant.

The issue stems from LLVM’s current approach: when generating debug info for skeleton units, it unconditionally emits all template parameters. To reduce debug info size, emission should be conditional, providing parameters only for template types/functions whose names have actually been simplified.

Proposal

Introduce a metadata flag for -gsimple-template-names that marks the DISubprograms/DICompositeTypes whose names have been simplified. The backend can then generate template parameters in skeleton CUs only for the marked entities, which reduces debug info size.
I implemented the proposal in this draft PR: "[DebugInfo] Introduce metadata flag for -gsimple-template-names and optimize template params in skeleton CU." (llvm/llvm-project Pull Request #174904, by Sockke)
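
To make the intended behavior concrete, here is a toy Python model of the selective emission described above. It is only a sketch, not LLVM code: `Entity`, `name_simplified`, and `skeleton_template_params` are hypothetical names invented for this illustration and do not correspond to the actual flag or functions in the PR.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Entity:
    """Toy stand-in for a DISubprogram/DICompositeType (hypothetical, not an LLVM API)."""
    name: str
    template_params: List[str] = field(default_factory=list)
    # Hypothetical flag: True only if the frontend simplified this name
    # (i.e. stripped the template arguments from it).
    name_simplified: bool = False


def skeleton_template_params(entity: Entity) -> List[str]:
    """Return the template parameters to emit for this entity in the skeleton CU.

    Parameters are kept only when the name was simplified, because only then
    does a symbolizer need them to rebuild the full name.
    """
    if entity.name_simplified:
        return list(entity.template_params)
    return []


# Name still carries its template arguments: parameters are redundant.
full = Entity("foo<int, char>", ["int", "char"], name_simplified=False)
# Name was simplified to "foo": parameters are needed to reconstruct it.
simple = Entity("foo", ["int", "char"], name_simplified=True)

full_params = skeleton_template_params(full)
simple_params = skeleton_template_params(simple)
```

In this model, a build without -gsimple-template-names would have no marked entities, so no template parameters would be emitted in the skeleton CU at all.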

Effect

I conducted tests in an internal project based on LLVM 16. The build was configured with options like -fsplit-dwarf-inlining, -fdebug-info-for-profiling and -O3, but without -gsimple-template-names.

In LTO builds:

| Binary | File Size Reduction | % Reduction |
|---|---|---|
| .debug_info | -441Mi | -29.1% |
| .debug_str | -204Mi | -26.9% |
| total | -651Mi | -10.4% |

In non-LTO builds:

| Binary | File Size Reduction | % Reduction |
|---|---|---|
| .debug_info | -123Mi | -3.8% |
| .debug_str | -40.8Mi | -6.1% |
| total | -172Mi | -1.7% |

I also measured the effect with -gsimple-template-names enabled.
In LTO builds:

| Binary | File Size Reduction | % Reduction |
|---|---|---|
| .debug_info | -59.5Mi | -3.9% |
| .debug_str | -15.2Mi | -2.4% |
| total | -75Mi | -1.3% |

In non-LTO builds:

| Binary | File Size Reduction | % Reduction |
|---|---|---|
| .debug_info | -14.8Mi | -0.4% |
| .debug_str | -2.46Mi | -0.4% |
| total | -18.5Mi | -0.2% |

cc @dblaikie @ayermolo

cc @Michael137

Seems pretty fine to me - but certainly would love to hear from other folks.

Also don’t see an issue with this. I’m not too familiar with LLDB’s split-dwarf support but presumably it would use mostly debug-info from the non-skeleton units? The LLDB test-suite runs with split-dwarf on the Linux pre-commit CI. So we should see if something is badly broken by this.

The -debug-forward-template-params CC1 option is enabled for SCE tuning by default. So you might want to check with Sony (CC @jmorse @pogo59) whether stripping the template parameters in split-dwarf works for their debugger.

Re. metadata flag: I’ve been wanting a DWARF extension that tells us whether a CU was compiled with simplified template names, because LLDB currently does some guesswork to decide how it should look up types. So the metadata flag proposed here would give us an option to do that at some point (though I’m not sure how useful it’d ultimately be, because we’d need to support binaries generated before its introduction).

Ah, I see. Seems fine.

These are the patches that have landed:
(1/3): #175130
(2/3): #175708
(3/3): #175879

Thanks for everyone’s attention and review.
