[RFC] - Deduplication of debug information in linkers (LLD).

Hi all!

We have an issue with LLD: “relocation R_X86_64_32 out of range” (PR31109),
which occurs while resolving relocations in debug sections. It seems to happen
because the .debug_info section can sometimes be so large that a 32-bit relocation
cannot represent the value. One possible solution is to deduplicate debug information
to reduce the size of .debug_info.
The rest of this mail describes the experiments I did, the results I obtained, and
some questions and suggestions.
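As a rough illustration of the failure mode, here is a minimal Python sketch of the range check behind such an error. R_X86_64_32 stores a zero-extended 32-bit value; the helper names below are hypothetical and are not LLD code.

```python
# Sketch only: models the range check for a 32-bit zero-extended relocation
# such as R_X86_64_32. Function names are hypothetical, not taken from LLD.
U32_MAX = (1 << 32) - 1


def fits_in_u32(value: int) -> bool:
    """Return True if value can be stored in an unsigned 32-bit field."""
    return 0 <= value <= U32_MAX


def apply_reloc32(symbol_value: int, addend: int) -> int:
    """Compute S + A and fail loudly instead of silently truncating."""
    result = symbol_value + addend
    if not fits_in_u32(result):
        raise OverflowError(f"relocation R_X86_64_32 out of range: {result:#x}")
    return result
```

An offset just past 4 GiB, for example `apply_reloc32(1 << 32, 0)`, raises instead of wrapping around.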

I investigated the idea of deduplicating debug type information. The idea is described on
p276 of the DWARF4 specification (http://www.dwarfstd.org/doc/DWARF4.pdf). It suggests
splitting type information out of .debug_info and emitting multiple .debug_types sections
in COMDAT groups. Both clang and the gcc I tested implement the -fdebug-types-section flag for that:

-fdebug-types-section, -fno-debug-types-section
Place debug types in their own section (ELF Only)
gcc’s description is here: Using the GNU Compiler Collection (GCC): Debugging Options.
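
To make the COMDAT idea concrete, here is a minimal Python sketch of how a linker could keep one copy of each type unit section. It assumes each section is keyed by its 64-bit type signature; the `TypeUnitSection` record and the function name are hypothetical, and a real linker works on section groups rather than on such tuples.

```python
from typing import Iterable, List, NamedTuple, Set


class TypeUnitSection(NamedTuple):
    signature: int  # 64-bit type signature, assumed to be the COMDAT group key
    origin: str     # object file the section came from (illustrative only)
    data: bytes     # raw section contents


def dedup_comdat_groups(sections: Iterable[TypeUnitSection]) -> List[TypeUnitSection]:
    """Keep the first section seen for each signature and drop later copies."""
    seen: Set[int] = set()
    kept: List[TypeUnitSection] = []
    for sec in sections:
        if sec.signature in seen:
            continue  # a group with this key was already selected
        seen.add(sec.signature)
        kept.append(sec)
    return kept
```

Note that the linker never looks inside `data` here: deduplication happens purely at the section level, which is why this approach needs no DWARF parsing.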

This flag is disabled by default. I compared clang binaries to see the difference
with and without the linker-side optimisation:

  1. Clang built with -g has a size of 1.7 GB; the .debug_info section is 894.5 MB.
  2. Clang built with -g -fdebug-types-section has a size of 1.0 GB;
    .debug_types is 26.267 MB and .debug_info is 227.7 MB.
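
For reference, the section sizes above can be compared with a few lines of Python. This is only arithmetic on the reported figures, and it assumes that .debug_info plus .debug_types is the right quantity to set against the plain -g .debug_info size.

```python
# Sizes in MB, copied from the measurements above.
debug_info_plain = 894.5   # .debug_info with -g
debug_info_tu = 227.7      # .debug_info with -g -fdebug-types-section
debug_types_tu = 26.267    # .debug_types with -g -fdebug-types-section

combined_tu = debug_info_tu + debug_types_tu
reduction = 1 - combined_tu / debug_info_plain

print(f".debug_info + .debug_types with type units: {combined_tu:.3f} MB")
print(f"reduction relative to plain -g: {reduction:.1%}")
```

The combined type-unit build comes to roughly 254 MB, a reduction of a bit over 70% for these sections.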

The difference is huge, and I believe it shows (though for most readers here it was probably
already obvious) that the optimization can be useful. Still, -fdebug-types-section is disabled by default.
It looks like it was initially disabled because not all DWARF consumers were aware of the .debug_types section.

Now, in 2017, the situation is different. I think most DWARF consumers know about .debug_types, but:

  1. The DWARF5 specification explicitly eliminates the .debug_types section introduced in DWARF4:
    p8, “1.4 Changes from Version 4 to Version 5” http://dwarfstd.org/doc/DWARF5.pdf
  2. Instead of emitting multiple .debug_types sections, it suggests emitting multiple .debug_info COMDAT
    sections (p375, p376).

And it seems there is currently no way to make clang emit multiple .debug_info sections with type information
as DWARF5 suggests. I tried the command line below:
-g -fdebug-types-section -gdwarf-5
It still emits .debug_types, and there does not seem to be a flag for emitting multiple .debug_info sections.
Looking at the LLVM code (lib/MC, lib/CodeGen), it seems to always assume that .debug_info is
a unique section in an object.
(I am also not sure why clang emits .debug_types when the -gdwarf-5 flag is set, as this section is incompatible with v5;
it is probably a bug.)

So my questions are the following:

  1. Do we want to try to implement the multiple .debug_info approach? It seems it could be very useful sometimes.
  2. For now, might we want to extend the LLD error message from “relocation X out of range” to something
    suggesting the use of -fdebug-types-section (only for relocations in debug sections)?
  3. Why is -fdebug-types-section disabled by default?

Funny, I just filed a bug on that last night. Your solutions look like they’ll help me extensively, as cutting the size in half will prevent my 80GB make install issues.
https://bugs.llvm.org/show_bug.cgi?id=35512

Thank you George for writing this up!

So my questions are the following:
1) Do we want to try to implement the multiple .debug_info approach? It seems it could be very useful sometimes.
2) For now, might we want to extend the LLD error message from "relocation X out of range" to something
   suggesting the use of -fdebug-types-section (only for relocations in debug sections)?

What we ideally should do is print a hint message suggesting a flag that
forces the compiler to emit DWARF64 debug info (I don't know the flag name)
and -fdebug-types-section, along with a brief message describing why we
can't satisfy the user's request (e.g. relocation X is too large for 32-bit
DWARF info). That said, it looks like LLVM's DWARF64 support is still
incomplete, so it may make sense to print a hint message as you suggested.
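
A minimal Python sketch of such a diagnostic, with the hint attached only to relocations in debug sections, might look like this. The function name and the exact wording are hypothetical; this is not the message LLD prints.

```python
def out_of_range_message(reloc: str, section: str) -> str:
    """Build an error message, adding a hint only for debug sections.

    The wording is illustrative; it is not taken from any linker.
    """
    msg = f"relocation {reloc} out of range"
    if section.startswith(".debug_"):
        msg += (
            "; the debug info may be too large for 32-bit DWARF offsets, "
            "consider recompiling with -fdebug-types-section"
        )
    return msg
```

Relocations in ordinary sections such as .text keep the short message, since the flag would not help there.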

If you’re interested in things you can do in the linker for this - you might consider something more aggressive: Fully DWARF aware deduplication.

This could be done hopefully by reusing some of the code in the dsymutil implementation in LLVM.

This would be much more effective (and without the possible context-sensitive tradeoffs) than using type units. Though it’d possibly have a big tradeoff in link time and/or linker memory usage (I’m not sure how much dsymutil needs/uses of either).

It doesn’t seem especially important to implement the DWARF5 types → debug_info thing for this situation, the type units as they are (in debug_types) offer the same size benefits here. But sure, if anyone wanted to implement it at some point, that’d be fine.

I think Paul covered some of the reasons type units might not be a reasonable default.

One additional reason is that if you use Split DWARF (another great way to massively reduce the amount of debug info going to the linker) type units are mostly /just/ overhead in the .dwo files: since the debug info is not linked, there’s no opportunity to remove the duplication anyway (unless you’re making a DWP - like a dsym file)

I think the current direction of LLD development is to avoid teaching the linker about things it naturally should not be aware of.
Ideally it should work with sections as opaque pieces and should not know about their content. That is not always possible;
for example, we have to look inside .eh_frame to deduplicate FDEs, but that is probably something we want to avoid in general.

But there is no .debug_types in DWARF5, so it is a deprecated approach as far as I understand.

Yeah. It looks like -gsplit-dwarf and -fdebug-types-section are harmful together. It is probably worth restricting their combined use or
emitting a warning (both clang and gcc silently allow the combination, and the output has the size penalty you describe).

But then does it make sense to emit multiple .debug_info sections with -gsplit-dwarf, so that objects contain a skeleton .debug_info and
.debug_info sections with type units as described in DWARF5, so that the linker is able to deduplicate
types at the section level as expected?

George.

It looks like that just would not work, as a skeleton CU has no children according to the spec…

George.

Ah, please ignore the above. It looks like the skeleton CU does not need children for that.

So the theoretical scenario I meant was:

  1. The test.o file has:
    a skeleton .debug_info section
    [0…N] .debug_info sections with types
  2. Then test.dwo would have the full .debug_info with declarations of the types
    whose definitions are still in test.o.

I am not sure whether this is possible to represent or makes sense. Assuming it is possible
and there are enough duplicate types, it would add some work for the linker to deduplicate
sections (though that should be fast) and would increase the output binary by the size of the deduplicated types.
But it would also reduce the size of the whole set (executable + *.dwo), which could probably be useful.

George.

After thinking more about all of the above, I think it contradicts the core idea of -gsplit-dwarf and is therefore not useful.
Sorry for all the noise.

George.

I think the current direction of LLD development is to avoid teaching the linker about things it naturally should not be aware of.

nod That’s been the historic ELF+DWARF approach, but both MacOS (with dsyms+DWARF) and Windows (COFF+CodeView+PDB) don’t do it that way, and instead involve the linker to a degree.

Mostly I’m wondering if it’d be reasonable to (and if anyone would be interested in doing it) do something more like the PDB support - fully debug-aware linking.

Ideally it should work with sections as opaque pieces and should not know about their content. That is not always possible;
for example, we have to look inside .eh_frame to deduplicate FDEs, but that is probably something we want to avoid in general.

Yeah, I can totally understand that & it’s historically how it’s been done, so I’m not expecting a change there, just floating the idea.

It doesn’t seem especially important to implement the DWARF5 types → debug_info thing for this situation, the type units
as they are (in debug_types) offer the same size benefits here. But sure, if anyone wanted to implement it at some point, that’d be fine.

But there is no .debug_types in DWARF5, so it is a deprecated approach as far as I understand.

Sure - but it works/is supported/is implemented. If someone wants to implement the newer thing, that’s cool, but I don’t have any personal motivation to do so for example. (& honestly we’ve been throwing around some ideas about how to further generalize the debug_info contributions to reduce some of the overhead of isolating types - so maybe if we’re lazy enough, we might leapfrog this particular state and just implement that future better thing)

Yeah. It looks like -gsplit-dwarf and -fdebug-types-section are harmful together. It is probably worth restricting their combined use or
emitting a warning (both clang and gcc silently allow the combination, and the output has the size penalty you describe).

Nah, only if you’re not producing a DWP at the end ( https://gcc.gnu.org/wiki/DebugFissionDWP ).

In short, I probably wouldn’t change any of LLVM’s defaults. But there are certainly flags people can use to reduce their debug info size.

You mentioned starting with this because LLVM’s defaults mean the DWARF is too large to link with DWARF 32 bit? How does gold cope with this? I haven’t seen failures/error messages/etc from either gold or lld related to this? (though I mostly use Split DWARF myself)

nod That’s been the historic ELF+DWARF approach, but both MacOS (with dsyms+DWARF) and Windows
(COFF+CodeView+PDB) don’t do it that way, and instead involve the linker to a degree.
Mostly I’m wondering if it’d be reasonable to (and if anyone would be interested in doing it) do
something more like the PDB support - fully debug-aware linking.

Honestly, I only know how the ELF linker works, so my thoughts below may be silly for some reason or may duplicate
some existing approach. Looking at what .dwp does, there seem to be two main things that reduce the size of debug data:

  1. “It must allow for the removal of duplicate type units”.
  2. “It must allow for the removal of duplicate strings”.

The linker already deduplicates strings by itself, though it could delegate that to some API for debug sections.
What it could probably do is call a library API: the linker would hand over some set (or all) of the
.debug_* sections, and the library would rebuild and optimize the DWARF data, eliminate duplicates, and
return the optimized debug sections to the linker. The linker would then apply relocations and emit the result to the output.

That way the library could probably also be used by a standalone post-processing tool,
and the linker would still work with data at the section level only, without being DWARF-aware.
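
As an illustration of the string part only, here is a minimal Python sketch of merging NUL-terminated string tables and recording how old offsets map to new ones. It assumes well-formed tables in which every string is terminated, ignores tail merging, and uses a hypothetical function name; it is not how LLD is implemented.

```python
def merge_string_tables(tables):
    """Merge NUL-terminated string tables.

    Returns the merged table and, for each input table, a dict mapping
    the start offset of every string to its offset in the merged table.
    """
    merged = bytearray()
    offset_of = {}  # string bytes -> offset in the merged table
    remaps = []
    for table in tables:
        remap = {}
        pos = 0
        while pos < len(table):
            end = table.index(b"\0", pos)  # assumes a terminator is present
            s = table[pos:end]
            if s not in offset_of:
                offset_of[s] = len(merged)
                merged += s + b"\0"
            remap[pos] = offset_of[s]
            pos = end + 1
        remaps.append(remap)
    return bytes(merged), remaps
```

The per-table remap is what a relocation pass would consult: any reference to an old string offset has to be rewritten to the new one, which is why deduplication and relocation processing cannot be fully separated.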

Sure - but it works/is supported/is implemented. If someone wants to implement the newer thing, that’s cool, but I don’t have any
personal motivation to do so for example. (& honestly we’ve been throwing around some ideas about how to further generalize the
debug_info contributions to reduce some of the overhead of isolating types - so maybe if we’re lazy enough, we might leapfrog
this particular state and just implement that future better thing)

I see. Based on all the comments in this thread, I am inclined to agree that implementing the newer thing does not make much sense at the moment.

For now I have prepared a patch (D40950) that makes LLD report an error when it encounters objects with multiple .debug_* sections in cases we do not support.
(LLD supports deduplicating COMDATs, so such objects are generally not a problem,
but for error reporting and for --gdb-index we assume debug sections are unique in an object,
so in those cases it looks like we want to error out.)

I have some last thoughts/questions about this though :)

Currently clang -gdwarf-5 -fdebug-types-section works, and so the linker can deduplicate types. That probably violates
the specification, which says there are no more .debug_types sections, but the behavior is convenient for users of -fdebug-types-section.
I do not know how the transition from v4 to v5 will happen (or how transitions between DWARF standards usually happen).
I suppose that one day clang will just start to produce v5 debug data by default.
At the same time, multiple .debug_info sections are mentioned in the DWARF5 spec as an optimization, so implementing them should not be
mandatory. If so, it seems that we will either need to implement this optimization before switching to v5 by default, or allow
-gdwarf-5 -fdebug-types-section to support the existing use case. And since it already works and is already allowed in releases, that probably means it is
acceptable to keep (and use) this behavior? (If so, attempting to leapfrog could be a nice strategy, IMO.)

Nah, only if you’re not producing a DWP at the end ( DebugFissionDWP - GCC Wiki ).

Sure, DWP seems to do a great job here, but even for the DWP flow it does not seem to make sense to force the compiler to do the extra work
of producing type sections, because I think DWP-producing tools should get no benefit at all from larger .dwo files with .debug_types.

I can only imagine that somebody might use -gsplit-dwarf and -fdebug-types-section together in order to parse .debug_types.dwo
instead of .debug_info.dwo, to look for types in a slightly more convenient way, but that looks like too synthetic a case.

In short, I probably wouldn’t change any of LLVM’s defaults. But there are certainly flags people can use to reduce their debug info size.

You mentioned starting with this because LLVM’s defaults mean the DWARF is too large to link with DWARF 32 bit? How does gold cope with this?
I haven’t seen failures/error messages/etc from either gold or lld related to this? (though I mostly use Split DWARF myself)

I posted some results earlier here: https://bugs.llvm.org//show_bug.cgi?id=31109#c3.
In short: gold 2.26.1 silently ignored this (and probably produced broken output), while
newer versions of gold are able to catch and report the same error.

I think it is simply still uncommon to have such large debug sections; we have had only a single bug about this so far. Hopefully
DWARF64 can be a solution, though it may just hide the issue; it would still be nice to reduce the amount of debug data we produce.
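
For context on what DWARF64 changes, here is a minimal Python sketch that distinguishes the two formats by a unit's initial length field. In the DWARF specification, a first 4-byte value of 0xffffffff marks the 64-bit format and is followed by an 8-byte length. The function name is hypothetical and little-endian data is assumed.

```python
import struct


def parse_initial_length(data: bytes, offset: int = 0):
    """Return (format, unit_length, field_size) for little-endian DWARF data."""
    (first,) = struct.unpack_from("<I", data, offset)
    if first == 0xFFFFFFFF:
        # 64-bit DWARF: escape value followed by an 8-byte length.
        (length,) = struct.unpack_from("<Q", data, offset + 4)
        return ("DWARF64", length, 12)
    if first >= 0xFFFFFFF0:
        # The remaining values in this range are reserved by the specification.
        raise ValueError(f"reserved initial length value {first:#x}")
    return ("DWARF32", first, 4)
```

In the 64-bit format, section offsets inside the unit are 8 bytes wide as well, which is what would lift the 4 GiB limit that the R_X86_64_32 relocation runs into.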

Best regards,
George | Developer | Access Softek, Inc

Though resolving relocations could probably be a problem for the library approach I described. Maybe the linker could pass already-relocated sections for
final optimization/deduplication, probably along with some additional information, but anyway I see now that it may not be that simple :)

George.

Honestly, I only know how the ELF linker works, so my thoughts below may be silly for some reason or may duplicate
some existing approach. Looking at what .dwp does, there seem to be two main things that reduce the size of debug data:

  1. “It must allow for the removal of duplicate type units”.
  2. “It must allow for the removal of duplicate strings”.

Yeah, DWPs are mostly the same as a linker linking debug info without knowing anything about it. Except instead of relocations, it uses the cu/tu_index section (& str_index section). Otherwise the DWP packaging tool doesn’t know anything about the debug info (it doesn’t need to parse many DIEs, etc).

This is still simple/coarse grained compared to Windows PDBs or MacOS dsyms.
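
A simplified Python sketch of that packaging step: contributions are concatenated, duplicate type units are dropped by signature, and an index records where each kept unit lives. The real cu/tu_index sections are hash tables with a defined binary layout; the dict and the function name here are illustrative stand-ins only.

```python
def pack_type_units(dwo_files):
    """Concatenate type units from several .dwo inputs into one section.

    dwo_files is a list of lists of (signature, data) pairs.
    Returns the packed bytes and an index: signature -> (offset, size).
    """
    section = bytearray()
    index = {}
    for units in dwo_files:
        for signature, data in units:
            if signature in index:
                continue  # duplicate type unit: keep only the first copy
            index[signature] = (len(section), len(data))
            section += data
    return bytes(section), index
```

A consumer looks a unit up through the index rather than through relocated offsets, so the packaging tool never needs to parse the DIEs inside `data`.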

The linker already deduplicates strings by itself, though it could delegate that to some API for debug sections.
What it could probably do is call a library API: the linker would hand over some set (or all) of the
.debug_* sections, and the library would rebuild and optimize the DWARF data, eliminate duplicates, and
return the optimized debug sections to the linker. The linker would then apply relocations and emit the result to the output.

That way the library could probably also be used by a standalone post-processing tool,
and the linker would still work with data at the section level only, without being DWARF-aware.

Postprocessing (ie: running a tool on the fully linked binary with the debug info we have today, and having the tool reprocess the debug info to make it more compact) is an option, but wouldn’t help address the problem you started with - that the output can’t fit the large offsets, so the output is invalid/broken. So that output would be broken before the postprocessing step could run to compact things.

Sure, DWP seems to do a great job here, but even for the DWP flow it does not seem to make sense to force the compiler to do the extra work
of producing type sections, because I think DWP-producing tools should get no benefit at all from larger .dwo files with .debug_types.

The current DWP tools (one in binutils, one in LLVM) don’t do DWARF-aware debug info compaction. They just concatenate the sections together, deduplicate strings, deduplicate type units.

So, yes, to have a smaller DWP file in the end it’s beneficial to use type units (be they in debug_types or debug_info).

But a fancier DWP tool that would process all the DWARF and compact the result wouldn’t need explicit type units & could avoid that overhead.

nod That’s been the historic ELF+DWARF approach, but both MacOS (with dsyms+DWARF) and Windows
(COFF+CodeView+PDB) don’t do it that way, and instead involve the linker to a degree.
Mostly I’m wondering if it’d be reasonable to (and if anyone would be interested in doing it) do
something more like the PDB support - fully debug-aware linking.

Honestly saying I only know how ELF linker works and may be my thoughts below are silly for some reason or duplicating
some already existent approach. Looking at what .dwp do, looks there are two main things reducing size debug data:

  1. “It must allow for the removal of duplicate type units”.
  2. “It must allow for the removal of duplicate strings”.

Yeah, DWPs are mostly the same as a linker linking debug info without knowing anything aobut it. Except instead of relocations, it uses the cu/tu_index section (& str_index section). Otherwise the DWP packaging tool doesn’t know anything about the debug info (it doesn’t need to parse many DIEs, etc).

This is still simple/coarse grained compared to Windows PDBs or MacOS dsyms.

Linker already deduplicates strings by itself, though it can delegate it to some API for debug sections.

And what it could probably do is call some library API. Linker could give it a some set (or all of)
.debug_* sections so this library would rebuild and optimize the dwarf data, eliminate duplicates, and
return optimized debug sections back to linker. Then linker would perform relocations and emit the result to output.

That way library can be used for stand alone post proccessing tool probably
and linker should be able to work with data on a sections level only and be not DWARF aware.

Postprocessing (ie: running a tool on the fully linked binary with the debug info we have today, and having the tool reprocess the debug info to make it more compact) is an option, but wouldn’t help address the problem you started with - that the output can’t fit the large offsets, so the output is invalid/broken. So that output would be broken before the postprocessing step could run to compact things.

Sure - but it works/is supported/is implemented. If someone wants to implement the newer thing, that’s cool, but I don’t have any
personal motivation to do so for example. (& honestly we’ve been throwing around some ideas about how to further generalize the
debug_info contributions to reduce some of the overhead of isolating types - so maybe if we’re lazy enough, we might leapfrog
this particular state and just implement that future better thing)

I see. Basing on all comments in this thread I am inclined to agree that implementing newer thing does not make much sence atm.

For now I have prepared a patch (D40950) to error out when LLD encounters objects with multiple .debug_* sections in cases where we do not support it.

(LLD supports deduplicating COMDATs, so in general such an object is not a problem,
but for error reporting purposes and for --gdb-index we assume debug sections are unique within an object,
so in that case it looks like we want to error out).

I have some last thoughts/questions about this, though :)

Currently clang -gdwarf-5 -fdebug-types-section works, and so the linker can deduplicate types. That probably violates
the specification, which says there are no more .debug_types sections, but the behavior is convenient for users of -fdebug-types-section.
I do not know how the transition from v4 to v5 will happen (or how transitions between DWARF standards usually happen).
I suppose one day clang will just start to produce v5 debug data by default.
At the same time, multiple .debug_info sections are mentioned in the DWARF5 spec as an optimization, so it should not be a mandatory
thing to implement. If so, it seems that we will either need to implement this optimization before switching to v5 by default, or allow
-gdwarf-5 -fdebug-types-section to support the existing use case. And since it already works and is already allowed in releases, it probably means it is
acceptable to keep (and use) this behavior? (If so, an attempt to leapfrog can be a nice strategy IMO).

I think Paul covered some of the reasons type units might not be a reasonable default.

One additional reason is that if you use Split DWARF (another great way to massively reduce the amount of debug info going to the linker)
type units are mostly /just/ overhead in the .dwo files: since the debug info is not linked, there’s no opportunity to remove the
duplication anyway (unless you’re making a DWP - like a dsym file)

Yeah. It looks like -gsplit-dwarf and -fdebug-types-section are harmful together. It is probably worth restricting their combined use or
emitting a warning (both clang and gcc silently allow the combination, and the output has the size penalty you are describing).

Nah, only if you’re not producing a DWP at the end ( https://gcc.gnu.org/wiki/DebugFissionDWP ).

Sure, DWP seems to do a great job here, but even for the DWP flow it does not seem to make sense to force the compiler to do the extra work
of producing types sections, because I think DWP-producing tools should get no benefit at all from larger .dwo files with .debug_types.

The current DWP tools (one in binutils, one in LLVM) don’t do DWARF-aware debug info compaction. They just concatenate the sections together, deduplicate strings, deduplicate type units.

So, yes, to have a smaller DWP file in the end it’s beneficial to use type units (be they in debug_types or debug_info).
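A rough sketch of the two deduplication steps described above, assuming each type unit is identified by its 64-bit type signature and modelling unit bodies and strings as plain bytes. The function names and data shapes are illustrative only and do not correspond to the binutils or LLVM dwp implementations.

```python
from typing import Dict, List, Tuple


def merge_type_units(dwo_files: List[List[Tuple[int, bytes]]]) -> List[Tuple[int, bytes]]:
    """Keep the first type unit seen for each type signature."""
    kept: Dict[int, bytes] = {}
    for units in dwo_files:
        for signature, body in units:
            kept.setdefault(signature, body)  # later duplicates are dropped
    return list(kept.items())


def merge_strings(string_sections: List[List[bytes]]) -> Tuple[bytes, Dict[bytes, int]]:
    """Build one string table in which each distinct string appears once."""
    table = bytearray()
    offsets: Dict[bytes, int] = {}
    for strings in string_sections:
        for s in strings:
            if s not in offsets:
                offsets[s] = len(table)
                table += s + b"\x00"  # NUL-terminated, as in .debug_str
    return bytes(table), offsets


# Two hypothetical .dwo inputs that both describe "struct Foo".
a_dwo = [(0x1111, b"struct Foo"), (0x2222, b"struct Bar")]
b_dwo = [(0x1111, b"struct Foo"), (0x3333, b"struct Baz")]

units = merge_type_units([a_dwo, b_dwo])
table, offsets = merge_strings([[b"Foo", b"int"], [b"int", b"Baz"]])
```

Without type units there is no signature to key on, which is why a tool that only concatenates sections cannot remove duplicated type descriptions.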

But a fancier DWP tool that would process all the DWARF and compact the result wouldn’t need explicit type units & could avoid that overhead.

Prior art is “dwz” written by Jakub Jelinek :)

-eric

Postprocessing (ie: running a tool on the fully linked binary with the debug info we have today, and having the tool reprocess the debug info to make it more compact) is an option, but wouldn’t help address the problem you started with - that the output can’t fit the large offsets, so the output is invalid/broken. So that output would be broken before the postprocessing step could run to compact things.

Right. So then it could be some API that takes .debug_* sections from the linker, takes relocations and additional info,
like info about GCed/ICFed sections. It could rebuild the debug data, rebuild the relocations and return them back to the linker,
so the linker could take the deduplicated debug info, perform the updated relocations and produce the output.

It does not feel nice, honestly. It definitely seems easier to do all of that on the linker side instead.

George.

Postprocessing (ie: running a tool on the fully linked binary with the debug info we have today, and having the tool reprocess the debug info to make it more compact) is an option, but wouldn’t help address the problem you started with - that the output can’t fit the large offsets, so the output is invalid/broken. So that output would be broken before the postprocessing step could run to compact things.

Right. So then it could be some API that takes .debug_* sections from the linker, takes relocations and additional info,
like info about GCed/ICFed sections. It could rebuild the debug data, rebuild the relocations and return them back to the linker,
so the linker could take the deduplicated debug info, perform the updated relocations and produce the output.

Right - this is what COFF does, I believe.

Does not feel nice honestly.

nod it’s certainly not the direction DWARF’s generally gone in, but I think we’re seeing some of the limitations of using a DWARF-agnostic linking strategy. All attempts I’ve seen to allow a linker to deduplicate DWARF without being DWARF aware have added a fair bit of overhead to the DWARF itself - admittedly there’s more options/improvements to be tested, but it still feels like we’d ultimately find it insufficient & want to go further than is possible while the linker remains unaware of DWARF.

It definitely seems easier to do all of that on the linker side instead.

Not quite sure what you mean by “on linker side” - but I guess you mean using linker features like comdats etc, rather than DWARF parsing/reassembly/etc.

Not quite sure what you mean by “on linker side” - but I guess you mean using linker features like comdats etc, rather than DWARF parsing/reassembly/etc.

I mean that it is probably not a good idea to use an external library. I feel it is much more convenient to do such processing in the linker.

The linker does and knows much more about things like sections that are ICFed or eliminated, about COMDATs, and about many things like string deduplication.

So it can probably rebuild relocations and perform DWARF deduplication in a faster/more convenient way than an external API could provide.

Honestly, I have not thought about it too much yet; these are just my current feelings.

George.

Not quite sure what you mean by “on linker side” - but I guess you mean using linker features like comdats etc, rather than DWARF parsing/reassembly/etc.

I mean that it is probably not a good idea to use an external library. I feel it is much more convenient to do such processing in the linker.

Ah, a matter of where the logic is implemented.

My concern is that we already have one use of this logic (dsymutil) and at least two other places that could benefit from it (lld and dwp) - it would seem unfortunate to build and maintain 3 separate versions of this.

(also, possibly one day, a 4th place: llvm IR linking - it could be useful to generate type DWARF (& CodeView) from frontends rather than backends, then do a full DWARF aware linking of it in the llvm linking step)

But all of that’s probably a bit of a long way off I imagine, in terms of getting the initial idea (should lld grow/have DWARF aware linking functionality) going.

The linker does and knows much more about things like sections that are ICFed or eliminated, about COMDATs, and about many things like string deduplication.

So it can probably rebuild relocations and perform DWARF deduplication in a faster/more convenient way than an external API could provide.

Honestly, I have not thought about it too much yet; these are just my current feelings.

Fair, fair.

  • Dave

Wasn’t our (lld/ELF’s) position on debug info size that we should focus on providing a great split-dwarf workflow and not try to go too far out of our way to deduplicate or otherwise reduce debug info size inside LLD? I recall there being some patches that made linking of large debug binaries like 1.5GB+ clang faster, but we decided to reject those changes because split-dwarf was the “right” solution.

Rafael, Rui?

(I even recall Rafael saying at one point that a great split-dwarf workflow was one of the key things he considered as necessary for him to consider LLD “done”)

– Sean Silva

Wasn’t our (lld/ELF’s) position on debug info size that we should focus on providing a great split-dwarf workflow and not try to go too far out of our way to deduplicate or otherwise reduce debug info size inside LLD? I recall there being some patches that made linking of large debug binaries like 1.5GB+ clang faster, but we decided to reject those changes because split-dwarf was the “right” solution.