[LLD] Support DWARF64, debug_info "sorting"

This year Igor Kudrin put a lot of work into enabling DWARF64 support in LLVM. At Facebook we are looking into it as one of the options for handling more than 4 GiB of debug information in a production environment. One concern is that, due to the mix of third-party libraries and LLVM-compiled code, the final library/binary will contain a mix of DWARF32 and DWARF64 CUs. The DWARF format supports this. With such a mix, however, it is possible that even with DWARF64 enabled one can still encounter relocation overflow errors in LLD if DWARF32 sections happen to be placed towards the end.

One proposal, discussed in https://reviews.llvm.org/D87011, is to modify LLD to arrange .debug_info sections so that DWARF32 sections come first and DWARF64 sections after them. This way, as long as the DWARF32 sections don’t themselves exceed 4 GiB, the final binary can contain debug information that exceeds 4 GiB. I think this will be the common case.

An alternative approach, proposed by James Henderson, is for the build system to take care of it and to use -u to enforce the order.
I would imagine that most projects of scale use a configurable build system that pulls in all the various dependencies automatically in a multi-language environment. I think the alternative approach will therefore be more fragile than modifying LLD, as it relies on a more complex system, and each customer of LLD would have to implement this “sorting” in their own build system. The use of -u also kind of abuses this flag and might have unintended consequences, as Wen Lei pointed out.
From an overhead perspective, we only need to read a few bytes of DWARF to determine whether it is 32 or 64 bits. Customers who need DWARF64 already accept the overhead that it entails.

Any thoughts?

Thank You
Alex
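
To make the ordering argument concrete, here is a minimal Python sketch (not LLD code). It assumes each input .debug_info section is described by a hypothetical (name, size, is_dwarf64) tuple, and it treats a DWARF32 section as at risk when its last byte lands above offset 0xFFFFFFFF in the output. Real relocations can point at any offset inside a section, so this is a conservative stand-in for the actual overflow check.

```python
from typing import List, Tuple

# Hypothetical description of one input .debug_info section:
# (name, size in bytes, True if it uses the DWARF64 format).
Section = Tuple[str, int, bool]

MAX_32BIT_OFFSET = 0xFFFFFFFF


def dwarf32_first(sections: List[Section]) -> List[Section]:
    """Stable partition: DWARF32 first, original order kept within each group."""
    return sorted(sections, key=lambda s: s[2])


def overflowing_dwarf32(sections: List[Section]) -> List[str]:
    """Names of DWARF32 sections whose last byte no longer fits in 32 bits."""
    overflowing = []
    offset = 0
    for name, size, is_dwarf64 in sections:
        end = offset + size
        if not is_dwarf64 and end - 1 > MAX_32BIT_OFFSET:
            overflowing.append(name)
        offset = end
    return overflowing


GIB = 1 << 30
inputs = [
    ("big64_a.o", 3 * GIB, True),
    ("third_party32.o", 1 * GIB, False),
    ("big64_b.o", 2 * GIB, True),
    ("legacy32.o", 1 * GIB, False),
]

print(overflowing_dwarf32(inputs))                 # command-line order
print(overflowing_dwarf32(dwarf32_first(inputs)))  # DWARF32-first order
```

With these made-up sizes, the command-line order leaves legacy32.o past the 4 GiB mark, while the DWARF32-first order reports nothing even though the total is 7 GiB.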

+Fangrui Song here for thread visibility

Of these two approaches I think that the linker sorting is probably the one I’d go with for the reasons you list below - I’m particularly sympathetic to not wanting the unintended consequences of using -u here :)

I do worry about slowing down general debug links so a “debug info sorting” option may make sense, or it may not be worth it after measuring the speed difference.

Thanks for bringing this up on the list! :)

-eric

+James for context too (always good to include the folks from the
original threads for continuity)

Yeah, my general attitude there was just twofold, one that the
discussion had strayed fairly far from the review (so interested
parties might not see it, both because it's a targeted review thread
on the noisy llvm-commits, and because of the title not having much
connection to the discussion) and it seemed to be somewhat
abstract/general - and there's a balance there. "We should do this
because I need it" (we shouldn't be implementing features for
especially niche use cases/if they don't generalize) isn't always a
compelling motivation but "we should do this because someone might
need it" isn't either (we shouldn't be implementing features that have
no users).

The major drawback of sorting is the need to parse DWARF, even a
little bit of it: only the first 4 bytes of a section to tell which
version it is, or the first 12 if you want to be able to jump over
contributions and check /all/ contributions coming from a given input
object file (it might contain a combination of DWARFv4 and DWARFv5).
Then there's the hairy uncertainty of which sections to check. Do you
check them all? Well, all the ones with length prefixes that
communicate DWARF32/64; some sections don't
(debug_ranges/loc/str/macro for instance, if I recall correctly).
And if something has some 4 and 5, does it get sorted to the start? I
guess so.
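
The "first 4 bytes / first 12 bytes" check can be sketched in Python as follows. This is an illustration rather than anything LLD does today; the function name is made up, and it relies only on the DWARF initial-length encoding: a 4-byte length, where the value 0xFFFFFFFF is an escape meaning that the real length follows as an 8-byte field (the DWARF64 format), and 0xFFFFFFF0 through 0xFFFFFFFE are reserved.

```python
import struct
from typing import List, Tuple

DWARF64_ESCAPE = 0xFFFFFFFF
RESERVED_MIN = 0xFFFFFFF0


def classify_contributions(data: bytes, little_endian: bool = True) -> List[Tuple[int, str]]:
    """Return (offset, "DWARF32" or "DWARF64") for each length-prefixed unit."""
    order = "<" if little_endian else ">"
    units = []
    offset = 0
    while offset < len(data):
        if offset + 4 > len(data):
            raise ValueError(f"truncated initial length at offset {offset}")
        (length32,) = struct.unpack_from(order + "I", data, offset)
        if length32 == DWARF64_ESCAPE:
            # DWARF64: the escape is followed by the real 8-byte length.
            if offset + 12 > len(data):
                raise ValueError(f"truncated 64-bit length at offset {offset}")
            (body,) = struct.unpack_from(order + "Q", data, offset + 4)
            fmt, prefix = "DWARF64", 12
        elif length32 >= RESERVED_MIN:
            raise ValueError(f"reserved initial length at offset {offset}")
        else:
            fmt, body, prefix = "DWARF32", length32, 4
        end = offset + prefix + body
        if end > len(data):
            raise ValueError(f"unit at offset {offset} overruns the section")
        units.append((offset, fmt))
        offset = end  # jump over the contribution without parsing its body
    return units


# Synthetic section: one DWARF32 unit (8-byte body), one DWARF64 unit (16-byte body).
sample = (struct.pack("<I", 8) + bytes(8)
          + struct.pack("<IQ", DWARF64_ESCAPE, 16) + bytes(16))
print(classify_contributions(sample))
```

Only the length prefixes are read, so the cost is proportional to the number of contributions rather than to the size of the debug info. Sections without such a prefix would need a different signal.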

I assume this comment is meant to say DWARF32/DWARF64, not DWARFv4 and DWARFv5, as the DWARF version (as opposed to the 32/64 bit style) is irrelevant to this, I believe, at least for the current known DWARF standards. Whilst the majority of objects will only have a single CU in them, there will be exceptions (LTO-generated objects, -r merged objects etc), so we do need to consider this approach. Mixtures would certainly be possible, and there’s no guarantee the CUs would be in a nice order with 32-bit blocks before 64-bit blocks. If I follow this to its full conclusion, you could potentially end up with a single .debug_info (.debug_line, .debug_rnglists etc) input section with a mixture of DWARF32/DWARF64 sub-sections, which, if following the reordering approach, the linker might have to split up internally in order to rearrange (aside - there’s some interesting crossover with ideas I’ve been considering regarding the Fragmented DWARF topic discussed elsewhere). Maybe the solution here would be to change producers to produce separate .debug_info sections containing DWARF32 and DWARF64. This would require other tools, like llvm-dwarfdump, to be updated too to handle multiple input .debug_info sections.

I used the -u option more as an example that it might be possible to get things to work the way we want without needing to have the linker do the work. The linker currently has a --symbol-ordering-file option which can be used to request an order for the specified list of symbols. The linker does this by rearranging the input sections to get as close as it can to the requested order. We could maybe implement the same on a file/section basis. It would avoid needing to read the sections themselves, but doesn’t solve the “what to do about mixed single input” case directly (though might allow the user to dodge the decision at least).

Other ideas I had involved changing the section header properties. Currently DWARF sections are all SHT_PROGBITS, but we could change that to e.g. SHT_DWARF_32 or similar, and/or use the sh_info field to contain a value that would indicate the 32/64 bit nature. I’m not convinced by these ideas though, as a) I don’t know if it translates well to other non-ELF formats, and b) we can’t really control the producers of DWARF at this stage to conform.

It would be nice if there was a solution that could be consistently applied across all build systems, linkers and DWARF producers. I don’t have one as yet though.
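
The "split up internally" scenario can be illustrated with a small Python sketch. It assumes the contributions of one input section have already been classified into hypothetical (offset, size, is_dwarf64) tuples that are contiguous and in file order. It deliberately ignores the hard part: relocations and cross-unit references into the moved pieces would have to be rewritten.

```python
from itertools import groupby
from typing import List, Tuple

# Hypothetical (offset, size, is_dwarf64) triple for one contribution or run.
Piece = Tuple[int, int, bool]


def split_into_runs(contributions: List[Piece]) -> List[Piece]:
    """Merge adjacent contributions of the same format into one run each."""
    runs = []
    for is_dwarf64, group in groupby(contributions, key=lambda c: c[2]):
        members = list(group)
        start = members[0][0]
        size = sum(c[1] for c in members)
        runs.append((start, size, is_dwarf64))
    return runs


def reorder_runs(runs: List[Piece]) -> List[Piece]:
    """Stable partition of runs: DWARF32 runs first, DWARF64 runs last."""
    return sorted(runs, key=lambda r: r[2])


mixed = [(0, 100, False), (100, 50, True), (150, 30, True), (180, 70, False)]
runs = split_into_runs(mixed)
print(runs)
print(reorder_runs(runs))
```

Working on runs rather than individual units keeps the number of pieces small when an input section is mostly one format.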

> I assume this comment is meant to say DWARF32/DWARF64, not DWARFv4 and DWARFv5, as the DWARF version (as opposed to the 32/64 bit style) is irrelevant to this, I believe, at least for the current known DWARF standards.

Yep! thanks for the correction - had a lot of DWARFv4/v5 on my mind
due to other work, so got the terms jumbled up.

> Whilst the majority of objects will only have a single CU in them, there will be exceptions (LTO-generated objects, -r merged objects etc), so we do need to consider this approach. Mixtures would certainly be possible, and there's no guarantee the CUs would be in a nice order with 32-bit blocks before 64-bit blocks. If I follow this to its full conclusion, you could potentially end up with a single .debug_info (.debug_line, .debug_rnglists etc) input section with a mixture of DWARF32/DWARF64 sub-sections, which, if following the reordering approach, the linker might have to split up internally in order to rearrange (aside - there's some interesting crossover with ideas I've been considering regarding the Fragmented DWARF topic discussed elsewhere).

I think given this is a pragmatic feature I'd be inclined to say "eh,
sort any input object containing at least one DWARF32 contribution
before input objects not containing any DWARF32 contribution" - if that
doesn't solve some real world issues/situations, I'd be willing to
revisit this direction/consider more invasive/expensive solutions.

Though, as Eric said - some of this conversation might be better had
in terms of concrete patches with concrete performance measurements.

> Maybe the solution here would be to change producers to produce separate .debug_info sections containing DWARF32 and DWARF64.

That'd involve changing how certain objects were generated - if that's
possible, then I assume it'd be possible to change that generation to
use DWARF64 anyway - in the limit: one might have precompiled binaries
with debug info that one cannot recompile, so any new format options I
doubt are able to address the original/likely use case for this
functionality.

From: llvm-dev <llvm-dev-bounces@lists.llvm.org> On Behalf Of David
Blaikie via llvm-dev
Sent: Wednesday, November 11, 2020 12:46 PM
To: James Henderson <jh7370.2008@my.bristol.ac.uk>
Cc: llvm-dev@lists.llvm.org
Subject: Re: [llvm-dev] [LLD] Support DWARF64, debug_info "sorting"

> I think given this is a pragmatic feature I'd be inclined to say "eh,
> sort any input object containing at least one DWARF32 contribution
> before input objects not containing any DWARF32 contribution" - if that
> doesn't solve some real world issues/situations, I'd be willing to
> revisit this direction/consider more invasive/expensive solutions.

I was under the impression that *object* order meant a lot to people,
and changing that would have all sorts of unpleasant fallout. If I'm
remembering that correctly, sorting DWARF sections really should be its
own thing, separate from object order. Shoving DWARF-64 sections to
the end of the line seems like it would be less problematic than
reordering entire objects, if the linker can handle that in some
reasonably efficient way.
--paulr

(Adding back Cc: which got dropped)

> (Igor - I don't know what happened, but your email split the mail thread in gmail for me.)

The problem is that [llvm-dev] [LLD] Support DWARF64, debug_info "sorting" does not have an In-Reply-To: header.
Added Igor to the Cc: list.

If we go down this route (sorting DWARF64 after DWARF32), then compared with a
lightweight parse I'd prefer a relocation-based approach: check whether a
.debug_* section has a 64-bit absolute relocation type (e.g. R_X86_64_64).

In LLD, for an input section, we don't know its associated SHT_REL[A] section.
So when adding an orphan section we would have another loop iterating
over inputSections. We can reuse the dependentSections to have this
piece of information (generalizing the existing special case for -r/--emit-relocs)

> This way as long as DWARF32 sections don't themselves go over 4gigs, the final binary can contain debug information that exceeds 4gig.
> Which I think will be the common case.

I would not expect linking a few additional sections to change the linker's behavior so drastically,
in a way that is not easily explainable. This deserves a dedicated linker option (see below; I have a concern
about the inconsistency with an input section description).

I'm still learning the internals but would expect that mixed DWARF32/DWARF64 is
a problem for LTO. A relocatable link (-r) can combine DWARF32 and DWARF64 object
files and potentially nullify the aforementioned relocation-based approach.
(We probably just want to check the first relocation to save time;
if we link DWARF64 before DWARF32 we may create a .debug_info
which looks like DWARF64 but is actually restricted by DWARF32 relocations.)

> I was under the impression that *object* order meant a lot to people,
> and changing that would have all sorts of unpleasant fallout. If I'm
> remembering that correctly, sorting DWARF sections really should be its
> own thing, separate from object order. Shoving DWARF-64 sections to
> the end of the line seems like it would be less problematic than
> reordering entire objects, if the linker can handle that in some
> reasonably efficient way.
> --paulr

This behavior does add some inconsistency to the system:

For an output section description .debug_info 0 : { *(.debug_info) } ,
should the linker sort DWARF32 and DWARF64 components? If it does, the behavior
will be inconsistent with other input section descriptions *(foo).

If there is a magic keyword, say, SORT_BY_MAGIC_DEBUG, and the internal
linker script does something similar to

   *(SORT_BY_MAGIC_DEBUG(.debug_info))

then the system is still consistent.

> I used the -u option more as an example that it might be possible to get
things to work the way we want without needing to have the linker do the
work. The linker currently has a --symbol-ordering-file option which can
be used to request an order for the specified list of symbols. The linker
does this by rearranging the input sections to get as close as it can to
the requested order. We could maybe implement the same on a file/section
basis. It would avoid needing to read the sections themselves, but doesn't
solve the "what to do about mixed single input" case directly (though
might allow the user to dodge the decision at least).

Yeah, --symbol-ordering-file applies to both global and local symbols.
Unfortunately no symbols are defined relative to .debug_* sections
(if we don't consider the STT_SECTION symbols, which cannot be used
anyway because .debug_* do not have unique names).

(The usage of -u still requires the user to add archives (whose order they want to
change) before other object files. In LLD this requires D81052, "[ELF] Handle -u before input files".)

> Other ideas I had involved changing the section header properties.
Currently DWARF sections are all SHT_PROGBITS, but we could change that to
e.g. SHT_DWARF_32 or similar, and/or use the sh_info field to contain a
value that would indicate the 32/64 bit nature. I'm not convinced by these
ideas though, as a) I don't know if it translates well to other non-ELF
formats, and b) we can't really control the producers of DWARF at this
stage to conform.

Inventing a new section type is not bad at a first glance. Leveraging it
can remove the inconsistency in the system as well.
Unfortunately linker scripts (as implemented by GNU ld and emulated by LLD)
don't provide a way to match input sections by section type.

If we are going to have many thoughts on the linker side design, might
be worth asking on https://groups.google.com/g/generic-abi as well.
That would have to be a separate discussion because the list is moderated
and users who haven't joined the group cannot reply there. If there are
opinions, we can share them with llvm-dev.

Thanks for feedback.

I agree that with a patch and numbers this will be a more concrete discussion, but I wanted to gauge the overall receptiveness to this approach and see whether there is a better way.

"Whilst the majority of objects will only have a single CU in them, there will be exceptions (LTO-generated objects, -r merged objects etc), so we do need to consider this approach."
David, can you elaborate on the conditions under which LTO-generated objects will have a mix of DWARF32/64 in the same .debug_info? Looking at how DWARF64 was implemented, the same flag is used for the entirety of the DWARF output, even if multiple CUs are included.

I think if an object does have a mix of 32-bit and 64-bit CUs, the linker can do a best-effort ordering and output a warning. My approach is to cover the common cases while solving the problem of relocation overflows in large libraries/binaries.

@Fangrui Song
That’s a good point about relocations. Is it always guaranteed, though, that the first one will be representative of the entire relocation record?
For debug_info, even with DWARF32 there can be 64-bit relocations:
0000000000000c57 0000001800000001 R_X86_64_64 0000000000000000 .text._“some_mangled_name” + 0

On the one hand, since this only applies when DWARF64 is used, a special option would be the way to go. On the other hand, the user will need to be aware of yet another LLD option. Maybe the error emitted when relocation overflows occur can be modified to mention this option along with -fdebug-types-section.

Thank You
Alex
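
The concern about 64-bit relocations in DWARF32 can be shown with a short Python sketch. The Reloc record and both function names are hypothetical, and the refinement is an assumption rather than anything proposed in the thread: only count a 64-bit absolute relocation as evidence of DWARF64 when its target lies in another .debug_* section, because address relocations against .text are 64 bits wide in both formats on x86-64. The numeric relocation type values are the ones defined by the x86-64 psABI.

```python
from typing import List, NamedTuple

R_X86_64_64 = 1   # 64-bit absolute relocation
R_X86_64_32 = 10  # 32-bit absolute relocation, zero-extended


class Reloc(NamedTuple):
    offset: int          # offset within .debug_info being patched
    rtype: int           # relocation type
    target_section: str  # section in which the referenced symbol is defined


def naive_is_dwarf64(relocs: List[Reloc]) -> bool:
    """Any 64-bit absolute relocation counts (misfires on code addresses)."""
    return any(r.rtype == R_X86_64_64 for r in relocs)


def refined_is_dwarf64(relocs: List[Reloc]) -> bool:
    """Only 64-bit references into other debug sections count."""
    return any(
        r.rtype == R_X86_64_64 and r.target_section.startswith(".debug_")
        for r in relocs
    )


# DWARF32 unit: 32-bit abbrev offset, 64-bit code address (as in the dump above).
dwarf32_relocs = [
    Reloc(0x6, R_X86_64_32, ".debug_abbrev"),
    Reloc(0xC57, R_X86_64_64, ".text"),
]
# DWARF64 unit: section offsets are 64 bits wide as well.
dwarf64_relocs = [
    Reloc(0xE, R_X86_64_64, ".debug_abbrev"),
    Reloc(0x2B, R_X86_64_64, ".text"),
]

print(naive_is_dwarf64(dwarf32_relocs), refined_is_dwarf64(dwarf32_relocs))
print(naive_is_dwarf64(dwarf64_relocs), refined_is_dwarf64(dwarf64_relocs))
```

In this sketch the naive check misclassifies the DWARF32 unit as DWARF64, while the refined one separates the two. Whether the refined rule holds for every producer and target would need to be verified before relying on it.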

(Adding back Cc: which got dropped)

> (Igor - I don't know what happened, but your email split the mail thread in gmail for me.)

The problem is that [llvm-dev] [LLD] Support DWARF64, debug_info "sorting" does not have an In-Reply-To: header.
Added Igor to the Cc: list.

If we go down the route (sorting DWARF64 after DWARF32), compared with a
lightweight parse, I'd prefer the relocation based approach: if a .debug_* has
an 64-bit absolute relocation type (e.g. R_X86_64_64).

In LLD, for an input section, we don't know its associated SHT_REL[A] section.
So when adding an orphan section we would have another loop iterating
over inputSections. We can reuse the dependentSections to have this
piece of information (generalizing the existing special case for -r/--emit-relocs)

> This way as long as DWARF32 sections don't themselves go over 4gigs, the final binary can contain debug information that exceeds 4gig.
> Which I think will be the common case.

I would not expect the linker behaves differently when linking a few additional sections change the behavior so drastically
in a not-easily-explainable way. This deserves a dedicated linker option (see below, I have a concern about the inconsistency
with an input section description)

>
>
>> From: llvm-dev <llvm-dev-bounces@lists.llvm.org> On Behalf Of David
>> Blaikie via llvm-dev
>> Sent: Wednesday, November 11, 2020 12:46 PM
>> To: James Henderson <jh7370.2008@my.bristol.ac.uk>
>> Cc: llvm-dev@lists.llvm.org
>> Subject: Re: [llvm-dev] [LLD] Support DWARF64, debug_info "sorting"
>>
>> >
>> >
>> >
>> >>
>> >> +James for context too (always good to include the folks from the
>> >> original threads for continuity)
>> >>
>> >> Yeah, my general attitude there was just twofold, one that the
>> >> discussion had strayed fairly far from the review (so interested
>> >> parties might not see it, both because it's a targeted review thread
>> >> on the noisy llvm-commits, and because fo the title not having much
>> >> connection to the discussion) and it seemed to be somewhat
>> >> abstract/general - and there's a balance there. "We should do this
>> >> because I need it" (we shouldn't be implementing features for
>> >> especially niche use cases/if they don't generalize) isn't always a
>> >> compelling motivation but "we should do this because someone might
>> >> need it" isn't either (we shouldn't be implementing features that have
>> >> no users).
>> >>
>> >> The major drawback in sorting, is the need to parse DWARF, even a
>> >> little bit of it (only the first 4 bytes of a section to tell which
>> >> version it is - first 12 if you want to be able to jump over
>> >> contributions and check /all/ contributions coming from a given input
>> >> object file (it might contain a combination of DWARFv4 and DWARFv5)
>> >> and then the hairy uncertainty of which sections to check (do you
>> >> check them all? well, all the ones with length prefixes that
>> >> communicate DWARF32/64 - some sections don't
>> >> (debug_ranges/loc/str/macro for instance, if I recall correctly)...
>> >> and if something has some 4 and 5, does it get sorted to the start? I
>> >> guess so.
>> >>
>> > I assume this comment is meant to say DWARF32/DWARF64, not DWARFv4 and
>> DWARFv5, as the DWARF version (as opposed to the 32/64 bit style) is
>> irrelevant to this, I believe, at least for the current known DWARF
>> standards.
>>
>> Yep! thanks for the correction - had a lot of DWARFv4/v5 on my mind
>> due to other work, so got the terms jumbled up.
>>
>> > Whilst the majority of objects will only have a single CU in them,
>> there will be exceptions (LTO-generated objects, -r merged objects etc),
>> so we do need to consider this approach. Mixtures would certainly be
>> possible, and there's no guarantee the CUs would be in a nice order with
>> 32-bit blocks before 64-bit blocks. If I follow this to its full
>> conclusion, you could potentially end up with a single .debug_info
>> (.debug_line, .debug_rnglists etc) input section with a mixture of
>> DWARF32/DWARF64 sub-sections, which, if following the reordering approach,
>> the linker might have to split up internally in order to rearrange (aside
>> - there's some interesting crossover with ideas I've been considering
>> regarding the Fragmented DWARF topic discussed elsewhere).

I'm still learning the internals but would expect that mixed DWARF32/DWARF64 is
a problem for LTO. A reloctable link (-r) can combine DWARF32/DWARF64 object
files and potentially nullify the aforementioned relocation based approach
(we probably just want to check the first relocation to save time;
if we link DWARF64 before DWARF32 we may create a .debug_info
which looks like DWARF64 but is actually restricted by DWARF32 relocations)

>> I think given this is a pragmatic feature I'd be inclined to say "eh,
>> sort any input object containing at least one DWARFv4 contribution
>> before input objects not containing any v4 contribution" - if that
>> doesn't solve some real world issues/situations, I'd be willing to
>> revisit this direction/consider more invasive/expensive solutions.
>
>I was under the impression that *object* order meant a lot to people,
>and changing that would have all sorts of unpleasant fallout. If I'm
>remember that correctly, sorting DWARF sections really should be its
>own thing, separate from object order. Shoving DWARF-64 sections to
>the end of the line seems like it would be less problematic than
>reordering entire objects, if the linker can handle that in some
>reasonably efficient way.
>--paulr

This behavior does add some inconsistency to the system:

For an output section description .debug_info 0 : { *(.debug_info) } ,
should the linker sort DWARF32 and DWARF64 components? It it does, the behavior
will be inconsistent with other input section descriptions *(foo)

If there is a magic keyword, say, SORT_BY_MAGIC_DEBUG, and the internal
linker script does something similar to

   *(SORT_BY_MAGIC_DEBUG(.debug_info))

then the system is still consistent.

>>
>> Though, as Eric said - some of this conversation might be better had
>> in terms of concrete patches with concrete performance measurements.
>>
>> > Maybe the solution here would be to change producers to produce separate
>> .debug_info sections containing DWARF32 and DWARF64.
>>
>> That'd involve changing how certain objects were generated - if that's
>> possible, then I assume it'd be possible to change that generation to
>> use DWARF64 anyway - in the limit: one might have precompiled binaries
>> with debug info that one cannot recompile, so any new format options I
>> doubt are able to address the original/likely use case for this
>> functionality.
>>
>> > I used the -u option more as an example that it might be possible to get
>> things to work the way we want without needing to have the linker do the
>> work. The linker currently has a --symbol-ordering-file option which can
>> be used to request an order for the specified list of symbols. The linker
>> does this by rearranging the input sections to get as close as it can to
>> the requested order. We could maybe implement the same on a file/section
>> basis. It would avoid needing to read the sections themselves, but doesn't
>> solve the "what to do about mixed single input" case directly (though
>> might allow the user to dodge the decision at least).

Yeah, --symbol-ordering-file applies to both global and local symbols.
Unfortunately no symbols are defined relative to .debug_* sections
(if we don't consider the STT_SECTION symbols, which cannot be used
anyway because .debug_* do not have unique names).

(The usage of -u still requires the user to add the archives whose order they
want to change before other object files. In LLD this requires D81052, "[ELF] Handle -u before input files".)

>> > Other ideas I had involved changing the section header properties.
>> Currently DWARF sections are all SHT_PROGBITS, but we could change that to
>> e.g. SHT_DWARF_32 or similar, and/or use the sh_info field to contain a
>> value that would indicate the 32/64 bit nature. I'm not convinced by these
>> ideas though, as a) I don't know if it translates well to other non-ELF
>> formats, and b) we can't really control the producers of DWARF at this
>> stage to conform.

Inventing a new section type is not bad at first glance. Leveraging it
can remove the inconsistency in the system as well.
Unfortunately linker scripts (as implemented by GNU ld and emulated by LLD)
don't provide a way to match input sections by section type.

If we are going to have many thoughts on the linker side design, might
be worth asking on https://groups.google.com/g/generic-abi as well.
That would have to be a separate discussion because the list is moderated
and users who haven't joined the group cannot reply there. If there are
opinions, we can share them with llvm-dev.

I'm not sure/don't think this rises to that level - if a user is able
to regenerate their object files with some new object
feature/flag/attribute/etc, then they are probably able to generate
them with DWARF64. So this seems more about a linker doing something
that might help users who have DWARF32 baked into some precompiled
objects/libraries/things they otherwise can't change the way it's
built. So it seems to me it's more a linker-doing-something-nice than
linker/object files defining a new mode of interaction.

- Dave

Thanks for feedback.

I agree that with a patch and numbers this will be a more concrete discussion, but I wanted to gauge overall receptiveness to this approach and see whether there is a better way.

"Whilst the majority of objects will only have a single CU in them, there will be exceptions (LTO-generated objects, -r merged objects etc), so we do need to consider this approach."
David, can you elaborate under which conditions LTO-generated objects will have a mix of DWARF32/64 in the same .debug_info? Looking at how DWARF64 was implemented, the same flag will be used for the entirety of the DWARF output, even if multiple CUs are included.

I think if an object does have a mix of CUs that are 32/64, the linker can do a best-effort ordering and output a warning. My approach is to cover the common cases while solving the problem of relocation overflows in large libraries/binaries.

@Fangrui Song
That's a good point about relocations. Is it always guaranteed, though, that the first one will be representative of the entire relocation record?
For .debug_info, even with DWARF32 there can be 64-bit relocations:
0000000000000c57 0000001800000001 R_X86_64_64 0000000000000000 .text._"some_mangled_name" + 0

It may be weaker than "guaranteed", but it works in practice.

Let's look at sections that reference these large .debug_* sections (.debug_info, .debug_str, .debug_loclists, .debug_rnglists, ...):

* .debug_info: the first relocation references .debug_abbrev, good indicator
* .debug_names references .debug_info: the first relocation (CU offset) is a good indicator
* .debug_aranges references .debug_info: the first relocation (debug_info_offset) is a good indicator
* .debug_str_offsets references .debug_str: the first relocation (.debug_str offset) is a good indicator
* ...

So checking the first relocation is probably sufficient. Even if we miss
something, we can adjust the heuristic, or rather let the compiler generate an
artificial relocation (R_*_NONE), which will always work.
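
A rough sketch of the first-relocation check, assuming x86-64 and a simplified view of a .rela.debug_* section as a list of (r_offset, r_info) pairs in the order the producer emitted them. guess_is_dwarf64 is a hypothetical helper, not the prototype's code; the relocation type numbers are the ones from the x86-64 psABI.

```python
# x86-64 relocation type numbers from the psABI.
R_X86_64_64 = 1
R_X86_64_32 = 10


def reloc_type(r_info):
    # ELF64_R_TYPE: the low 32 bits of r_info hold the relocation type.
    return r_info & 0xFFFFFFFF


def guess_is_dwarf64(relocs):
    """relocs: list of (r_offset, r_info) pairs for one debug section.

    Only the first entry is inspected. Returns True or False, or None
    when the section has no relocations and nothing can be decided.
    """
    if not relocs:
        return None
    _, r_info = relocs[0]
    return reloc_type(r_info) == R_X86_64_64
```

Only the first entry is trusted because later ones can be address relocations: the r_info value 0x0000001800000001 from the record quoted earlier decodes to R_X86_64_64 even though it came from DWARF32 input.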

On one hand, since this is only applicable when DWARF64 is used, a special option would be the way to go. On the other hand, the user will need to be aware of yet another LLD option. Maybe the error emitted when relocation overflows occur can be modified to mention this option along with -fdebug-types-section.

I forgot to mention another drawback with .debug_* parsing. In the
presence of compressed debugging information, currently we uncompress
.debug_* on demand. We usually do it when writing the content of the
output section, which means we can potentially discard the uncompressed
buffers after we have done processing with one output section and move
to the next. This trick can potentially save peak memory usage.

However, if we do .debug_* parsing (to decide ordering among DWARF32/DWARF64),
we either cache the result (lose the trick) or end up uncompressing twice.
Neither is good.
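
For comparison, this is roughly what the .debug_* parsing alternative has to read, sketched under the assumption that the section bytes are already uncompressed (which is exactly the cost described above). In DWARF, an initial length of 0xffffffff introduces the 64-bit format and is followed by an 8-byte length. unit_formats is a hypothetical helper; it ignores the other reserved initial-length values.

```python
import struct

DWARF64_ESCAPE = 0xFFFFFFFF  # initial-length value that introduces DWARF64


def unit_formats(data, little_endian=True):
    """Walk the units of an uncompressed .debug_info contribution and
    return 32 or 64 for each unit, reading only the initial length fields."""
    end = "<" if little_endian else ">"
    formats = []
    offset = 0
    while offset + 4 <= len(data):
        (length,) = struct.unpack_from(end + "I", data, offset)
        if length == DWARF64_ESCAPE:
            if offset + 12 > len(data):
                raise ValueError("truncated DWARF64 initial length")
            # The real 64-bit length follows the 4-byte escape value.
            (length,) = struct.unpack_from(end + "Q", data, offset + 4)
            formats.append(64)
            offset += 12 + length
        else:
            formats.append(32)
            offset += 4 + length
    return formats


# One DWARF32 unit with 7 payload bytes, then one DWARF64 unit with 3.
sample = (struct.pack("<I", 7) + bytes(7)
          + struct.pack("<IQ", DWARF64_ESCAPE, 3) + bytes(3))
```

Walking every unit also detects the "mixed single input" case, at the price of touching the section contents before the output section is written.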

I am quite happy with the relocation approach under a linker option. I'd still
want to know the generic-abi folks' thoughts, though. James may have prepared something
he wants to share with generic-abi:) Let's wait...

Thinking about it, I wouldn’t expect an LTO generated object itself to have a mixture of DWARF32/64, although I guess the 32/64 bit state could be encoded in the IR (I am not familiar enough with it to know if it actually is or not). It might be necessary to find ways to configure LTO to generate DWARF64, possibly via a link-time option.

I hadn’t prepared anything if I’m honest (though if there’s widespread agreement that this would be useful, I certainly can - it would have other positive improvements too, reducing the need for tools to rely on section names to identify debug data for example). It was more a case of bouncing ideas off of people to see what they thought. Any discussion we have will probably also need circulating on the DWARF mailing list too, since it is more a DWARF issue than a gABI issue (unless the solution is a new section type). Further refinements to this idea that might make it more appealing to the generic group: SHT_DEBUG for the section type name, with the first N bytes of the sh_info used to specify the variant of debug data it represents (e.g. 0x1 for DWARF, 0x2 for SOME_OTHER_STANDARD etc), and the remainder for use as flags as defined by the standard (I’m thinking for DWARF you could encode the 64-bit/32-bit state in there, possibly the section variant (info/rnglists/line etc) and the DWARF version too), on the understanding that consumers like the linker wouldn’t combine sections in a potentially broken way. This has the advantage that it could be retrofitted to the existing standard versions, but as has been pointed out, this won’t help those with linker scripts - that could only be solved with a new DWARF standard and separate names for 64/32 bit sections, at least if we wanted to avoid the linker needing to do anything beyond reading the section header.
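
To make the sh_info idea concrete, here is one possible bit layout, purely as an illustration. Only the value 0x1 for DWARF comes from the message above; the one-byte variant field, the position of the 64-bit flag and the version field are assumptions made up for this sketch, and no such section type exists today.

```python
DEBUG_VARIANT_DWARF = 0x1  # value suggested in the message above
VARIANT_MASK = 0xFF        # assumption: N = 1 byte for the variant
FLAG_DWARF64 = 1 << 8      # assumption: first flag bit marks 64-bit DWARF
VERSION_SHIFT = 16         # assumption: DWARF version stored in bits 16-23


def encode_sh_info(is_dwarf64, version):
    # Pack the variant, the 32/64-bit flag and the DWARF version into
    # a single 32-bit sh_info value.
    info = DEBUG_VARIANT_DWARF | (version << VERSION_SHIFT)
    if is_dwarf64:
        info |= FLAG_DWARF64
    return info


def decode_sh_info(info):
    # Reverse of encode_sh_info: this is all a linker would need to read
    # from the section header to decide on ordering.
    return {
        "variant": info & VARIANT_MASK,
        "is_dwarf64": bool(info & FLAG_DWARF64),
        "version": (info >> VERSION_SHIFT) & 0xFF,
    }
```

With such a field the linker could classify a section from its header alone, without looking at relocations or section contents.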

The relocation approach sounds like a reasonable solution for the current situation - even if we do decide to go the route of changing producers to start emitting a new section type/update the standard etc, it doesn’t resolve the problem people may currently face.

Object order means quite a lot, but it usually is only important for the loadable data, as it has cache implications. This isn’t an issue for debug data, as far as I understand it. Object order also has a number of other effects like what to do with COMDATs, weak symbol resolution, library inputs etc, but these are all link-time behaviour things, and once the right decisions (e.g. which input contributions to use) have been made, the linker could reorder the debug data as it wishes.

It looks like there is agreement that this path, modifying LLD to order sections using relocations, should be explored.
If Igor doesn’t object, since he has been the primary one driving DWARF64 so far, I would like to take a shot at implementing it and collecting some performance numbers. :)

Alex

No objections, definitely. Please, go ahead.

From: James Henderson <jh7370.2008@my.bristol.ac.uk>
Sent: Thursday, November 12, 2020 2:20 AM
To: Fangrui Song <maskray@google.com>
Cc: Alexander Yermolovich <ayermolo@fb.com>; Robinson, Paul <paul.robinson@sony.com>; David Blaikie <dblaikie@gmail.com>; Eric Christopher <echristo@gmail.com>; Igor Kudrin <ikudrin@accesssoftek.com>; llvm-dev@lists.llvm.org <llvm-dev@lists.llvm.org>
Subject: Re: [llvm-dev] [LLD] Support DWARF64, debug_info “sorting”

I probably should have mentioned that I had started a prototype:) (And I realized that I could use firstRelocation instead of dependentSections)
What I haven’t felt comfortable with is the input section description inconsistency

https://sourceware.org/pipermail/binutils/2020-November/114099.html

This behavior does add some inconsistency to the system:

For an output section description .debug_info 0 : { *(.debug_info) } , should the linker sort DWARF32 and DWARF64 components? If it does, the behavior will be inconsistent with other input section descriptions *(foo).

If there is a magic keyword, say, SORT_BY_MAGIC_DEBUG, and the internal
linker script does something similar to

*(SORT_BY_MAGIC_DEBUG(.debug_info))

then the system is still consistent.

I also started a thread on binutils side yesterday (sent in haste) https://sourceware.org/pipermail/binutils/2020-November/114099.html
(We should give them a chance to weigh in on the design and hope for a common option, at least getting a consensus even if the implementation on their side is of low priority.)

Sent https://reviews.llvm.org/D91404 for the idea

> Thinking about it, I wouldn't expect an LTO generated object itself to have a mixture of DWARF32/64, although I guess the 32/64 bit state could be encoded in the IR (I am not familiar enough with it to know if it actually is or not). It might be necessary to find ways to configure LTO to generate DWARF64, possibly via a link-time option.

I don’t think we need to encode dwarf32/64 in IR as attribute for each module. We’re not going to emit mixed dwarf32/64 for merged LTO module anyways, so allowing each module to express its dwarf setting would only introduce burden for LTO to deal with inconsistency (warning?) among input modules. Having a linker switch to pass the setting from driver to LTO sounds better to me.

Usually the issue there is that existing build systems may be set up
only to pass such flags to the compilations, and not to the link
invocations - like DWARF version, we pass that down through IR, emit
warnings/errors when two IR modules with different DWARF versions are
linked together, and then emit only one (the higher, I believe) DWARF
version out the other end.

We aren't 100% consistent on this "anything you could do without LTO,
you should be able to do with LTO/passing the same flags to the same
actions" kind of strategy (eg: type units and DWARF compression aren't
passed down through IR - if you want those you have to pass them to
the link invocation (via the clang driver) yourself). So it's more "is
there a systemic use of these flags already for the compilation and
would not supporting it there be a pain"? It's probably not for
DWARF64, since we haven't had any flag to support it at the moment
anyway.

- Dave

I got replies from Nick Clifton and Michael Matz:
https://sourceware.org/pipermail/binutils/2020-November/114116.html
(and its reply).
I have mentioned (a) the difficulty of the
detecting-DWARF64-by-first-relocation approach and (b) the section
type approach in my reply there
https://sourceware.org/pipermail/binutils/2020-November/114125.html

(a) My prototype has made me feel uneasy with this approach.

<quote>
In DWARF v4 or if .debug_str_offsets is not used, it is a problem. A
heuristic is: if an input section in a file is marked DWARF64, we mark
all other .debug_* DWARF64. This makes me feel a bit uneasy because
for an output section description

   .debug_str 0 : { *(.debug_str) }

Now the behavior of `*` (or, if we invent a `SORT_*` keyword) is also
dependent on other output sections.
</quote>

(b)
* It needs a section type (either a gABI one or a SHT_GNU_* in GNU
ABI). I am seeking a gABI one not because I think this is particularly
related to gABI, but because I don't want Solaris (which LLVM also
supports) to use a different section type and unnecessarily cause
friction for our implementation.
* It needs a clarification on multiple output section descriptions
with the same name.
* It needs a linker script feature to match input sections by type.
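
The file-level heuristic from (a) can be sketched as follows, assuming a simplified model where each file maps section names to the result of the first-relocation check (True, False, or None when undecided). propagate_dwarf64 is a hypothetical name, not code from the prototype.

```python
def propagate_dwarf64(files):
    """files: {file name: {section name: True / False / None}}.

    If any section of a file was detected as DWARF64, every .debug_*
    section of that file is marked DWARF64. Otherwise undecided
    sections fall back to DWARF32 (False).
    """
    result = {}
    for name, sections in files.items():
        any64 = any(v is True for v in sections.values())
        result[name] = {
            sec: (True if any64 and sec.startswith(".debug_") else bool(v))
            for sec, v in sections.items()
        }
    return result


detected = {
    "a.o": {".debug_info": True, ".debug_str": None},
    "b.o": {".debug_info": False, ".debug_str": None},
}
marked = propagate_dwarf64(detected)
```

Here the .debug_str of a.o ends up marked DWARF64 only because of what was found in its .debug_info, which is the cross-section dependency that makes the output section description for .debug_str depend on other output sections.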

If I'm understanding you correctly you're suggesting the sorting
behavior would only be implemented if the input object file had some
new attributes in it designating which sections are debug info
sections?

I don't think that's a viable solution to the problem at hand, then -
if someone is able to update their toolchain and rebuild objects with
new attributes, they can probably update the build configuration of
those objects to build them with DWARF64 instead, avoiding the mixed
32/64 problem. I think the solution we're looking for would have to
work with existing precompiled object files using DWARF32 that are in
the wild today, without modification.