DebugInfo proposal: Emit an explicit empty address range on CUs with no code addresses

So I’ve been looking at a particular performance problem with LLVM’s symbolizer due to the use of ThinLTO, split DWARF, and split DWARF inlining info.

This combination has a couple of problems:

  1. it means multiple CUs in a single DWO, which isn’t well defined/specified, and best avoided - so I’m working on fixing that here (this won’t fix split DWARF + Full LTO). We already don’t use cross-CU references in the split units (because there’s no supported way to express that in DWARF), so we clone/move any DIEs (like subprograms) referenced cross-CU into the CU that references them (eg: cross-CU inlining places the abstract subprogram definition for the inlined subroutine into the CU that has the inlining, rather than cross-CU referencing into the other CU). In ThinLTO the only reason other units exist is to cross-CU optimize/inline: no code for imported CUs is ever emitted (except where it’s been inlined). So a ThinLTO compile has one primary unit, and some other units it inlines from. Those other units never emit anything in the split unit, just a few DIEs in the skeleton unit if you’re using split DWARF inlining (or no unit at all if you aren’t using that feature). So I’m working on making those units non-split (rather than having a degenerate/empty split unit).

  2. symbolizer performance is hurt because whenever it sees a unit without ranges at the unit DIE, it assumes the producer just skipped those - and goes searching through the implementation DIEs (which may mean going over to the .dwo, or loading a whole .dwp) to see where their addresses are.

It’s this second step that’s a bit painfully unnecessary, especially for a large DWP on a remote filesystem, etc.
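To make the cost in problem 2 concrete, here is a minimal sketch of the fallback as described above. It is a toy model, not llvm-symbolizer's actual code: `Unit`, `load_child_ranges`, `find_unit` and the `dwo_loads` counter are all hypothetical names, and `cu_ranges = None` stands in for "no ranges on the unit DIE".

```python
class Unit:
    """Toy model of a CU; all names here are hypothetical."""

    def __init__(self, name, cu_ranges, child_ranges):
        self.name = name
        self.cu_ranges = cu_ranges          # None models "no ranges on the unit DIE"
        self._child_ranges = child_ranges   # ranges found only by walking child DIEs
        self.dwo_loads = 0                  # counts the expensive .dwo/.dwp visits

    def load_child_ranges(self):
        # Stand-in for opening the .dwo (or a whole .dwp) and scanning DIEs.
        self.dwo_loads += 1
        return list(self._child_ranges)


def contains(ranges, addr):
    # Ranges are half-open: [low_pc, high_pc).
    return any(lo <= addr < hi for lo, hi in ranges)


def find_unit(units, addr):
    for unit in units:
        ranges = unit.cu_ranges
        if ranges is None:
            # Assume the producer skipped the unit-level ranges and go searching.
            ranges = unit.load_child_ranges()
        if contains(ranges, addr):
            return unit
    return None


primary = Unit("primary.cpp", [(0x1000, 0x2000)], [(0x1000, 0x2000)])
imported = Unit("imported.cpp", None, [])   # ThinLTO-imported unit with no code
hit = find_unit([imported, primary], 0x1800)
```

In this model the imported unit contributes no addresses, yet the lookup still pays for one `load_child_ranges` call on it before reaching the primary unit.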

So, anyone have opinions on whether we should

a) decide that a unit without ranges covers no ranges - and don’t do the search

b) emit zero-length ranges on any unit that has no code ranges (low/high pc zero? Could pick anything, but that seems the most obvious)
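A rough sketch of how the two options differ, under the same simplification that `None` means "no ranges on the unit DIE". The function names are hypothetical, and the `(0, 0)` placeholder is just the low/high pc zero idea from (b), not a settled encoding.

```python
EMPTY_RANGE = (0, 0)  # assumed placeholder for option (b): low_pc == high_pc == 0


def needs_child_search_today(cu_ranges):
    # Existing consumer behavior: missing unit ranges trigger the DIE search.
    return cu_ranges is None


def needs_child_search_option_a(cu_ranges):
    # Option (a): the consumer treats missing unit ranges as "covers nothing".
    return False


def producer_ranges_option_b(code_ranges):
    # Option (b): the producer always emits something on the unit,
    # falling back to a zero-length range when the unit has no code.
    return list(code_ranges) if code_ranges else [EMPTY_RANGE]


def covers(cu_ranges, addr):
    # Half-open ranges, so a zero-length range never matches any address.
    return any(lo <= addr < hi for lo, hi in (cu_ranges or []))


no_code_unit = producer_ranges_option_b([])
```

With (b) an unchanged consumer sees ranges on the unit and skips the search, while with (a) the producer stays as it is and the consumer changes.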

Thanks,

- Dave

Are there compilers that do this ("forget" to emit ranges) that we care to support with llvm-symbolizer?

-- adrian

I’m not specifically aware of any, though haven’t gone looking.

Just in case this wasn’t obvious in the sub-text:
I think we should figure out whether this assumption in llvm-symbolizer is actually needed to support a compiler we care about and then potentially remove it, or enforce it only when the CU is < DWARF 5 or something like that.
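As a sketch of the "only when the CU is < DWARF 5" variant: the function name is hypothetical, `None` again models a unit DIE with no ranges, and the version cutoff is only the "or something like that" guess from above.

```python
def should_search_children(dwarf_version, cu_ranges):
    # Keep the old fallback only for older producers that might have
    # skipped unit-level ranges; trust DWARF 5+ units as written.
    return cu_ranges is None and dwarf_version < 5


legacy = should_search_children(4, None)   # old unit, no ranges: still search
modern = should_search_children(5, None)   # DWARF 5 unit, no ranges: covers nothing
```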

-- adrian

Yeah, fair - I’ll give it a week or something, see if Paul or anyone else has ideas about why the existing behavior might be useful before I remove it.

Looks like I argued (& then tested) previously for support for the case where the CU has no ranges, but sub-DIEs do: http://lists.llvm.org/pipermail/llvm-dev/2017-November/119131.html

(Just for the record, support for CU ranges was implemented in LLVM in r197776, December 2013, and shortly after that became the default in r203968, March 2014, in the 3.5 release. Looks like GCC got this somewhere between GCC 4.1 and GCC 4.4 according to godbolt testing, so on/before March 2012 I think.)

So I’ve gone ahead and committed this change in r349333 - open to further discussion, reverting it, etc.

- Dave

Seems reasonable to me.

--paulr