Speaking of referring to type stubs, we would end up referring to a type tag that has DW_AT_declaration set to true. The .debug_names spec states:
All non-defining declarations (that is, debugging information entries with a DW_AT_declaration attribute) are excluded.
The .debug_names section should point us to the real declaration, and if it can't, just don't add it to the .debug_names.
Just do not output type units into .debug_names at all. Consumers can parse all of the type units and use type_stubs in .debug_names to find the correct type information. The question for LLDB folks is how much this would degrade the usefulness of .debug_names for them.
LLDB keeps track of which DWARF units are included in .debug_names by checking the CUs and TUs, and anything in the debug info that isn't mentioned in any .debug_names table will get indexed manually. If it is really hard to add correct and full indexing of type units, it would be a good start to just emit a .debug_names table for the CUs and leave all indexing of TUs (local and foreign) out for now, but it would be great to at least get something when type units are enabled.
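The coverage check described above can be sketched roughly as follows. This is only an illustration, not LLDB's actual code: the `Unit` type, the `units_needing_manual_index` helper, and the offsets are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Unit:
    offset: int         # offset of the unit header in .debug_info (hypothetical model)
    is_type_unit: bool  # True for a TU, False for a CU


def units_needing_manual_index(all_units, indexed_cu_offsets, indexed_tu_offsets):
    """Return the units that no .debug_names CU list or TU list mentions."""
    covered = set(indexed_cu_offsets) | set(indexed_tu_offsets)
    return [u for u in all_units if u.offset not in covered]


# One CU and two TUs; the index only lists the CU.
units = [Unit(0x0, False), Unit(0x40, True), Unit(0x90, True)]
missing = units_needing_manual_index(units, indexed_cu_offsets=[0x0], indexed_tu_offsets=[])
```

With a CU-only table, both type units fall through to manual indexing, which is the "good start" behavior described above.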
LLDB wants full indexing of the information within the type units, so what we don't want is to end up with a single .debug_names entry for a type that only points to the main type in the type unit. If we end up with a solution like this, LLDB will detect that we have an entry for the TU and it won't manually index it to discover the types contained within the type in the type unit. For example, all STL collection classes define many types inside the class itself, such as "iterator", "const_iterator", "reverse_iterator", etc. If no one ever creates a local variable that uses any of these types, then their only definitions can be inside the type unit. We really need to be able to find these types when evaluating expressions, so the type units need to be fully indexed.
So I would vote to avoid adding any entries into .debug_names that point to the type stubs in the compile unit output, as these violate the definition of what is to be contained in the .debug_names table. They aren't really useful either, as LLDB will still need to manually index the type unit and we will end up ignoring these entries anyway.
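To make the two requirements above concrete (index every nested type in a type unit, and skip declaration stubs), here is a minimal sketch. The dictionary-based DIE model and the `index_unit` helper are assumptions made for illustration; they do not correspond to any LLVM or LLDB API.

```python
def index_unit(die, names=None):
    """Collect the names of all defining DIEs in a tree, including nested ones.

    `die` is a hypothetical dict with optional keys:
    "name", "declaration" (mirrors DW_AT_declaration), and "children".
    """
    if names is None:
        names = []
    if die.get("declaration"):
        # Non-defining declaration (a type stub): excluded from the index.
        return names
    if "name" in die:
        names.append(die["name"])
    for child in die.get("children", []):
        index_unit(child, names)
    return names


# A type unit for a collection class with nested typedefs.
type_unit = {
    "name": "vector",
    "children": [{"name": "iterator"}, {"name": "const_iterator"}],
}
# The stub left behind in the compile unit.
cu_stub = {"name": "vector", "declaration": True}

full_index = index_unit(type_unit)
stub_index = index_unit(cu_stub)
```

In this model the type unit contributes entries for the class and its nested types, while the stub in the compile unit contributes nothing.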
Each module will have type unit information. When the linker de-duplicates type units using COMDAT, it can put a tombstone value into the TU list for equivalent TUs (same hash), or the address of the section that was kept if they are bit identical. If it is a tombstone value, how would LLDB handle such a case when it searches for type units? Would it have to re-parse everything anyway?
What would actually get tombstoned here? The TU offset in the TUs table in the header of the .debug_names table? Or the DIE offset in the actual entry itself? LLDB can be taught to ignore either of these entries, so it shouldn't be a problem. It will just make lookups slower, as we will need to sift through dead entries, but we already do this for functions, so it can easily be added if needed.
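Since the thread does not settle which field would be tombstoned, the sketch below skips an entry in either case. The sentinel value `TOMBSTONE`, the tuple layout of the entries, and the `live_entries` helper are all hypothetical.

```python
TOMBSTONE = 0xFFFFFFFF  # assumed sentinel; the real value is not specified here


def live_entries(entries, tu_list):
    """Yield (tu_offset, die_offset) for entries that survived de-duplication.

    `entries` is a list of (tu_index, die_offset) pairs and `tu_list` maps a
    TU index to a TU offset; both are simplified stand-ins for the real tables.
    """
    for tu_index, die_offset in entries:
        tu_offset = tu_list[tu_index]
        if tu_offset == TOMBSTONE:
            continue  # the TU list slot was tombstoned
        if die_offset == TOMBSTONE:
            continue  # the entry's own DIE offset was tombstoned
        yield tu_offset, die_offset


tu_list = [0x40, TOMBSTONE]
entries = [(0, 0x52), (1, 0x52), (0, TOMBSTONE)]
found = list(live_entries(entries, tu_list))
```

The cost is the one mentioned above: every lookup still walks the dead entries before discarding them.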
LLDB will actually probably still end up manually indexing any type units that don't have full indexes.
Smart linker approach. We put TU entries into their own section and make it depend on the TU section (like a relocation section tied to the section it is applied to). When the linker drops a TU section, it also drops the corresponding .debug_names.tu section. From the remaining sections it will construct the TU list. I am not sure how this will work with the whole bucket concept, though, so we might end up with one TU per module.
Could we create a separate, fully indexed .debug_names section for each TU that references the current TU in the .o file, tie this section to the COMDAT section for the TU contents, and not emit either of them if it should be removed?
I like any solution that is a bit smarter about what we emit over a solution that just concatenates and relocates/tombstones things.
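The smart-linker idea discussed above can be sketched as follows: each type unit travels with its own index fragment, and dropping a duplicate TU drops its fragment too. This is a toy model under stated assumptions; `link_type_units` and the input layout are hypothetical and do not describe how any real linker handles COMDAT groups.

```python
def link_type_units(objects):
    """Merge per-object type units, keeping one copy per type signature.

    `objects` is a list of dicts mapping a TU signature to a pair
    (tu_bytes, index_fragment). The first copy of each signature wins and
    its index fragment is kept with it; later copies are dropped together
    with their fragments.
    """
    kept = {}
    for obj in objects:
        for signature, (tu_bytes, index_fragment) in obj.items():
            if signature not in kept:
                kept[signature] = (tu_bytes, index_fragment)
    # The TU list is built only from the surviving units.
    tu_list = list(kept)
    merged_index = [name for _, fragment in kept.values() for name in fragment]
    return tu_list, merged_index


objects = [
    {0xAAAA: (b"tu-a", ["vector", "iterator"])},
    {0xAAAA: (b"tu-a", ["vector", "iterator"]), 0xBBBB: (b"tu-b", ["map"])},
]
tu_list, merged_index = link_type_units(objects)
```

Because nothing is left behind for the dropped copies, the merged index contains no dead entries to tombstone. How the merged names would be redistributed into hash buckets is left open, as in the discussion.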
For split-dwarf:
- Use DW_IDX_type_unit and DW_IDX_compile_unit on the assumption that it will all remain in .dwo files. Consumers will need to ignore it when a DWP is present, much like what they do now anyway.
Shouldn't there be a .debug_names section in the main executable that has everything relocated and correct for the contents of the .dwp file? Or is there a .debug_names section in the main executable where each CU or TU has offsets that are only valid for the original .dwo file? Not sure how this works.