[RFC] Delay definition DIE searching when parse a declaration DIE for record type

Currently, lldb eagerly searches for definition DIE when parsing a declaration DIE for struct/class/union/enum types: 1, 2. It will search for all definition DIEs with the same unqualified name (just DW_AT_name ) and then find out those DIEs with same fully qualified name. Then lldb will try to resolve those DIEs to create the Types from definition DIEs. It works fine most time.

However, when built with -gsimple-template-names, the search graph expands very quickly, because for the specialized-template classes, they don’t have template parameter names encoded inside DW_AT_name. They have DW_TAG_template_type_parameter to reference the types used as template parameters. In order to identify if a definition DIE matches a declaration DIE, lldb needs to resolve all template parameter types first and those template parameter types might be template classes as well, and so on… So, the search graph explodes, causing a lot unnecessary searching/type-resolving to just get the fully qualified names for a specialized-template class. This causes lldb stack overflow for us internally on template-heavy libraries. To solve this problem, @dblaikie is working on constructing the simplified template class names without searching for the definition DIEs, which could avoid those unnecessary searching and type-resolving.

Suppose @dblaikie’s patch is in, lldb’s performance will be improved. I think it’s still worthwhile to delay the definition DIE searching when just parsing a declaration DIE and only search for the definition DIE when asked to complete the type: 3. The rational for this is that searching for the definition DIE when parsing a declaration DIE isn’t necessary (correct me if this is incorrect). All it does is just creating the skeleton record decl and lldb_private::Type from the definition DIE. We could do the same with a declaration DIE.

The basic idea is:

  1. Constructing the record decl and Type when parsing a struct/class/union/enum DIE for the first time, even if it’s just a declaration DIE. Use a map to record its unique name (plus declaration location and byte size if it’s a definition DIE) to this DIE and the Type created from it. (This map is roughly the same as the existing UniqueDWARFASTTypeMap 4)

  2. With the map construct in step 1, the next time parsing record DIE, we can check using the unique name (+ decl location and size if it’s definition DIE) if we have already created record decl and Type for the same user-defined type. If so, we just add this DIE to the set of DIEs representing the same type. If the DIE is a definition, record it so we don’t need to search for definition DIE again later.

  3. With that, when SymbolFileDWARF is asked to complete a type, it will starts to search for definition DIE if we haven’t parsed it yet to complete it. That means when we are parsing a declaration DIE, we won’t be able to inform clang that this decl’s definition starts 5, I’m not sure if we delay this to SymbolFileDWARF::CompleteType, will SymbolFileDWARF::CompleteType even been called to complete the type?

CC @Michael137 @labath @dblaikie @adrian.prantl

To step back.
From what I remember this was originally introduced because of fundamental limitation of DWP because of 32bit addressing.
For DWARF6 I think proposal was accepted to allow 64 bit support.
Maybe we should revisit enabling this early in LLVM tools?
This will solve another issue where we have to re-construct cu-index.

How does this relate to 32 vs 64 bits addressing in DWARF?

Unless I am miss remembering -gsimple-template-names was introduced to reduce .debug_str.dwo size in a DWP file because it was overflowing due to 32bit limitation in .debug_str_offsets.dwo

I think -gsimple-template-names is good in general in terms of saving debug info size even if we haven’t encountered overflowing issues.

Delaying definition DIE searching also helps non-template types.

1 Like

CC @clayborg

I had similar questions w.r.t. FindDefinitionTypeForDWARFDeclContext back when we were investigating [RFC] Improve Dwarf 5 .debug_names Type Lookup/Parsing Speed - #17 by Felipe which we narrowed down to the simple-template-name support as well. But the combination of Greg’s TypeQuery API and Felipe’s DW_IDX_parent fixed the performance issues we were seeing (partly because we now aren’t doing the extra simple-template-name lookups anymore on Darwin).

Haven’t looked into your proposal in detail yet, but one problem with the lazy completion mechanism of records is that we don’t consistently complete the definitions in time that Clang needs them, and those cases have been quite hard to debug/fix in the past (in fact we’ve been experimenting with completing definitions more aggressively than we currently are). So we should be mindful of that when adding more complexity to the completion logic. Do you have any idea how much this would speed things up (with @dblaikie’s patch applied)?

Curiosity got the better of me so I went back in time to see when this logic was introduced (might not be of much use since LLDB changed a lot since then):

Thanks for the reply.

I think the change of delaying definition DIE searching only affects SymbolFileDWARF plugin internally, only searching for the definition DIE if it tries to complete the Type created from a declaration DIE, and should be invisible outside it (it doesn’t delay type completion). I’m currently working on a prototype of it. I’m not familiar how/when clang will ask lldb to complete a type. I assume those should work as before.

For me, the main question is what’s the impact of not calling this piece of code.

We call that when constructing a forward declaration of a class if we have found a matching definition. If we don’t look for a definition, we won’t know whether it exists, and so we won’t cant call TypeSystemClang::StartTagDeclarationDefinition (I don’t think calling it unconditionally is an option, as StartTagDeclarationDefinition commits us to later finishing the definition.)

I don’t know what’s the impact of not calling that function. Maybe it’s fine. Maybe it means some code needs to change to operate under different assumptions. Maybe the only way to find out is to try doing it…

This is the approach I was going to implement as a fix, so the solution described in first message of this thread is solid and will work well.

To re-iterate the solution I was thinking of to make sure they are the same:

  • Always create the type using declarations DIEs or the actual DIE so that we can avoid expensive searches for the real definition. Keep turning on the external AST ability to complete this type so when and if we need a full definition, we will call the SymbolFileDWARF::CompleteType() when we need it completed
  • In SymbolFileDWARF::CompleteType(), we will do the search for the full definition and complete the type if we find one.

This should avoid the infinite loops as we will construct the class definition without doing the expensive type searches. The only downside is we now will be always creating types that claim they can complete themselves via the external AST support, so if we have a lot of forward declaration only types, we can end up doing more work.

We might be able to fix this by checking if the accelarator tables have any entries for this exact type and only enabling the external AST support if we do find some matching entries. So we don’t do a full type search, we just ask if the accelerator tables if they have something for this type before enabling the completion stuff.

Yes, that’s also what I have in mind basically.

The only downside is we now will be always creating types that claim they can complete themselves via the external AST support, so if we have a lot of forward declaration only types, we can end up doing more work.

I don’t quite get this. Is it because with this change, the types created from declaration DIEs will also claim to have external storage? With external storage being set, Clang will ask lldb to complete those types. If there’s no definition DIEs for them, we just force type completion as we are currently doing.

Sent a prototype here: [lldb][DWARF] Delay struct/class/union definition DIE searching when parsing declaration DIEs. by ZequanWu · Pull Request #90663 · llvm/llvm-project · GitHub

The only downside is we now will be always creating types that claim they can complete themselves via the external AST support, so if we have a lot of forward declaration only types, we can end up doing more work.

I don’t quite get this. Is it because with this change, the types created from declaration DIEs will also claim to have external storage? With external storage being set, Clang will ask lldb to complete those types. If there’s no definition DIEs for them, we just force type completion as we are currently doing.

Actually right now if we only have a forward declaration (if we search for a real definition and don’t find one), we don’t mark it as being able to be completed via external AST and these types end up just being forward declarations that will never be completed. I don’t believe those types end up doing a “start definition” followed by an “end definition”. We only do that for types that require completion, like if “class A” inherits from “class B” and there is no definition for “class B”, we must complete class B and make it empty so clang doesn’t assert and kill the program, for are considered “forcefully completed types”.

Yeah, I get this.

In my prototype above, I change it so that we just always mark types as being able to be completed via external AST with m_ast.SetHasExternalStorage(clang_type.GetOpaqueQualType(), true);. When the complete type is required, it searches for definition DIE, if failed, SymbolFileDWARF::CompleteType will be no-op and return false. lldb won’t start definition for it. Is setting types created from declaration DIEs as being able to be completed with external AST causing slowness?

Here’s an example of binary built without -gsimple-template-names which could also benefit from this change:

main.cpp

namespace N1 {
  struct S;
}
template<typename T>
struct Foo {};

Foo<N1::S> foo;
int main() {

}

namespace.cpp

namespace N1 {
struct S {};
}

N1::S s;

Compile with clang++ -g main.cpp namespace.cpp and test with lldb a.out -o "log enable dwarf comp" -o "b main" -o "r" -o "p foo" -o "p s"

Current lldb logging:

(lldb) p foo
lldb             (x86_64) /tmp/a.out: DWARFASTParserClang::ParseTypeFromDWARF (die = 0x0000000000000040, decl_ctx = 0x000055F3D2C378A0 (die 0x000000000000000c)) DW_TAG_subprogram name = 'main')
lldb             (x86_64) /tmp/a.out: DWARFASTParserClang::ParseTypeFromDWARF (die = 0x000000000000004f, decl_ctx = 0x000055F3D2C378A0 (die 0x000000000000000c)) DW_TAG_base_type name = 'int')
lldb             (x86_64) /tmp/a.out: DWARFASTParserClang::ParseTypeFromDWARF (die = 0x000000000000002e, decl_ctx = 0x000055F3D2C378A0 (die 0x000000000000000c)) DW_TAG_structure_type name = 'Foo<N1::S>')
lldb             (x86_64) /tmp/a.out: DWARFASTParserClang::ParseTypeFromDWARF (die = 0x000000000000003d, decl_ctx = 0x000055F3D2C381E8 (die 0x000000000000003b)) DW_TAG_structure_type name = 'S')
lldb             (x86_64) /tmp/a.out: SymbolFileDWARF(0x000055F3D2C215E0) - 0x000000000000003d: DW_TAG_structure_type type "S" is a forward declaration, trying to find complete type
lldb             (x86_64) /tmp/a.out: SymbolFileDWARF::FindDefinitionTypeForDWARFDeclContext(tag=DW_TAG_structure_type, name='S')
lldb             (x86_64) /tmp/a.out: SymbolFileDWARF::FindDefinitionTypeForDWARFDeclContext(tag=DW_TAG_structure_type, name='S') trying die=0x000000000000007f (N1::S)
lldb             (x86_64) /tmp/a.out: DWARFASTParserClang::ParseTypeFromDWARF (die = 0x000000000000007f, decl_ctx = 0x000055F3D2C381E8 (die 0x000000000000007d)) DW_TAG_structure_type name = 'S')
lldb             (x86_64) /tmp/a.out: SymbolFileDWARF(0x000055F3D2C215E0) - 0x000000000000003d: DW_TAG_structure_type type "S" is a forward declaration, complete type is 0x0000007f
lldb             (x86_64) /tmp/a.out: 0x0000002e: DW_TAG_structure_type 'Foo<N1::S>' resolving forward declaration...
(Foo<N1::S>)  {}
(lldb) p s
(N1::S)  {}

Notice that it’s trying to find definition DIE for N1::S and create type from it when printing foo whose type is Foo<N1::S>. But it’s not actually being used at this moment. This is extra unnecessary work.
Snippet of the debug info:

0x0000007d:   DW_TAG_namespace
                DW_AT_name      ("N1")

0x0000007f:     DW_TAG_structure_type
                  DW_AT_calling_convention      (DW_CC_pass_by_value)
                  DW_AT_name    ("S")
                  DW_AT_byte_size       (0x01)
                  DW_AT_decl_file       ("/tmp/namespace.cpp")
                  DW_AT_decl_line       (2)

With this change, lldb only search for definition DIE when p s and that’s the time when we want to complete the type N1::S and searching for its definition DIE:

(lldb) p foo
lldb             (x86_64) /tmp/a.out: DWARFASTParserClang::ParseTypeFromDWARF (die = 0x0000000000000040, decl_ctx = 0x000055632AA21570 (die 0x000000000000000c)) DW_TAG_subprogram name = 'main')
lldb             (x86_64) /tmp/a.out: DWARFASTParserClang::ParseTypeFromDWARF (die = 0x000000000000004f, decl_ctx = 0x000055632AA21570 (die 0x000000000000000c)) DW_TAG_base_type name = 'int')
lldb             (x86_64) /tmp/a.out: DWARFASTParserClang::ParseTypeFromDWARF (die = 0x000000000000002e, decl_ctx = 0x000055632AA21570 (die 0x000000000000000c)) DW_TAG_structure_type name = 'Foo<N1::S>')
lldb             (x86_64) /tmp/a.out: DWARFASTParserClang::ParseTypeFromDWARF (die = 0x000000000000003d, decl_ctx = 0x000055632AA21EB8 (die 0x000000000000003b)) DW_TAG_structure_type name = 'S')
lldb             (x86_64) /tmp/a.out: 0x0000002e: DW_TAG_structure_type 'Foo<N1::S>' resolving forward declaration...
(Foo<N1::S>)  {}
(lldb) p s
lldb             (x86_64) /tmp/a.out: DWARFASTParserClang::ParseTypeFromDWARF (die = 0x000000000000007f, decl_ctx = 0x000055632AA21EB8 (die 0x000000000000007d)) DW_TAG_structure_type name = 'S')
lldb             (x86_64) /tmp/a.out: 0x0000007f: DW_TAG_structure_type 'S' resolving forward declaration...
(N1::S)  {}