>
> Hey all,
>
> When running lldb tests with clang, on non-Apple platforms, we really
> need this flag set:
> -fstandalone-debug
>
> We think this is a bug in LLDB, and would *really* like to see it fixed.
> gdb works fine without this flag, and leaving it off *drastically* reduces
> the amount of debug info stored in object files. My understanding is that
> the debug info that LLDB needs is there, but it's in a different
> compilation unit.
There are indeed some cases where this is true and LLDB does need to get
better about being able to find full definitions of classes that aren't
fully defined in some places, but it can also mean some class definitions
might never be available. Also, if we get one shared library that has a
definition for class A which inherits from B and we try to write an
expression that uses a version of A from another shared library, and one
library has a full definition and another doesn't, then we run into
problems. The main issue is
>
> Usually the missing information is type information about a class with a
> vtable where the first virtual method (aka the key function) is defined in
> a different TU. That TU will emit the vtable and all type information for
> that class.
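
To make the key-function case concrete, here is a minimal single-file sketch. The file names in the comments (shape.h, shape.cpp, user.cpp) and the class names are hypothetical, chosen only for illustration; in a real build the three parts would live in separate files. As described above, the TU that defines the key function is the one expected to carry the vtable and the full type information when -fstandalone-debug is off.

```cpp
#include <cassert>
#include <string>

// ---- shape.h (hypothetical header, included by many TUs) ----
struct Shape {
    virtual ~Shape() = default;              // defaulted in-class, so inline
    virtual std::string name() const;        // first non-inline, non-pure
                                             // virtual: the key function
    virtual int sides() const { return 0; }  // defined in-class, so inline
};

// ---- shape.cpp (hypothetical: the one TU that defines the key function) ----
// This is the TU that emits Shape's vtable. Per the thread, it is also the
// TU that gets Shape's full debug info when -fstandalone-debug is off.
std::string Shape::name() const { return "shape"; }

// ---- user.cpp (hypothetical: only uses Shape through the header) ----
// Without -fstandalone-debug, this TU may describe Shape only as a
// declaration, so a debugger has to find the definition elsewhere.
struct Square : Shape {
    std::string name() const override { return "square"; }
    int sides() const override { return 4; }
};

std::string describe(const Shape& s) {
    return s.name() + ":" + std::to_string(s.sides());
}
```

If shape.cpp ends up in a different shared library, or is never built with debug info, the debugger looking at user.cpp has no full definition of Shape to work with.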
This isn't always true. We have kernel sources here at Apple where header
files define base classes, but these base classes are never compiled in
their own TU. This means we end up with just a forward declaration for this
other class and we never get it fully defined. I consider this a compiler
bug when I must be able to find debug info for shared library "B" in order
to be able to debug shared library "A".
Is gdb able to debug in this situation, so long as you don't look inside
the base class? I suppose this gets to the core issue, which is that LLDB
is trying to build a full-fledged Clang AST, which requires a definition of
the base class, and now we've come full circle: either the compiler needs
to accept an AST with less information, or it needs to produce all the
debug info it will need. Definitely worth thinking about. It might be
easier to solve this by accepting incomplete base class types on the clang
side.
> That said, it's fine if LLDB has to add the flag as a short-term way to
> stabilize the test suite. I just want to make sure we're on the same page
> here: this is probably an LLDB bug, not a Clang bug.
I agree there are bugs in LLDB. I also don't like the compiler just
assuming debug info will be available from somewhere else. It isn't always
available, and we do have code that proves that here at Apple, which is
why -fstandalone-debug is enabled by default on Apple platforms.
I believe the real solution is to do proper type uniquing in the compiler
and linker. This is definitely a hard problem to solve correctly without
increasing .o file size, but it is an effort I do believe is well worth it.
A few possible solutions:
1 - if the compiler uses precompiled headers, emit all types from the
precompiled headers with their full definitions once into a standalone PCH
DWARF file. Then refer to these full types in this external file with new
DWARF tags. This keeps the .o files small, and would keep the types
properly organized inside a DW_TAG_compile_unit for each header file. When
the linker links the final executable and is going to make the DWARF for
the linked executable, it will copy the DWARF for the precompiled header
over into the final binary, and then link each .o file and fix up any
external type references to the types in the PCH DWARF. Then we get full
debug info with no type duplication whatsoever. The downside is that this
requires PCH.
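
The link-time fixup in option 1 can be pictured with a toy model. This is only a sketch of the idea, not a real DWARF or linker interface: the names PchTypes, ObjectDebugInfo and resolveTypeRefs are hypothetical, and plain strings stand in for DWARF type descriptions and for the proposed external type references.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical: full type definitions emitted once, alongside the PCH.
// Maps a type name to a stand-in for its full DWARF description.
using PchTypes = std::map<std::string, std::string>;

// Hypothetical: per-object debug info that only *names* the types it uses
// instead of carrying their definitions.
struct ObjectDebugInfo {
    std::string name;
    std::vector<std::string> externalTypeRefs;
};

// Link-time fixup: look up every external reference in the PCH type table.
// Returns false if any reference has no definition; on success, `resolved`
// holds one definition per reference, in order.
bool resolveTypeRefs(const PchTypes& pch, const ObjectDebugInfo& obj,
                     std::vector<std::string>& resolved) {
    resolved.clear();
    for (const std::string& ref : obj.externalTypeRefs) {
        PchTypes::const_iterator it = pch.find(ref);
        if (it == pch.end()) return false;  // dangling reference
        resolved.push_back(it->second);
    }
    return true;
}
```

In this model each type definition exists exactly once, in the PCH table, and the object files stay small because they carry only references.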
It's an interesting idea. I wonder if MSVC's PCH step adds type info to
the PDB. Unfortunately, PCH is kind of a non-starter for Google's usage.
2 - emit full types for everything in the .o files and then have a quick
way to unique them. DWARF has a .debug_types section that does this using
special ELF sections that the linker knows how to unique, but I don't
really like the DWARF output for this as it spreads everything into
separate sections. This also makes the .o files much larger as they all
carry full type info.
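
The uniquing step in option 2 can be sketched as a lookup keyed by type signature. This is a simplified model under stated assumptions: TypeUnit, ObjectFile, uniqueTypes and totalTypeUnits are hypothetical names, and strings stand in for both the signature and the type description that a real .debug_types unit would carry.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for a type unit in an object file.
struct TypeUnit {
    std::string signature;   // stand-in for the type's signature
    std::string definition;  // stand-in for the full type description
};

using ObjectFile = std::vector<TypeUnit>;

// Keep the first definition seen for each signature and drop later
// duplicates, roughly what a linker does when it uniques these sections.
std::map<std::string, std::string>
uniqueTypes(const std::vector<ObjectFile>& objects) {
    std::map<std::string, std::string> unique;
    for (const ObjectFile& obj : objects)
        for (const TypeUnit& tu : obj)
            unique.insert(std::make_pair(tu.signature, tu.definition));
    return unique;  // insert() never overwrites an existing key
}

// Number of type units the object files carry before uniquing.
std::size_t totalTypeUnits(const std::vector<ObjectFile>& objects) {
    std::size_t n = 0;
    for (const ObjectFile& obj : objects) n += obj.size();
    return n;
}
```

The gap between totalTypeUnits and the size of the uniqued map is the cost noted above: the linked image ends up with one copy per type, but every .o file still carries its own full copy going into the link.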
Yep, I believe we currently do this at Google to reduce the size of the
final linked image. I don't know which flags control it. But we also need
-fno-standalone-debug because if the object files are too large, the link
step will overflow the memory quota and die. That's why this is important
to us: without this cleverness, programs actually fail to link.
In theory, smaller object files and faster links are useful to all the
other consumers of Clang, so all this work has been upstream and
on-by-default, but obviously it isn't working for LLDB.
Anyway, we should try to figure something out. I understand if you're not
interested in pursuing this work, I just hope that patches to make LLDB
smarter about this are welcome, and that we can help out as necessary on
the Clang side.