clang CL breaks LLDB std::string printing

Hi David,

There are some LLDB tests that have been failing against clang-3.6 for a long time and just started failing in clang-3.5 when my Ubuntu distro updated clang-3.5.

I tracked it back to a clang CL that you submitted nearly a year ago.

This test passes when compiling with gcc 4.8.2 and clang-3.5 before this CL. I’m very new to the project and I don’t really understand what’s going on here. Any guidance you can offer would be very much appreciated.

Thanks,

Vince

CodeGenDebugInfo.diff (6.88 KB)

Which tests?

TestDataFormatterStdString.py StdStringDataFormatterTestCase.test_with_dwarf_and_run_command

And at least a couple more of the following (maybe all):

TestCallStdStringFunction.py
TestDataFormatterSkipSummary.py
TestDataFormatterStdIterator.py
TestDataFormatterStdList.py
TestDataFormatterStdString.py
TestSBValuePersist.py
TestStringPrinter.py
TestTypeCompletion.py

Short answer is that you're probably missing the debug build of your
standard library (libstdc++-XXX-dbg).

Long answer: Compilers (both GCC & Clang) try to optimize debug info size
by relying on the existence of debug info (especially for types) in other
files. They use various signals to make this assumption - both Clang and
GCC use vtables as one signal (if a type is dynamic, only emit the debug
info for the definition of the type wherever the vtable is emitted). Beyond
that, Clang also uses explicit template instantiation declarations as a
signal (if there's an explicit template instantiation declaration,
only emit the full definition of the type where the explicit instantiation
definition is).

This allows the compilers to omit definitions in many cases, in favor of
them being emitted in one or a small handful of places, reducing debug info
size and linker input size.

In the case of std::basic_string<char>, libstdc++ (& other standard library
implementations, I'd imagine) has an explicit instantiation declaration to
avoid the compiler having to do all the work of instantiating
basic_string<char> in every translation unit. The explicit instantiation
definition is in the standard library objects (static or dynamic) and
that's where the debug info for the type is. If you don't install the debug
build of your standard library, you won't have the debug info definition of
std::basic_string<char>.
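To make the explicit instantiation signal concrete, here is a minimal single-file sketch. The type `basic_text` and the function `text_length` are hypothetical stand-ins, not libstdc++ code; in a real library the three parts would live in a header, one library source file, and user code respectively.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// --- "header" part: a template plus an explicit instantiation declaration ---
template <typename CharT>
struct basic_text {
    std::basic_string<CharT> data;
    std::size_t length() const { return data.size(); }
};

// Explicit instantiation declaration: promises that basic_text<char> is
// instantiated in some other translation unit. Without -fstandalone-debug,
// Clang may treat this as a signal to emit only a declaration of the type
// in this unit's debug info.
extern template struct basic_text<char>;

// --- "library source" part: the explicit instantiation definition ---
// Normally this line sits in exactly one library .cpp file, and the full
// debug info for basic_text<char> is emitted alongside it.
template struct basic_text<char>;

// --- "user code" part ---
std::size_t text_length(const char *s) {
    assert(s != nullptr);
    basic_text<char> t{s};
    return t.length();
}
```

If the "library source" part is built without debug info, user code compiled this way has only a declaration of the type to show the debugger, which is the situation described above for std::basic_string<char>.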

Hope that helps,

- David

I really should write a blog post about this...

Also here’s a related excerpt from the clang man page:

-fstandalone-debug -fno-standalone-debug
Clang supports a number of optimizations to reduce the size of debug information in the binary. They work based on the assumption that
the debug type information can be spread out over multiple compilation units. For instance, Clang will not emit type definitions for
types that are not needed by a module and could be replaced with a forward declaration. Further, Clang will only emit type info for a
dynamic C++ class in the module that contains the vtable for the class.

The -fstandalone-debug option turns off these optimizations. This is useful when working with 3rd-party libraries that don’t come with
debug information. This is the default on Darwin. Note that Clang will never emit type information for types that are not referenced at
all by the program.

– adrian

Note that this is also the default on FreeBSD. This might be an
important point when comparing test results on FreeBSD and Linux since
they otherwise share a lot of attributes.

-Ed

+1

I'd like to add that simply installing libstdc++-dbg will not solve this
problem currently as lldb has trouble loading symbols from splitdebug
files (I am working on this currently, <http://reviews.llvm.org/D7913>
should help, but apparently is not sufficient). OTOH, adding
-fstandalone-debug to C(XX)FLAGS does make the problem go away, and it
could be something we might want to enable by default in test cases (to
reduce dependencies on the test environment). We could then probably
disable it when testing splitdebug handling specifically.
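As a rough illustration of the workaround, a small test program could be built as sketched below. The file name and the function `make_greeting` are hypothetical; only the `-g` and `-fstandalone-debug` flags come from this thread and the clang man page.

```cpp
// Hypothetical build line for this file:
//   clang++ -g -fstandalone-debug string_demo.cpp -o string_demo
// With -fstandalone-debug the full type definition of std::string is
// emitted into this program's own debug info, so inspecting 'result' in a
// debugger should not depend on a separate libstdc++ debug package.
#include <cassert>
#include <string>

std::string make_greeting(const std::string &name) {
    assert(!name.empty());
    std::string result = "hello, " + name;
    return result;
}
```

The trade-off is larger debug info in every object file that uses the type.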

pl

I’d like to add that simply installing libstdc++-dbg will not solve this problem currently as lldb has trouble loading symbols from splitdebug files (I am working on this currently, <http://reviews.llvm.org/D7913> should help, but apparently is not sufficient).

I couldn’t get it working this way either. Thanks for looking into this Pavel!

OTOH, adding -fstandalone-debug to C(XX)FLAGS does make the problem go away, and it could be something we might want to enable by default in test cases (to reduce dependencies on the test environment).

In most cases, I strongly prefer ease of use to completely optimal. Even if we fix splitdebug files, there is still the possibility of someone not having debug symbols readily available or third party libraries without symbols that we would like to support. I recommend that we follow OSX and FreeBSD and enable -fstandalone-debug by default on Linux and Android.

People who really know what they’re doing can experiment with -fno-standalone-debug and use it if it works for them.

Thoughts?

Vince

I recommend telling the compiler not to do this (set -fstandalone-debug in OTHER_CFLAGS) if you want to debug with LLDB. If you don't, you can get incomplete DWARF for the following code:

class A : public B
{
};

Where the compiler says "no one used 'B' so I am going to emit a forward declaration for 'B'". Then LLDB tries to make class 'A' in a clang AST context and then we try to parse 'B' so we can have 'A' inherit from it, and clang of course would assert and kill LLDB (if we actually tried to give clang a class 'B' that is a forward declaration as a base class), so LLDB has to lie to keep clang from asserting and just say that 'B' is a class that contains nothing.

The idea was that someone will certainly declare 'B' somewhere in your current source base. This mostly holds true, but if you have a header file from a shared library that has a C++ class that people might inherit from (like we do in Darwin Kernel Extensions), we end up with a class we use for debugging that isn't allowed to see any ivars from 'B', nor call any functions declared inside 'B' or any of its subclasses (because we told clang 'B' has no contents when creating the type in the clang AST). So we default to -fstandalone-debug for all of Darwin to avoid this.

Greg Clayton

I'd like to add that simply installing libstdc++-dbg will not solve this
problem currently as lldb has trouble loading symbols from splitdebug files

What do you mean by splitdebug files? If you mean -gsplit-dwarf/Fission I'm
not sure why that would be relevant - libstdc++-dbg isn't built with
Fission so far as I know.

(I am working on this currently, <http://reviews.llvm.org/D7913> should
help, but apparently is not sufficient).

But, yes, there are bugs in LLDB specifically related to looking for
definitions of certain types in certain contexts. The one I recall hitting
when I first implemented this in clang was that LLDB expected a base class
to have a definition in the CU where it was referenced as a base class. The
optimization caused that not to be the case - you could have a base class
that was only declared, but defined in some other CU. LLDB was just failing
to search other CUs for the definition as it is (presumably) already doing
for normal declarations (one file has a "struct foo; foo *f;" and some
other file has "struct foo { ... };" - when debugging the first file you
reasonably expect to be able to, say, "p f->bar").
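The two-file "struct foo" case can be sketched in one file as follows. The global `f` and the helper `read_bar` are illustrative names; the comments mark which part would belong to which source file.

```cpp
#include <cassert>

// --- first file: only a declaration of foo is visible ---
struct foo;          // incomplete type
foo *f = nullptr;    // a pointer to an incomplete type is legal

// --- other file: the definition lives here ---
struct foo { int bar; };

// What evaluating "p f->bar" needs: the definition of foo, found in the
// debug info of the other file, so that the member 'bar' can be read.
int read_bar(const foo *p) {
    assert(p != nullptr);
    return p->bar;
}
```

The debug info for the first file alone describes `foo` only as a declaration, so a debugger has to look in another CU to evaluate the member access.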

OTOH, adding -fstandalone-debug to C(XX)FLAGS does make the problem go
away, and it could be something we might want to enable by default in test
cases (to reduce dependencies on the test environment).

I'm of several minds about test case design for debuggers/debug info.

LLVM takes a pretty hard line about writing isolated test cases, the
equivalent to this for a debugger would be hardcoded DWARF (in assembly
files, or possibly pre-compiled binaries with the source available). GDB
has some tests like this, though they're hard to maintain, especially with
Clang, due to the reordering of top level asm (hard to create the labels
necessary to describe ranges when Clang might move the asm around).

The next most robust testing would be to at least make test cases
standalone - don't include the platform's standard library, etc - have each
test case define all the relevant constructs so there's no dependence on
the platform's library implementation, ensuring tests are (more) portable.

I recommend telling the compiler not to do this (set
-fstandalone-debug in OTHER_CFLAGS) if you want to debug with LLDB. If
you don't, you can get incomplete DWARF for the following code:

class A : public B
{
};

Where the compiler says "no one used 'B' so I am going to emit a forward
declaration for 'B'".

That's not quite the logic used here. Certainly if you use A then you've
used B. B may be emitted as a declaration if we know that some other CU
will contain the definition (eg: using the vtable optimization - if B has
virtual functions, we'll only emit the definition of B where the vtable
goes (if B has a key function, this means the debug info definition of B
goes in exactly one place: where the definition of the key function
appears))
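A minimal sketch of the key function case, assuming the Itanium C++ ABI rule that the key function is the first virtual function that is neither pure nor inline. The member names and the helper `describe_through_base` are made up for illustration.

```cpp
#include <cassert>

// --- "b.h": B is dynamic, so the vtable signal applies to it ---
struct B {
    virtual ~B() = default;        // defaulted in-class, hence inline: not the key function
    virtual int describe() const;  // first non-inline, non-pure virtual: the key function
    int b_field = 1;
};

// --- user code: A inherits from B without defining B::describe ---
struct A : B {
    int describe() const override { return 2; }
};

// --- "b.cpp": the definition of the key function ---
// The vtable for B is emitted in the translation unit containing this
// definition, and with the optimization enabled that is also the only place
// the full debug info definition of B is emitted.
int B::describe() const { return b_field; }

int describe_through_base(const B &obj) { return obj.describe(); }
```

Here the debug info for the user code may describe B only as a declaration, even though A uses it as a base class.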

Then LLDB tries to make class 'A' in a clang AST context and then we try
to parse 'B' so we can have 'A' inherit from it and clang of course would
assert and kill LLDB (if we actually try to give clang a class 'B' that is
a forward declaration as a base class) so LLDB has to lie to keep clang
from asserting and just say that 'B' is a class that contains nothing.

Shouldn't it go & find B in another file? The same way you would find the
definition of foo if one file contained "struct foo; foo *f;" and the user
executed the command "p f->bar"?

The idea was that someone will certainly declare 'B' somewhere in your
current source base.

It's more robust than that - the vtable optimization, assuming you compile
your whole program with debug info, is guaranteed by the C++ standard (odr
use of the class means you have a definition of all the virtual members
/somewhere/ in your program). GCC has been doing this optimization for a
while now, Clang was doing similar optimizations ("struct foo { }; foo *f;"
- we'd only emit the declaration of 'foo' since it was never dereferenced)
for a while too - I believe Eric implemented the first versions of that on
a suggestion from Chris Lattner, FWIW.

This mostly holds true, but if you have a header file from a shared
library that has a C++ class that people might inherit from (like we do in
Darwin Kernel Extensions),

If you don't have debug info for that shared library (hence the suggestion
to install debug info for the standard library - there are packages for
it). Granted I imagine it'll take some finagling to change the Darwin
Kernel Extensions build system to build a partial debug info package
(presumably you don't want to ship all the debug info for the
implementation of that library - for size and privacy reasons).

I recommend telling the compiler not to do this (set -fstandalone-debug in OTHER_CFLAGS) if you want to debug with LLDB. If you don't, you can get incomplete DWARF for the following code:

class A : public B
{
};

Where the compiler says "no one used 'B' so I am going to emit a forward declaration for 'B'".

That's not quite the logic used here. Certainly if you use A then you've used B. B may be emitted as a declaration if we know that some other CU will contain the definition (eg: using the vtable optimization - if B has virtual functions, we'll only emit the definition of B where the vtable goes (if B has a key function, this means the debug info definition of B goes in exactly one place: where the definition of the key function appears))

Sure, then this will happen only if B has a vtable.

Then LLDB tries to make class 'A' in a clang AST context and then we try to parse 'B' so we can have 'A' inherit from it and clang of course would assert and kill LLDB (if we actually try to give clang a class 'B' that is a forward declaration as a base class) so LLDB has to lie to keep clang from asserting and just say that 'B' is a class that contains nothing.

Shouldn't it go & find B in another file? The same way you would find the definition of foo if one file contained "struct foo; foo *f;" and the user executed the command "p f->bar"?

LLDB is designed to parse each shared library individually so that we can reuse shared libraries between multiple runs. If liba.so doesn't change, we don't need to re-parse anything in liba.so or its debug info file. If you recompile libb.so, liba.so has no dependencies on types like 'B' from elsewhere; otherwise the types passed out by each shared library would have dependencies and we would have to throw everything away whenever a shared library they depend on changes.

But LLDB, when we have a process, _will_ find the real definition elsewhere when displaying things to you for things that can legally be forward declarations in sources (pointers and references). So suppose we have a variable of type "A" from liba.so and we are displaying it: if it has an instance variable whose type is "C *" and that 'C' is a forward declaration, we will search and find "C" elsewhere (in libb.so) for display purposes by switching over to using the type from 'libb.so'. But we can't do this for things that can't be forward declarations in sources, like base classes.
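The distinction between what can and cannot be a forward declaration in source can be sketched like this. The types mirror the A/B/C naming used in this thread, while the fields and the helper `sum_fields` are hypothetical.

```cpp
#include <cassert>

struct C;                        // a forward declaration is enough for 'C *'

struct B { int b_field = 10; };  // a base class must be a complete type

struct A : B {                   // would not compile if B were only declared
    C *c = nullptr;              // pointer member: C may stay incomplete here
};

struct C { int value = 5; };     // definition, possibly from another library

int sum_fields(const A &a) {
    // Reading through the pointer needs C's definition; reading b_field
    // needs B's layout, which is baked into A itself.
    return a.b_field + (a.c ? a.c->value : 0);
}
```

A pointer member can be resolved lazily at display time, but the layout of A cannot be built at all without the contents of B, which is why the base class case is the one that falls down.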

The idea was that someone will certainly declare 'B' somewhere in your current source base.

It's more robust than that - the vtable optimization, assuming you compile your whole program with debug info, is guaranteed by the C++ standard (odr use of the class means you have a definition of all the virtual members /somewhere/ in your program).

Yes, but this doesn't guarantee that this debug info is in the current shared library or executable. So if the other shared library doesn't have debug info, then you end up missing debug info. LLDB parses shared libraries individually because it can have multiple debug sessions running simultaneously, each using different combinations of shared libraries at the same time. So we can't take type 'B' from libb.so and copy it into type 'A' from liba.so because if libb.so changes and we restart a debug session, we would have to throw away all type info (or maintain dependency lists). Further, we can't do it because process 123 might still be using the old version of libb.so while process 124 restarted and is using the new version.

GCC has been doing this optimization for a while now, Clang was doing similar optimizations ("struct foo { }; foo *f;" - we'd only emit the declaration of 'foo' since it was never dereferenced) for a while too - I believe Eric implemented the first versions of that on a suggestion from Chris Lattner, FWIW.

Again, for places that can have forward declarations (for pointers and references), this is quite OK. Our variable display logic will find the right definition. It is just in the base class case where this falls down for LLDB.

This mostly holds true, but if you have a header file from a shared library that has a C++ class that people might inherit from (like we do in Darwin Kernel Extensions),

If you don't have debug info for that shared library (hence the suggestion to install debug info for the standard library - there are packages for it). Granted I imagine it'll take some finagling to change the Darwin Kernel Extensions build system to build a partial debug info package (presumably you don't want to ship all the debug info for the implementation of that library - for size and privacy reasons).

We do have this issue and we won't be changing the default on MacOSX for this very reason.

The performance and memory benefits we get in LLDB from having each shared library be independent are huge. Re-starting a debug session in Xcode for the same program (rerunning) is instant as we have a global cache of shared libraries that are all stand alone and have no dependencies. When we still used GDB, every restart was a new delay as it loaded all shared libraries and debug info for each run. Each debug session was a unique new instance of GDB running in a separate process so that GDB could hack its shared libraries up any way it chose to with no worries of dependencies because there was only 1 debug session going on at a time. If a shared library doesn't change in LLDB, we still have a live copy all pre-parsed and ready to go in our global cache.
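As a toy sketch of why independent modules make caching simple (this is not LLDB's actual code; `ParsedModule`, `ModuleCache`, and the build-id key are assumptions made for illustration):

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for a parsed shared library and its type info.
struct ParsedModule {
    std::string path;
    std::string build_id;
};

// Cache keyed by (path, build id). Because no module refers to types owned
// by another module, an entry stays valid until its own file changes, and
// several debug sessions can share entries safely.
class ModuleCache {
public:
    std::shared_ptr<ParsedModule> get(const std::string &path,
                                      const std::string &build_id) {
        auto key = std::make_pair(path, build_id);
        auto it = modules_.find(key);
        if (it != modules_.end())
            return it->second;  // unchanged library: reuse, no re-parse
        ++parse_count_;         // stands in for the expensive parse
        auto mod = std::make_shared<ParsedModule>(ParsedModule{path, build_id});
        modules_.emplace(key, mod);
        return mod;
    }
    int parse_count() const { return parse_count_; }

private:
    std::map<std::pair<std::string, std::string>,
             std::shared_ptr<ParsedModule>> modules_;
    int parse_count_ = 0;
};
```

If types in one entry pointed into another, a rebuilt libb.so would also invalidate every cached module that had copied types from it, which is the dependency tracking described above as something to avoid.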

> I recommend telling the compiler not to do this (set
> -fstandalone-debug in OTHER_CFLAGS) if you want to debug with LLDB. If
> you don't, you can get incomplete DWARF for the following code:
>
> class A : public B
> {
> };
>
> Where the compiler says "no one used 'B' so I am going to emit a forward
> declaration for 'B'".
>
> That's not quite the logic used here. Certainly if you use A then you've
> used B. B may be emitted as a declaration if we know that some other CU
> will contain the definition (eg: using the vtable optimization - if B has
> virtual functions, we'll only emit the definition of B where the vtable
> goes (if B has a key function, this means the debug info definition of B
> goes in exactly one place: where the definition of the key function
> appears))

Sure, then this will happen only if B has a vtable.

(or an explicit instantiation declaration (such as std::string) - or is
only used in ways that don't require the complete type)

> Then LLDB tries to make class 'A' in a clang AST context and then we try
> to parse 'B' so we can have 'A' inherit from it and clang of course would
> assert and kill LLDB (if we actually try to give clang a class 'B' that is
> a forward declaration as a base class) so LLDB has to lie to keep clang
> from asserting and just say that 'B' is a class that contains nothing.
>
> Shouldn't it go & find B in another file? The same way you would find
> the definition of foo if one file contained "struct foo; foo *f;" and the
> user executed the command "p f->bar"?

LLDB is designed to parse each shared library individually so that we can
reuse shared libraries between multiple runs. If liba.so doesn't change,
we don't need to re-parse anything in liba.so or its debug info file. If
you recompile libb.so, liba.so has no dependencies on types like 'B' from
elsewhere; otherwise the types passed out by each shared library would have
dependencies and we would have to throw everything away whenever a shared
library they depend on changes.

Ah, thanks for the architectural explanation - this is a detail I hadn't
read about/understood before.

But LLDB, when we have a process, _will_ find the real definition
elsewhere when displaying things to you for things that can legally be
forward declarations in sources (pointers and references). So suppose we
have a variable of type "A" from liba.so and we are displaying it: if it
has an instance variable whose type is "C *" and that 'C' is a forward
declaration, we will search and find "C" elsewhere (in libb.so) for display
purposes by switching over to using the type from 'libb.so'. But we can't
do this for things that can't be forward declarations in sources, like base
classes.

OK, so you switch contexts to find the type (knowing where it is - with
some index, etc).

It seems to me that to support this kind of optimized debug info with
LLDB's architecture similar things could be done during AST building time -
switching library context to find the definition elsewhere. The extra
wrinkle being you'd want to keep track of the dependent libraries used in
this case - and, yes, it'd come with some performance cost - if you had to
reload the dependent library, you'd reload any libraries that depended upon
it (or possibly only the specific types that were dependent)

I mean that's up to you guys/whoever's working on the project if it's
important - I get that Google's needs for small debug info are somewhat
more dire than other projects, but even before I started on debug info it
seemed like size was a continuous concern/priority - so I'm still somewhat
surprised/confused by the apparent priorities here, but we have a flag &
the choice is easy enough to make.

I imagine at some point, the Googlers working on LLDB will ultimately
want/need to implement this to keep our debug info size improvements on
while being usable with LLDB, but I don't actually know their
plans/goals/tradeoffs. (also, we tend to build everything statically, as I
understand it - so shared library features aren't as valuable for our use
cases)

- David

If you don’t have debug info for that shared library (hence the suggestion to install debug info for the standard library - there are packages for it).

Not if you’re using a random Android device from vendor foo. =)

I imagine at some point, the Googlers working on LLDB will ultimately want/need to implement this to keep our debug info size improvements on while being usable with LLDB, but I don’t actually know their plans/goals/tradeoffs.

It will be a long time before we get there. Until then, I’d like to have -fstandalone-debug default for all targets. I also think it’s unintuitive to have different platforms silently behave differently for completely cross platform concepts.

also, we tend to build everything statically, as I understand it - so shared library features aren’t as valuable for our use cases

I’m working on tools for third-party (and Google) Android developers, who have very similar priorities as independent Linux developers, e.g. error-proof/easy-to-use is more important than smallest possible debug symbols.

Sincerely,

Vince

I mean the debug info which is stored in separate files, located using the
".gnu.debuglink" section. This is how the debug file from the libstdc++-dbg
package is located. I don't know what the official name is, but it is
described here:
<https://sourceware.org/gdb/onlinedocs/gdb/Separate-Debug-Files.html>.
But that is not so important; I believe I will have it running shortly.

pl