lldb crash while debugging wxWidgets application that was built with g++

Hi,

While debugging a real-world application (codelite), I placed a breakpoint in the ‘OnAbout’ function.
Attempting to view any of the local variables resulted in a crash; see below:

Process 24146 stopped
* thread #1: tid = 24146, 0x00000000006667f6 codelite`clMainFrame::OnAbout(this=0x000000000215f5c0, (null)=0x00007fffa6fd2640) + 66 at frame.cpp:1794, name = 'codelite', stop reason = step over
    frame #0: 0x00000000006667f6 codelite`clMainFrame::OnAbout(this=0x000000000215f5c0, (null)=0x00007fffa6fd2640) + 66 at frame.cpp:1794
   1791 wxString mainTitle;
   1792 mainTitle = CODELITE_VERSION_STR;
   1793
-> 1794 AboutDlg dlg(this, mainTitle);
   1795 dlg.SetInfo(mainTitle);
   1796 dlg.ShowModal();
   1797 }
(lldb) p dlg
error: libwx_gtk2u_unofficial_core-3.0.so.0 DWARF DIE at 0x030ac4cc for class 'wxSizer' has a base class 'wxClientDataContainer' that is a forward declaration, not a complete definition.
Please file a bug against the compiler and include the preprocessed output for /home/david/devel/packages/wx/3.0-2/wxwidgets3.0-3.0.0/objs_gtk_sh/…/src/common/sizer.cpp
Segmentation fault

From the error message, I understand that this is a bug in gcc.
Still, is there a way to tell lldb to silently ignore this error? (I would prefer it to display nothing instead of crashing and taking codelite down with it.)

Thanks

I don’t think we should be segfaulting… would you be able to attach a debugger to your running codelite, and see where the segfault is being triggered?
That would be a good start of an investigation of this issue

- Enrico

This is a serious problem with the debug info that GCC and Clang recently started to emit to try and save space by omitting important debug info.

The problem is:

class A : public B
{
};

The debug info for "A" is complete, but since no one used stuff from class "B" they decided to just forward declare "B". Why is this a problem? Because we are trying to reconstruct a class definition of "A" using incomplete information.

There are three things that can fix this:
1 - Modify the DWARF parser to look for a complete version of "B" elsewhere in the current executable's debug info (but there might not be one).
2 - Change the flags passed to GCC so that it does not elide this debug info (I don't know what these flags would be; you will need to find out if there is such a flag).
3 - Just start and complete the definition for "B" and pretend it is a class that contains nothing.

Solution #1 might alleviate some of the problems, but will often result in a failure when there is no complete definition of "B".

Solution #2 is the best solution, but this doesn't mean that people won't run into this crasher when debugging random code.

Solution #3 is dangerous because you might have foo.cpp whose debug info has complete definitions for A and B, and bar.cpp that has a complete definition for A but not for B. Then you write an expression that uses an instance of "A" from foo.cpp together with an instance of "A" from bar.cpp, and the expression parser will complain that it has two competing definitions for class "A" that don't match.

Solution #1 should probably be tried first. If you can send me the executable you were debugging with debug info inside it, I can look and see if solution #1 will fix your current problem. We should avoid crashing, that is for sure.

Greg

This might be a slight problem, since I was trying to debug codelite.

With debug symbols it can go up to 200MB, and it consists of way too many
shared libraries.
Let me first try to narrow it down to something minimal.

You can check this yourself by searching for "wxClientDataContainer" in a command-line LLDB session that has your debug symbols loaded for codelite:

% lldb /path/to/codelite
(lldb) image lookup --type wxClientDataContainer

Check whether any complete definitions are printed. If so, then solution #1 will probably get around this issue for the moment. But we should still not crash even if we do look for a complete "wxClientDataContainer" definition and don't find one (also part of the fix for solution #1).

Greg

In the meantime, here is the backtrace from when it crashes (the scenario is: gdb is debugging lldb, which debugs a debug version of codelite ;) )
http://pastebin.com/PVcZeQFk

HTH

I tried your suggestion:

Process 17807 stopped
* thread #1: tid = 17807, 0x00000000006666b8 codelite`clMainFrame::OnAbout(this=0x0000000002ef0d80, (null)=0x00007fff2d666cd0) + 26 at frame.cpp:1791, name = 'codelite', stop reason = breakpoint 2.1
    frame #0: 0x00000000006666b8 codelite`clMainFrame::OnAbout(this=0x0000000002ef0d80, (null)=0x00007fff2d666cd0) + 26 at frame.cpp:1791
   1788
   1789 void clMainFrame::OnAbout(wxCommandEvent& WXUNUSED(event))
   1790 {
-> 1791 wxString mainTitle;
   1792 mainTitle = CODELITE_VERSION_STR;
   1793
   1794 AboutDlg dlg(this, mainTitle);
(lldb) image lookup --type wxClientDataContainer

Yep, so there is no actual definition for "wxClientDataContainer" in your executable. The whole premise of the GCC change was that the definition would be available elsewhere, but here it isn't.

(lldb)

And of course, attempting to inspect mainTitle resulted in a crash:

(lldb) p mainTitle
error: libwx_gtk2u_unofficial_core-3.0.so.0 DWARF DIE at 0x030ac4cc for class 'wxSizer' has a base class 'wxClientDataContainer' that is a forward declaration, not a complete definition.
Please file a bug against the compiler and include the preprocessed output for /home/david/devel/packages/wx/3.0-2/wxwidgets3.0-3.0.0/objs_gtk_sh/../src/common/sizer.cpp
lldb: ../tools/clang/lib/AST/RecordLayoutBuilder.cpp:2844: const clang::ASTRecordLayout& clang::ASTContext::getASTRecordLayout(const clang::RecordDecl*) const: Assertion `D && "Cannot get layout of forward declarations!"' failed.

Yes, this is expected with the current LLDB.

Program received signal SIGABRT, Aborted.
Solution #2 is not an option for me, since the crash occurs in an external library that I did not build myself; I fetch it from an apt repository.

BTW, I am guessing that this won't be a problem on OS X, where everything is compiled with clang?

Yes, by default on Darwin clang doesn't use this debug-info-minimizing trick. On Linux you will need to disable it if you have a clang that supports this optimization.

This compiler optimization is really lame. It does save space, but it causes the compiler to emit incomplete debug info. If your base class (wxClientDataContainer) has any member variables or methods, you won't be able to see any of the members that are missing and you won't be able to call any wxClientDataContainer methods.

Greg

The whole premise of the GCC change was that the definition would be available elsewhere, but here it isn't.

I've run into the same issue on FreeBSD, while trying to debug a clang
built with the system clang-3.4.

* thread #1: tid = 104525, 0x000000081191ff6e libclangCodeGen.so`clang::CodeGen::CGDebugInfo::CreateType(this=0x0000000815c0a000, BT=0x0000000815c3d400) + 30 at CGDebugInfo.cpp:391, stop reason = breakpoint 1.1
    frame #0: 0x000000081191ff6e libclangCodeGen.so`clang::CodeGen::CGDebugInfo::CreateType(this=0x0000000815c0a000, BT=0x0000000815c3d400) + 30 at CGDebugInfo.cpp:391
   388 /// CreateType - Get the Basic type from the cache or create a new
   389 /// one if necessary.
   390 llvm::DIType CGDebugInfo::CreateType(const BuiltinType *BT) {
-> 391 unsigned Encoding = 0;
   392 StringRef BTName;
   393 switch (BT->getKind()) {
   394 #define BUILTIN_TYPE(Id, SingletonId)
(lldb) p BT
Assertion failed: (D && "Cannot get layout of forward declarations!"), function getASTRecordLayout, file ../tools/clang/lib/AST/RecordLayoutBuilder.cpp, line 2783.

In my case there is a definition available elsewhere:

(lldb) image lookup --type BuiltinType
Best match found in /tank/emaste/ctsrd/llvm/Build/lib/libclangCodeGen.so:
id = {0x0088df97}, name = "BuiltinType", qualified =
"clang::BuiltinType", byte-size = 24, decl = Type.h:1842, clang_type =
"class BuiltinType : public clang::Type {
...

Yes, by default on Darwin clang doesn't use this debug-info-minimizing trick. On Linux you will need to disable it if you have a clang that supports this optimization.

Do you know off-hand how it's disabled on Darwin? It seems like we
probably want FreeBSD's clang to behave the same way, by default.

-fstandalone-debug, -fno-standalone-debug
     Clang supports a number of optimizations to reduce the size of debug information in the binary. They work based on the assumption that the debug type information can be spread out over multiple compilation units. For instance, Clang will not emit type definitions for types that are not needed by a module and could be replaced with a forward declaration. Further, Clang will only emit type info for a dynamic C++ class in the module that contains the vtable for the class.

The -fstandalone-debug option turns off these optimizations. This is useful when working with 3rd-party libraries that don't come with debug information. Note that Clang will never emit type information for types that are not referenced at all by the program.

cheers,
adrian

It's not in 3.4's man page :)

Thanks for the pointer. I think we'll have to bring in r198655 and
enable it on FreeBSD as well, until both DTrace and LLDB can handle
it.

-Ed