LLDB 3.9 on Linux crashes when loading core dump

Hi,

I have a Linux core dump that causes LLDB 3.9 on Linux to crash. I would greatly appreciate any advice on how to deal with the problem or what else I should look at.

The core dump was produced by GDB and GDB itself opens it without problems.

So, while loading the core we call DynamicLoaderPOSIXDYLD::LoadAllCurrentModules(), which enumerates all the modules and does some processing. Along the way it calls ObjectFileELF::GetSectionHeaderInfo() for each module, which tries to load the section headers and read the string table. It gets some garbage in the section header struct and tries to allocate 1.5TB of memory, which causes operator new to throw.
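One way to avoid an allocation like this is to sanity-check a section's offset and size against the number of bytes that can actually be read before reserving a buffer. The following is only a minimal sketch under assumptions: `CheckedSectionSize` and its `available` parameter are hypothetical names, not part of LLDB, and `std::optional` requires C++17.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical helper, not an LLDB API. Returns the number of bytes to
// allocate for a section, or std::nullopt when the header fields cannot be
// valid. `available` is the number of bytes the caller knows it can read,
// for example the file size or the size of the mapped module.
std::optional<std::uint64_t> CheckedSectionSize(std::uint64_t sh_offset,
                                                std::uint64_t sh_size,
                                                std::uint64_t available) {
  // Reject offsets that start beyond the readable range.
  if (sh_offset > available)
    return std::nullopt;
  // Reject sizes that run past the end. Written as a subtraction so that
  // sh_offset + sh_size cannot overflow.
  if (sh_size > available - sh_offset)
    return std::nullopt;
  // Invariant after both checks: the section fits in the readable range.
  assert(sh_offset + sh_size <= available);
  return sh_size;
}
```

With a check of this kind, a header that claims a 1.5TB string table inside a module only a few megabytes long yields no value, and the caller can report an error instead of calling operator new.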

So, why do we get garbage?

The module in question is libc++abi.so.1:

521 ModuleSP module_sp = LoadModuleAtAddress(I->file_spec, I->link_addr, I->base_addr, true);

(gdb) p I->file_spec
$95 = {
  m_directory = {
    m_string = 0x829a58 "… redacted …"
  },
  m_filename = {
    m_string = 0x7cc9e8 "libc++abi.so.1"
  },
  m_is_resolved = false,
  m_syntax = lldb_private::FileSpec::ePathSyntaxPosix
}

The module header lives at address 0x7f699a270000 and looks OK. The section headers are supposed to be at offset 2495600 == 0x261470:

$96 = (const elf::ELFHeader &) @0x953a78: {
  e_ident = "\177ELF\002\001\001\000\000\000\000\000\000\000\000",
  e_entry = 33392,
  e_phoff = 64,
  e_shoff = 2495600,
  e_flags = 0,
  e_version = 1,
  e_type = 3,
  e_machine = 62,
  e_ehsize = 64,
  e_phentsize = 56,
  e_phnum = 7,
  e_shentsize = 64,
  e_shnum = 38,
  e_shstrndx = 35
}

LLDB tries to read the section headers from address 0x7f699a270000 + 0x261470 == 0x7F699A4D1470 without a second thought, but this number is a lie. The /proc/<pid>/maps file shows that address as belonging to something else:

7f699a270000-7f699a2ba000 r-xp 00000000 fd:02 537796791 …/libc++abi.so.1
7f699a2ba000-7f699a4b9000 ---p 0004a000 fd:02 537796791 …/libc++abi.so.1
7f699a4b9000-7f699a4bb000 r--p 00049000 fd:02 537796791 …/libc++abi.so.1
7f699a4bb000-7f699a4bc000 rw-p 0004b000 fd:02 537796791 …/libc++abi.so.1
7f699a4bc000-7f699a520000 r-xp 00000000 fd:00 202587414 /usr/lib64/libssl.so.1.0.1e
7f699a520000-7f699a71f000 ---p 00064000 fd:00 202587414 /usr/lib64/libssl.so.1.0.1e
7f699a71f000-7f699a723000 r--p 00063000 fd:00 202587414 /usr/lib64/libssl.so.1.0.1e
7f699a723000-7f699a72a000 rw-p 00067000 fd:00 202587414 /usr/lib64/libssl.so.1.0.1e

I.e. LLDB should verify the module boundaries and fall back to some other plan if the memory is not there.
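The boundary check suggested here could look roughly like the sketch below. It is an illustration only: the `Mapping` struct and `RangeBelongsToModule` are hypothetical names that model one line of the maps file, not LLDB types or functions, and a real implementation would have to obtain the mappings from the core file or the process.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical model of one line of the maps file; not an LLDB type.
struct Mapping {
  std::uint64_t start; // first address of the mapping (inclusive)
  std::uint64_t end;   // one past the last address (exclusive)
  bool readable;       // true when the permissions field starts with 'r'
  std::string path;    // backing file, as shown in the last column
};

// Returns true when [addr, addr + size) lies entirely inside one readable
// mapping backed by `module_path`. Ranges that span several mappings are
// rejected for simplicity.
bool RangeBelongsToModule(const std::vector<Mapping> &maps,
                          const std::string &module_path,
                          std::uint64_t addr, std::uint64_t size) {
  for (const Mapping &m : maps) {
    assert(m.start <= m.end);
    if (addr < m.start || addr >= m.end)
      continue; // addr is not inside this mapping
    // addr is inside m, so m.end - addr cannot underflow.
    return m.readable && m.path == module_path && size <= m.end - addr;
  }
  return false; // addr is not mapped at all
}
```

Applied to the listing above, the section header table (38 entries of 64 bytes starting at 0x7F699A4D1470) falls inside the first libssl mapping, so a check of this kind would fail for libc++abi.so.1 and LLDB could fall back to another plan.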

Now the question is - where would be the right place to do the fix?

Thanks,

Eugene

Hello Eugene,

I have been aware of this problem for a while, but I haven't found a
really good solution so far, partially due to lack of a good repro
case -- I think your analysis has helped me with this, and I am
finally starting to piece together the sequence of events leading to
the crash. If you have a repro case you can send me, it would be even
better.

I don't really have an answer to your question, but here are a couple
of observations (the details might be a bit sketchy - it's been a long
time since I looked at this):
- reading the section headers from memory should be a fallback.
Normally we try first to locate the file on disk and read data from
there. This was mainly added for the vdso "module", as that is not
really present on disk. One of the ways of fixing this crash could be
to figure out why we are not finding the c++abi binary on disk.

- we trust far too much the data we read from inferior memory. We
should be much more careful when doing things based on "untrusted"
data. Checking the memory maps as you suggest could be one idea.
Another option I am considering is teaching ReadMemory to allocate
data in (reasonably sized, say a couple of MB) chunks. Right now it
allocates the full buffer without even trying to read the memory. If
it instead tried to read data in smaller chunks it would error out due
to failure to read inferior memory long before it gets a chance to
exhaust its own address space. (With a sufficiently large chunk, this
should not affect performance of normal reads).
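The chunked-read idea from the second point can be sketched as follows. This is not LLDB's ReadMemory: `ReadFn` and `ReadMemoryChunked` are hypothetical names, the callback stands in for whatever actually reads inferior memory, and the 2 MB default chunk is just the "couple of MB" figure mentioned above. `std::optional` requires C++17.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <optional>
#include <vector>

// Hypothetical reader callback: copies up to `len` bytes from inferior
// address `addr` into `dst` and returns the number of bytes actually read.
using ReadFn = std::function<std::size_t(std::uint64_t addr,
                                         std::uint8_t *dst, std::size_t len)>;

// Reads `size` bytes starting at `addr`, one chunk at a time. The result
// buffer only grows as data arrives, so a bogus multi-terabyte `size` fails
// on the first unreadable chunk instead of exhausting the address space.
std::optional<std::vector<std::uint8_t>>
ReadMemoryChunked(const ReadFn &read, std::uint64_t addr, std::uint64_t size,
                  std::size_t chunk_size = 2 * 1024 * 1024) {
  assert(chunk_size > 0);
  std::vector<std::uint8_t> out;
  std::vector<std::uint8_t> chunk;
  std::uint64_t done = 0;
  while (done < size) {
    // Never request more than one chunk, and never more than what is left.
    const std::size_t want = static_cast<std::size_t>(
        std::min<std::uint64_t>(chunk_size, size - done));
    chunk.resize(want);
    const std::size_t got = read(addr + done, chunk.data(), want);
    if (got != want)
      return std::nullopt; // short or failed read: give up early
    out.insert(out.end(), chunk.begin(), chunk.end());
    done += got;
  }
  return out;
}
```

The trade-off is one extra copy per chunk in exchange for bounded memory use when the requested size comes from untrusted data.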

hope that helps,
pl

Hello Pavel,

Thanks for the reply. Unfortunately I cannot share the core dump with you.

Yes, Rob has figured out that LLDB does not find this shared library, and that causes the problem. To understand what is going on here, I need to add one more detail that was missing from my original post: this is a cross-machine investigation. I.e. the core dump was collected on one machine (CentOS) and sent to another (Ubuntu), where I tried to open it.

LLDB opens the executable without paying attention to the fact that there is a core dump attached, and tries to locate the shared libraries. Apparently the ones that exist on my machine are not good, so it then looks in the directory with the executable itself. There is no way to “set solib-search-path” as we do in GDB and force it to look somewhere else. After we dumped all the shared libraries into the folder with the executable, LLDB was able to open the dump. This is a bit inconvenient, but works as a workaround for now.

Try “image search-paths add” as a replacement for “set solib-search-path”.

If that works, can you add it to the lldb-gdb.html document?

Jim

Sorry, I have never done that - where is this HTML located, and what is the procedure for updating it?

All the lldb.llvm.org web pages are taken from the www directory in the sources. So just change it there and the web page will get updated.

The one you want is lldb-gdb.html.

Jim

No problem, I think this discussion has helped me understand the
problem, and it shouldn't be too hard to reproduce it locally. Now I
just need to find some time to do that. :)

cheers,
pl