Linux Core Dump and Symbol resolution

Hi,

I am new to lldb and am writing a patch to support Linux core dumps.
The plugin is based on the mach-core plugin.
Currently it can parse the NOTE segments and load all the threads found in the core file (x86_64).
That is, "thread list" works fine.

It also reads the PRSTATUS structure and populates the register information.
That is, "register read" works fine.
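
For readers unfamiliar with the NOTE segment layout, here is a minimal sketch of how the notes inside one PT_NOTE segment can be walked. It is not the code from the patch: ParseNotes and CountThreads are names made up for illustration, and the sketch assumes the bytes are already in host byte order. In a Linux core file each NT_PRSTATUS note named "CORE" describes one thread.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

// Same three fields as Elf64_Nhdr in <elf.h>.
struct NoteHeader {
    uint32_t namesz;  // length of the name, including the trailing NUL
    uint32_t descsz;  // length of the descriptor payload
    uint32_t type;    // e.g. NT_PRSTATUS
};

constexpr uint32_t kNtPrstatus = 1;  // value of NT_PRSTATUS in <elf.h>

struct Note {
    std::string name;
    uint32_t type;
    std::vector<uint8_t> desc;
};

// Name and descriptor are each padded to a 4-byte boundary.
static size_t AlignUp4(size_t n) { return (n + 3) & ~static_cast<size_t>(3); }

// Hypothetical helper: split the raw bytes of one PT_NOTE segment into notes.
// Parsing stops at the first truncated entry.
std::vector<Note> ParseNotes(const uint8_t *data, size_t size) {
    std::vector<Note> notes;
    size_t off = 0;
    while (off + sizeof(NoteHeader) <= size) {
        NoteHeader hdr;
        std::memcpy(&hdr, data + off, sizeof(hdr));
        off += sizeof(hdr);

        size_t name_end = off + AlignUp4(hdr.namesz);
        if (name_end > size || hdr.descsz > size - name_end)
            break;  // truncated note

        Note note;
        // Drop the trailing NUL that namesz accounts for.
        note.name.assign(reinterpret_cast<const char *>(data + off),
                         hdr.namesz ? hdr.namesz - 1 : 0);
        note.type = hdr.type;
        note.desc.assign(data + name_end, data + name_end + hdr.descsz);
        notes.push_back(note);

        off = name_end + AlignUp4(hdr.descsz);
    }
    return notes;
}

// One NT_PRSTATUS note named "CORE" is written per thread.
size_t CountThreads(const std::vector<Note> &notes) {
    size_t n = 0;
    for (const Note &note : notes)
        if (note.name == "CORE" && note.type == kNtPrstatus)
            ++n;
    return n;
}
```

The descriptor of each such note is the PRSTATUS structure, which is where the register values come from.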

However, lldb does not use the symbol files when working with the core file. Because of this it does not use the DWARF information when creating frames, so frame variables and arguments are not available. lldb also does not resolve addresses to symbols.

$ lldb
(lldb) target create -c ./core
Core file '/mts/home3/jacobs/test/core' (x86_64) was loaded.
Process 0 stopped
* thread #1: tid = 0x0000, 0x00000000004004c4, stop reason = signal SIGCONT
    frame #0: 0x00000000004004c4
error: core file does not contain 0x4004c4
(lldb) target modules add ./a.out
(lldb) image lookup --address 0x4004c4
      Address: a.out[0x00000000004004c4] (a.out…text + 244)
      Summary: a.out`function4 + 16 at test.c:4
(lldb) bt
* thread #1: tid = 0x0000, 0x00000000004004c4, stop reason = signal SIGCONT
    frame #0: 0x00000000004004c4
    frame #1: 0x00000000004004d7
    frame #2: 0x00000000004004e7
    frame #3: 0x00000000004004f7
    frame #4: 0x0000000000400507
(lldb) image lookup --address 0x00000000004004d7
      Address: a.out[0x00000000004004d7] (a.out…text + 263)
      Summary: a.out`function3 + 11 at test.c:8

In the above example, the IPs are not resolved to symbols in "bt", although lldb is able to resolve the same addresses using "image lookup". Which command should be used to link a target with its symbol file?
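
To make the expected result concrete, here is a small sketch of what resolving an address to "symbol + offset" amounts to. It is not how lldb implements the lookup. The Symbol and Resolved types and the Resolve function are made up for illustration, and the two start addresses are derived from the "image lookup" output above (0x4004c4 - 16 and 0x4004d7 - 11).

```cpp
#include <algorithm>
#include <cstdint>
#include <string>
#include <vector>

struct Symbol {
    uint64_t file_addr;  // start address as recorded in the ELF file
    std::string name;
};

struct Resolved {
    std::string name;
    uint64_t offset;  // bytes past the start of the symbol
};

// Find the symbol with the greatest start address that is <= addr.
// `symbols` must be sorted by file_addr. Symbol sizes are ignored in this
// sketch, so an address past the last symbol still resolves to it.
bool Resolve(const std::vector<Symbol> &symbols, uint64_t addr, Resolved *out) {
    auto it = std::upper_bound(
        symbols.begin(), symbols.end(), addr,
        [](uint64_t a, const Symbol &s) { return a < s.file_addr; });
    if (it == symbols.begin())
        return false;  // addr lies before the first symbol
    --it;
    out->name = it->name;
    out->offset = addr - it->file_addr;
    return true;
}

// Start addresses derived from the session above, sorted by address.
std::vector<Symbol> ExampleSymbols() {
    return {{0x4004b4, "function4"}, {0x4004cc, "function3"}};
}
```

With this table, 0x4004c4 resolves to function4 + 16 and 0x4004d7 to function3 + 11, which is what "bt" should print for frames #0 and #1 once the symbol file is tied to the addresses in the core.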

Here is the program I used, compiled with "gcc -O0 -g3":

$ cat test.c
void function4(unsigned int arg)
{
    char *local = 0;
    *local = 0;
}
void function3()
{
    function4(-1);
}
void function2(long arg)
{
    function3();
}
void function1(int arg1, long arg2, char *str)
{
    function2(1);
}
void main()
{
    function1(0, 1L, "Test\n");
}

GDB output

$gdb --quiet a.out core
Reading symbols from /mts/home3/jacobs/test/a.out…done.

warning: exec file is newer than core file.
[New LWP 26718]

warning: Can't read pathname for load map: Input/output error.
Core was generated by `./a.out'.
Program terminated with signal 11, Segmentation fault.
#0 0x00000000004004c4 in function4 (arg=0) at test.c:4
4 *local = 0;
(gdb) bt
#0 0x00000000004004c4 in function4 (arg=0) at test.c:4
#1 0x00000000004004d7 in function3 () at test.c:8
#2 0x00000000004004e7 in function2 (arg=4195559) at test.c:11
#3 0x00000000004004f7 in function1 (arg1=0, arg2=140736328348032, str=0x4004e7 <incomplete sequence \370\270>) at test.c:15
#4 0x0000000000400507 in function1 (arg1=0, arg2=140736328348048, str=0x4004f7 "\345H\203\354\030\211}\374H\211u\360H\211U\350\277\001") at test.c:15
#5 0x00007fbcdfe6c76d in __libc_start_main () from /lib/x86_64-linux-gnu/libc.so.6
#6 0x00000000004003f9 in _start ()

I can post the patch if anyone is interested (but it needs to be cleaned up).

Thanks
Samuel

Samuel:

My guess as to why the symbols aren't resolving is that the dynamic loader plug-in for Linux most likely isn't being selected. This is how the mach-core plug-in picks its dynamic loader:

lldb_private::DynamicLoader *
ProcessMachCore::GetDynamicLoader ()
{
    if (m_dyld_ap.get() == NULL)
        m_dyld_ap.reset (DynamicLoader::FindPlugin(this, m_dyld_plugin_name.empty() ? NULL : m_dyld_plugin_name.c_str()));
    return m_dyld_ap.get();
}

In the Mach-O case, we can be connecting to either a kernel core file or a user-space core file, and the dynamic loader will be different for each. The Mach-O plug-in determines which one in:

bool
ProcessMachCore::GetDynamicLoaderAddress (lldb::addr_t addr);

For now you can probably just hard-code the plug-in name in yours:

lldb_private::DynamicLoader *
ProcessELFCore::GetDynamicLoader ()
{
    if (m_dyld_ap.get() == NULL)
        m_dyld_ap.reset (DynamicLoader::FindPlugin(this, "linux-dyld"));
    return m_dyld_ap.get();
}

If the ELF core file plug-in is functional, please do commit it, or if you don't have commit access, just send a patch to this group and one of us will commit it for you.

Greg Clayton

Thanks, Greg, for the suggestion.
But symbol resolution is still not working.

I am still browsing the lldb source and learning the architecture,
so I am attaching my patch. Please go through it and let me know if I am missing something.
If you can apply it and debug the problem, that would be great.

Note: I will clean up the patch, add more comments, and send a new version for committing once this problem is fixed and the AUXV processing code has been added.

Samuel

elfcore.diff (39.9 KB)

Attaching core, symbol and source files.

core.7z (12 KB)

Greg and others,

Update on the Linux coredump:
It works if I use "image load --file xxx .text yyyy".

$ Release+Asserts/bin/lldb
(lldb) target create -c ~/test/core
Core file '/mts/home3/jacobs/test/core' (x86_64) was loaded.
Process 0 stopped
* thread #1: tid = 0x0000, 0x00000000004004c4, stop reason = signal SIGCONT
    frame #0: 0x00000000004004c4
-> 0x4004c4: movb $0, (%rax)
   0x4004c7: popq %rbp
   0x4004c8: ret
   0x4004c9: pushq %rbp
(lldb) target modules add ~/test/a.out
(lldb) image load --file ~/test/a.out .text 0x4003d0
section '.text' loaded at 0x4003d0
(lldb) bt
* thread #1: tid = 0x0000, 0x00000000004004c4 a.out`function4(arg=0) + 16 at test.c:4, stop reason = signal SIGCONT
    frame #0: 0x00000000004004c4 a.out`function4(arg=0) + 16 at test.c:4
    frame #1: 0x00000000004004d7 a.out`function3 + 11 at test.c:8
    frame #2: 0x00000000004004e7 a.out`function2(arg=4195559) + 11 at test.c:11
    frame #3: 0x00000000004004f7 a.out`function1(arg1=0, arg2=140736328348032, str=0x00000000004004e7) + 3 at test.c:15
    frame #4: 0x0000000000400507 a.out`function1(arg1=0, arg2=140736328348048, str=0x00000000004004f7) + 19 at test.c:15

I didn't try this previously since I thought AUXV->AT_ENTRY processing was needed only if the load address is different.

Is there a reason why lldb does not use the symbol file's virtual address as the load address if no load address is provided?

$eu-readelf -S ~/test/a.out | grep .text
[13] .text PROGBITS 00000000004003d0 000003d0 00000238 0 AX 0 0 16

$eu-readelf -n ~/test/core | grep ENTRY
ENTRY: 0x4003d0
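
As a rough illustration of the AT_ENTRY processing mentioned above, the sketch below scans the payload of an NT_AUXV note for one entry and turns it into a slide. This is an assumption about how the value could be used, not the code in the patch or in lldb. FindAuxvEntry and ComputeSlide are hypothetical names, the layout assumed is a 64-bit target in host byte order, and file_entry stands for the entry address recorded in the executable's ELF header.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

constexpr uint64_t kAtNull = 0;   // AT_NULL: terminates the vector
constexpr uint64_t kAtEntry = 9;  // AT_ENTRY: entry point of the program

// Hypothetical helper: the auxiliary vector is a list of (type, value)
// pairs, 8 bytes each on a 64-bit target, terminated by AT_NULL.
bool FindAuxvEntry(const uint8_t *data, size_t size, uint64_t wanted,
                   uint64_t *value) {
    for (size_t off = 0; off + 16 <= size; off += 16) {
        uint64_t type, val;
        std::memcpy(&type, data + off, 8);
        std::memcpy(&val, data + off + 8, 8);
        if (type == wanted) {
            *value = val;
            return true;
        }
        if (type == kAtNull)
            break;  // end of vector, entry not present
    }
    return false;
}

// Slide = where the entry point was at run time minus where the
// executable file says it is. Zero means the file was not relocated.
int64_t ComputeSlide(uint64_t at_entry, uint64_t file_entry) {
    return static_cast<int64_t>(at_entry - file_entry);
}
```

For the core above, AT_ENTRY is 0x4003d0. If the executable's own entry address is also 0x4003d0, the slide is 0, which matches loading .text at its file address.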

I tried running lldb on a few core files and all worked fine.
I will send the patch once I complete the following:

  1. Insert thread name, signal values
  2. Insert AT_ENTRY
  3. Add comments to the code
  4. More testing

Thanks

Samuel

Greg and others,

Update on the Linux coredump:
It works if I use "image load --file xxx .text yyyy".

This means the dynamic loader isn't working. You should debug through the dynamic loader and see why it isn't loading your objects where they should be.

The dynamic loader will get its "DidAttach()" function called from Process::LoadCore(), so set a breakpoint in there and see what is going wrong. In the MacOSX dynamic loader in DidAttach, we try and locate the list of shared libraries by grubbing through memory and loading everything where it should be.

$ Release+Asserts/bin/lldb
(lldb) target create -c ~/test/core
Core file '/mts/home3/jacobs/test/core' (x86_64) was loaded.
Process 0 stopped
* thread #1: tid = 0x0000, 0x00000000004004c4, stop reason = signal SIGCONT
    frame #0: 0x00000000004004c4
-> 0x4004c4: movb $0, (%rax)
   0x4004c7: popq %rbp
   0x4004c8: ret
   0x4004c9: pushq %rbp
(lldb) target modules add ~/test/a.out
(lldb) image load --file ~/test/a.out .text 0x4003d0
section '.text' loaded at 0x4003d0
(lldb) bt
* thread #1: tid = 0x0000, 0x00000000004004c4 a.out`function4(arg=0) + 16 at test.c:4, stop reason = signal SIGCONT
    frame #0: 0x00000000004004c4 a.out`function4(arg=0) + 16 at test.c:4
    frame #1: 0x00000000004004d7 a.out`function3 + 11 at test.c:8
    frame #2: 0x00000000004004e7 a.out`function2(arg=4195559) + 11 at test.c:11
    frame #3: 0x00000000004004f7 a.out`function1(arg1=0, arg2=140736328348032, str=0x00000000004004e7) + 3 at test.c:15
    frame #4: 0x0000000000400507 a.out`function1(arg1=0, arg2=140736328348048, str=0x00000000004004f7) + 19 at test.c:15

I didn't try this previously since I thought AUXV->AT_ENTRY processing was needed only if the load address is different.

No, you must still tell lldb that a file's load address is the same as its file address. The reason is that all shared libraries have file addresses starting at zero, and if you let all binaries overlap at that address you end up with problems.

Is there a reason why lldb does not use the symbol file's virtual address as the load address if no load address is provided?

Yes, for the shared library reasons mentioned above. There is an easy command to load a library right where it lives:

(lldb) target modules load --slide 0 --file a.out

This will take a slide of zero and add it to each "file" address for each section and load each section at that same address.
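
The slide arithmetic and the overlap problem can be sketched as follows. This is only a toy model, not lldb's section load list. The Section and LoadList types are invented for illustration, and liba.so and libb.so in the usage note are hypothetical libraries. The a.out numbers come from the eu-readelf output in this thread.

```cpp
#include <cstdint>
#include <iterator>
#include <map>
#include <string>

struct Section {
    std::string module;
    std::string name;
    uint64_t file_addr;  // address recorded in the ELF file
    uint64_t size;
};

// Toy table of loaded sections, keyed by load address.
class LoadList {
public:
    // load address = file address + slide.
    // Returns false if the range overlaps an already loaded section.
    bool Load(const Section &s, int64_t slide) {
        uint64_t start = s.file_addr + static_cast<uint64_t>(slide);
        uint64_t end = start + s.size;
        auto next = loaded_.lower_bound(start);
        if (next != loaded_.end() && next->first < end)
            return false;
        if (next != loaded_.begin()) {
            auto prev = std::prev(next);
            if (prev->first + prev->second.size > start)
                return false;
        }
        loaded_[start] = s;
        return true;
    }

    // Map a load address back to its section and the offset inside it.
    // Returns nullptr if nothing is loaded at that address.
    const Section *Find(uint64_t load_addr, uint64_t *offset) const {
        auto it = loaded_.upper_bound(load_addr);
        if (it == loaded_.begin())
            return nullptr;
        --it;
        if (load_addr - it->first >= it->second.size)
            return nullptr;
        *offset = load_addr - it->first;
        return &it->second;
    }

private:
    std::map<uint64_t, Section> loaded_;
};
```

Before anything is loaded, Find(0x4004c4) returns nothing, which corresponds to the unsymbolicated "bt". After loading a.out's .text (file address 0x4003d0, size 0x238) with slide 0, the same address maps to .text + 244. Two libraries whose .text sections share the same low file address cannot both be loaded with slide 0 in this model; the second Load fails until it is given a nonzero slide.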

$eu-readelf -S ~/test/a.out | grep .text
[13] .text PROGBITS 00000000004003d0 000003d0 00000238 0 AX 0 0 16

$eu-readelf -n ~/test/core | grep ENTRY
    ENTRY: 0x4003d0

I tried running lldb on a few core files and all worked fine.
I will send the patch once I complete the following:
1) Insert thread name, signal values
2) Insert AT_ENTRY
3) Add comments to the code
4) More testing

Nice work! I look forward to seeing the final patch.

Greg