sharing code between lldb and AddressSanitizer

Hello,

I am working on integrating AddressSanitizer (aka asan, http://clang.llvm.org/docs/AddressSanitizer.html) run-time library with the llvm compiler-rt.
Asan needs to symbolize PCs, i.e. given a value of a PC it needs to produce the file name and the line number (if debug info is present).
Currently, this is achieved by printing the PCs as /path/to/object/file+offset and filtering the output with a script which uses addr2line/atos.
Ideally, symbolization should happen inside the process and should not require post processing.
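
For readers following along, here is a minimal sketch of how a line in the /path/to/object/file+offset form could be split back into its two parts before being handed to a tool such as addr2line. The names ModuleOffset and ParseFrame are hypothetical and are not part of asan or lldb; the sketch only assumes that the offset is printed in hex after the last '+'.

```cpp
#include <cstdint>
#include <cstdlib>
#include <string>

// One frame as printed by the runtime: "<object file>+0x<offset>".
// (Hypothetical helper type, for illustration only.)
struct ModuleOffset {
    std::string module;    // path to the object file
    std::uint64_t offset;  // offset inside that file
};

// Splits at the LAST '+', so a '+' inside the path does not confuse the parser.
// Returns false if the line has no '+' or the text after it is not a hex number.
bool ParseFrame(const std::string &line, ModuleOffset &out) {
    const std::size_t plus = line.rfind('+');
    if (plus == std::string::npos || plus == 0 || plus + 1 >= line.size())
        return false;
    const std::string digits = line.substr(plus + 1);
    char *end = nullptr;
    // Base 16 accepts an optional "0x" prefix.
    const unsigned long long value = std::strtoull(digits.c_str(), &end, 16);
    if (end == digits.c_str() || *end != '\0')
        return false;
    out.module = line.substr(0, plus);
    out.offset = value;
    return true;
}
```

The module path and the offset can then be passed to whatever does the offline lookup.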

I would expect that lldb already has such functionality, right?
Somewhere in include/lldb/Symbol/Symtab.h?
Does it work on both Linux and Mac?
Do you think that it is possible/desirable to have this kind of code sharing between lldb and asan?
Will that work with the current build system (where lldb and compiler-rt/lib/asan are separate subprojects)?

Thanks,

–kcc

I recently extracted the DWARF parsing that’s necessary to get line numbers from addresses out of lldb into LLVM. Take a look at llvm-dwarfdump and the DebugInfo library. It requires relocations to be resolved upfront though, so it won’t work on Linux out of the box (DWARF doesn’t use relocations on OS X).
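
To illustrate what "getting line numbers from addresses" boils down to once the line table has been decoded, here is a rough sketch. It is not the LLVM DebugInfo API; LineRow and FindRow are hypothetical names, and the sketch assumes the rows are already decoded, relocated, and sorted by address.

```cpp
#include <algorithm>
#include <cstdint>
#include <string>
#include <vector>

// One decoded row of a line table (hypothetical, simplified layout).
struct LineRow {
    std::uint64_t address;  // first address covered by this row
    std::string file;
    unsigned line;
    bool end_sequence;      // marks the first address past a sequence
};

// Returns the row covering pc, or nullptr if pc falls outside the table.
// A row covers [row.address, next_row.address).
const LineRow *FindRow(const std::vector<LineRow> &rows, std::uint64_t pc) {
    // First row whose address is strictly greater than pc.
    auto it = std::upper_bound(
        rows.begin(), rows.end(), pc,
        [](std::uint64_t value, const LineRow &row) { return value < row.address; });
    if (it == rows.begin())
        return nullptr;  // pc is below the first row
    --it;                // last row with address <= pc
    if (it->end_sequence)
        return nullptr;  // pc is past the end of the sequence
    return &*it;
}
```

The binary search only gives correct answers if the addresses in the table are final, which is why unresolved relocations are a problem.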

-Ben

I recently extracted the DWARF parsing that’s necessary to get line numbers from addresses out of lldb into LLVM. Take a look at llvm-dwarfdump and the DebugInfo library.

Nice! This will solve the build dependency problem. We’ll look into it.

It requires relocations to be resolved upfront though, so it won’t work on Linux out of the box (DWARF doesn’t use relocations on OS X).

Ugh. :(

–kcc

Yes, LLDB can do this.

When you are symbolicating, are you symbolicating using an address from a live process, or just using the virtual addresses in an object file itself?

If you are symbolicating using an address from the file, we already have a little C++ example for you:

https://llvm.org/svn/llvm-project/lldb/trunk/examples/lookup/main.cpp

If you want to load a bunch of files from a process at the addresses they were at when a backtrace or sample was taken, let me know. The example would change a little bit, but not too much.

Greg Clayton

Yes, LLDB can do this.

When you are symbolicating, are you symbolicating using an address from a live process,

This is how I want it to work. And I want it to happen inside that process.

or just using the virtual addresses in an object file itself?

This is how it works now.
asan prints a line like
/home/kcc/llvm/build/a.out+0x402661
and then addr2line/atos does symbolization offline.

–kcc

So if 0x402661 is an address that is already in terms of the virtual addresses in the a.out file itself, you can use example code from main.cpp mentioned below.

You could compile the main.cpp into "lookup" and run the result:

lookup /home/kcc/llvm/build/a.out 0x402661

And it should do the lookup you want. Let me know if you have any questions about how and what the example code in main.cpp is doing.

Greg

So if 0x402661 is an address that is already in terms of the virtual addresses in the a.out file itself, you can use example code from main.cpp mentioned below.

You could compile the main.cpp into “lookup” and run the result:

lookup /home/kcc/llvm/build/a.out 0x402661

And it should do the lookup you want.

Yes, this is what we already get from addr2line.
Can this be used inside the process?
And can it translate real address to the offset in the library (currently, we use code from google perf tools to achieve that).
Will this work on both linux and mac?

Thanks,

–kcc

So if 0x402661 is an address that is already in terms of the virtual addresses in the a.out file itself, you can use example code from main.cpp mentioned below.

You could compile the main.cpp into "lookup" and run the result:

lookup /home/kcc/llvm/build/a.out 0x402661

And it should do the lookup you want.

Yes, this is what we already get from addr2line.
Can this be used inside the process?

Yes.

And can it translate real address to the offset in the library (currently, we use code from google perf tools to achieve that).

LLDB can't currently observe a process, it must debug it, but you can tell the target where each section of a shared library is loaded (for example, a.out has ".text" at 0x1000 and ".data" at 0x2000). Then you can look up using "load" addresses.

You first need to create a target:

// Init LLDB
SBDebugger::Initialize();

// Create a debugger so we can make a target in it
SBDebugger debugger (SBDebugger::Create());

// Create a target and don't let it add all dependent shared libraries, we will add those manually
const bool add_dependent_files = false;
const char *triple = "i386-apple-darwin";
SBError error;
SBTarget target(debugger.CreateTarget ("/tmp/a.out", triple, NULL, add_dependent_files, error));

// Now add all of the shared libraries you want by repeating this loop
for (...)
{
  SBModule module = target.AddModule ("/tmp/libfoo.so", triple, NULL);
  target.SetSectionLoadAddress (module.FindSection ("__TEXT"), 0x1000);
  target.SetSectionLoadAddress (module.FindSection ("__DATA"), 0x2000);
}
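
As an aside for readers, the idea behind registering section load addresses can be sketched without LLDB at all: keep a table of where each section was loaded and use it to translate a load address back into a section and an offset. This is only an illustration of the concept, not how LLDB implements it; LoadedSection, Resolved and ResolveInTable are hypothetical names.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Where one section of one module ended up in memory (hypothetical type).
struct LoadedSection {
    std::string module;       // e.g. "/tmp/libfoo.so"
    std::string name;         // e.g. "__TEXT"
    std::uint64_t load_addr;  // address the section was loaded at
    std::uint64_t size;       // size of the section in bytes
    std::uint64_t file_addr;  // address of the section inside the object file
};

// Result of translating a load address back to file terms.
struct Resolved {
    std::string module;
    std::string section;
    std::uint64_t offset;     // offset from the start of the section
    std::uint64_t file_addr;  // address usable for a file-based lookup
};

// Linear scan over the table; returns false if no section contains load_addr.
bool ResolveInTable(const std::vector<LoadedSection> &sections,
                    std::uint64_t load_addr, Resolved &out) {
    for (const LoadedSection &s : sections) {
        if (load_addr >= s.load_addr && load_addr - s.load_addr < s.size) {
            const std::uint64_t offset = load_addr - s.load_addr;
            out = Resolved{s.module, s.name, offset, s.file_addr + offset};
            return true;
        }
    }
    return false;
}
```

A real implementation would probably keep the table sorted and use a binary search, but the translation step is the same.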

Now you have a target that has all of the sections for all of your modules loaded at the addresses at which you want to do the lookups. To do a lookup you can now:

lldb::addr_t load_addr = ...; // The address to lookup

// Resolve a load address into a section + offset address within a module
SBAddress addr (target.ResolveLoadAddress (load_addr));
if (addr.IsValid())
{
    // Resolve the address into all of the symbol information
    SBSymbolContext symbol_ctx (addr.GetSymbolContext(eSymbolContextEverything));

    // symbol_ctx now contains the symbol context (module, compile unit, function,
    // block, line table entry and symbol for the address). Now you should dump the
    // information that you want out of the symbol context....
    DumpSymbolContext (symbol_ctx, addr);

    // This might represent an inlined function within a concrete function, so you
    // can also dump all of the parent functions above the current inline function.
    // An invalid symbol context will be returned when there are no more parents.
    while (1)
    {
        SBAddress parent_addr; // The address in the parent function for the inline function
        SBSymbolContext parent_symbol_ctx = symbol_ctx.GetParentOfInlinedScope (addr, parent_addr);
        if (!parent_symbol_ctx.IsValid())
            break;
        DumpSymbolContext (parent_symbol_ctx, parent_addr);
        addr = parent_addr;
        symbol_ctx = parent_symbol_ctx;
    }
}

So we can do a very good job at symbolicating inlined functions within concrete functions, all from just a single address.

Will this work on both linux and mac?

Yep.

Very nice, thank you!
Let us try to use it.

–kcc

Greg Clayton <gclayton <at> apple.com> writes:

You first need to create a target:

// Init LLDB
SBDebugger::Initialize();

...

So we can do a very good job at symbolicating
inlined functions within concrete functions,
all from just a single address.

Hi,

I am with Kostya on it.
I've managed to build it successfully on Linux
(the patches are already upstreamed).
Generally it works, thanks for all your work!

I use code similar to the above. The only difference is
that to "unwind" inlining I use:
  SBSymbolContext ctx = ...;
  SBBlock block = ctx.GetBlock();
  for (; block.IsValid(); block = block.GetParent()) {
    if (block.IsInlined())
      printf(" %s %s:%d\n", block.GetInlinedName(),
        block.GetInlinedCallSiteFile().GetFilename(), block.GetInlinedCallSiteLine());
  }
It seems to provide more precise function names.

As for factoring out some code into llvm to share it with ASan.
The code uses SBDebugger/SBTarget/SBModule/
SBSection/SBAddress/SBSymbolContext,
so it seems that it pulls basically whole lldb. I am not sure as
to whether it's possible to factor it all out
(then lldb will be empty:)).
What do you think?
We can use the dwarf reader that you already factored out.
But it seems a whole lot of work
to build the symbolizer on top of raw dwarf reader, right? So basically
we will have to duplicate a lot of lldb code...

If we decide to depend on lldb, we will appreciate
if you provide a static lldb.a
(along with liblldb.so).

Ironically the symbolizer works great on gcc-compiled binaries,
but fails on clang-compiled binaries.
It provides some info but it's dead wrong (it points into some
random STL source files).
objdump -dlS shows
correct info for the binaries, and I guess you mostly work with
clang-compiled binaries.
So are there any known problems? What may I be missing?
It's all on Linux/amd64.

TIA

Greg Clayton <gclayton <at> apple.com> writes:

You first need to create a target:

// Init LLDB
SBDebugger::Initialize();

...

So we can do a very good job at symbolicating
inlined functions within concrete functions,
all from just a single address.

Hi,

I am with Kostya on it.
I've managed to build it successfully on Linux
(the patches are already upstreamed).
Generally it works, thanks for all your work!

I use code similar to the above. The only difference is
that to "unwind" inlining I use:
SBSymbolContext ctx = ...;
SBBlock block = ctx.GetBlock();
for (; block.IsValid(); block = block.GetParent()) {
   if (block.IsInlined())
     printf(" %s %s:%d\n", block.GetInlinedName(),
       block.GetInlinedCallSiteFile().GetFilename(), block.GetInlinedCallSiteLine());
}
It seems to provide more precise function names.

You need to be careful here: the current block carries the file and line of the location that _called_ the current inline block (the call site), not the file and line of the inlined function itself.

So when you first look up the address you need to print:

1 - the file and line for the address that was looked up (the SBLineEntry) with the function name for inlined_block[0]
2 - the callsite info from inlined_block[0] and the function name of inlined_block[1]
3 - the callsite info from inlined_block[1] and the function name of inlined_block[2]

So the file and line are always from the previous inlined block...
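
The pairing rule above can be sketched in plain C++, independent of the LLDB types. Scope, SourceLoc, Frame and BuildFrames are hypothetical names used only for illustration; the scopes are assumed to be ordered innermost first, with the concrete function last.

```cpp
#include <string>
#include <vector>

struct SourceLoc {
    std::string file;
    unsigned line;
};

// One scope around the address, innermost first. For an inlined scope,
// call_site is where it was inlined FROM; for the concrete (outermost)
// function call_site is unused.
struct Scope {
    std::string function;
    SourceLoc call_site;
};

struct Frame {
    std::string function;
    SourceLoc loc;
};

// Frame 0 pairs the innermost function name with the line entry of the
// address itself; every later frame pairs its function name with the
// call site recorded on the PREVIOUS (more deeply inlined) scope.
std::vector<Frame> BuildFrames(const SourceLoc &line_entry,
                               const std::vector<Scope> &scopes) {
    std::vector<Frame> frames;
    SourceLoc loc = line_entry;
    for (const Scope &scope : scopes) {
        frames.push_back(Frame{scope.function, loc});
        loc = scope.call_site;  // becomes the location of the next frame out
    }
    return frames;
}
```

Printing the call site together with the name of the same block, as in the loop quoted above, shifts every file and line by one frame.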

As for factoring out some code into llvm to share it with ASan.
The code uses SBDebugger/SBTarget/SBModule/
SBSection/SBAddress/SBSymbolContext,
so it seems that it pulls basically whole lldb. I am not sure as
to whether it's possible to factor it all out
(then lldb will be empty:)).
What do you think?

Probably not. Even though you are only seeing the SBDebugger, SBTarget, SBModule, SBSection, SBAddress, SBSymbolContext, you don't realize that underneath all of this the SBModule can be using one or more ObjectFile, SymbolFile, and SymbolVendor plug-ins.

We can use the dwarf reader that you already factored out.
But it seems a whole lot of work
to build the symbolizer on top of raw dwarf reader, right? So basically
we will have to duplicate a lot of lldb code...

Yeah, that doesn't seem to be a good way to go about it.

If we decide to depend on lldb, we will appreciate if you provide a static lldb.a (along with liblldb.so).

That is how we build things on MacOSX, and the Makefiles already produce a bunch of .a files that you should be able to use (like clang and llvm). We do have a target (lldb-platform) that links against a .a file that contains all of the .o files from the core of LLDB that is produced by the MacOSX build and then we enable dead code stripping. You should be able to link against only the .a files that you need for your binary and have things work.

Ironically the symbolizer works great on gcc-compiled binaries,
but fails on clang-compiled binaries.
It provides some info but it's dead wrong (it points into some
random STL source files).
objdump -dlS shows
correct info for the binaries, and I guess you mostly work with
clang-compiled binaries.
So are there any known problems? What may I be missing?

Not that we know of. Clang binaries work great on MacOSX and symbolicate just fine. If you have any quick examples where address lookups fail, please send me examples off the list.

It's all on Linux/amd64.

Send me some binaries that fail along with the address that is being used during the lookup and I will take a look.

Greg Clayton

You need to be careful here: the current block carries the file and line of the location that called the current inline block (the call site), not the file and line of the inlined function itself.

So when you first look up the address you need to print:

1 - the file and line for the address that was looked up (the SBLineEntry) with the function name for inlined_block[0]
2 - the callsite info from inlined_block[0] and the function name of inlined_block[1]
3 - the callsite info from inlined_block[1] and the function name of inlined_block[2]

So the file and line are always from the previous inlined block…

Thanks! I only saw that if I dump everything it contains all the required info, but I did not figure out the exact laws :)

As for factoring out some code into llvm to share it with ASan.
The code uses SBDebugger/SBTarget/SBModule/
SBSection/SBAddress/SBSymbolContext,
so it seems that it pulls basically whole lldb. I am not sure as
to whether it’s possible to factor it all out
(then lldb will be empty:)).
What do you think?

Probably not. Even though you are only seeing the SBDebugger, SBTarget, SBModule, SBSection, SBAddress, SBSymbolContext, you don’t realize that underneath all of this the SBModule can be using one or more ObjectFile, SymbolFile, and SymbolVendor plug-ins.

We can use the dwarf reader that you already factored out.
But it seems a whole lot of work
to build the symbolizer on top of raw dwarf reader, right? So basically
we will have to duplicate a lot of lldb code…

Yeah, that doesn’t seem to be a good way to go about it.

If we decide to depend on lldb, we will appreciate if you provide a static lldb.a (along with liblldb.so).

That is how we build things on MacOSX, and the Makefiles already produce a bunch of .a files that you should be able to use (like clang and llvm). We do have a target (lldb-platform) that links against a .a file that contains all of the .o files from the core of LLDB that is produced by the MacOSX build and then we enable dead code stripping. You should be able to link against only the .a files that you need for your binary and have things work.

Well, yes, we can link separate .a files.

I’ve tracked down the problem.
When I build the lookup example as
$ clang++ main.cpp -I…/…/include -llldb -g -frtti
It works.
However when I build it as:

$ clang++ main.cpp -I…/…/include -llldb -g -frtti -fPIE -pie

It fails to symbolize itself, while objdump -dSl symbolizes it (shows line numbers inside functions). If I build a program with gcc with -fPIE -pie, it is also able to symbolize itself (with lldb).
So, the problem seems to be in a tricky interaction of clang, lldb and -pie.
It’s all Linux/amd64 and tip clang.

Ping. Any progress on this? It’s critical for us, we build everything only with -pie, so this thing renders lldb useless for us. I’ve filed an issue:
http://llvm.org/bugs/show_bug.cgi?id=12355

Dmitry,

Can you attach an example ELF file with DWARF and the main.cpp source file to the bug so that I can take a look? "-pie" is not a valid argument for our current darwin clang++, so this might be a linux specific compiler driver option.

Greg

Hi Greg,

I’ve attached source and binary:
http://llvm.org/bugs/show_bug.cgi?id=12355

Dmitry,

I made a few fixes to the ELF symbol file reader: we now recognize mangled names, so the symbols show up correctly demangled. I also fixed a few issues in the DWARF parser where we weren't using the DWARF section's file size (we were using the VM size). Finally, sections that will never be loaded by the program are now registered without a VM size, which means we won't end up with overlapping sections. LLDB was incorrectly giving the ".comment", all ".debug", and the ".shstrtab", ".symtab" and ".strtab" sections valid addresses that would be resolved when the shared library was loaded. You can see this in the section list for a.out:

% lldb a.out
Current executable set to 'a.out' (x86_64).
(lldb) image dump sections
Dumping sections for 1 modules.
Sections for '/private/tmp/lookup/a.out' (x86_64):
  SectID Type File Address File Off. File Size Flags Section Name
  ---------- ---------------- --------------------------------------- ---------- ---------- ---------- ----------------------------
  0x00000001 regular 0x00000000 0x00000000 0x00000000 a.out.
  0x00000002 regular [0x0000000000000238-0x0000000000000254) 0x00000238 0x0000001c 0x00000002 a.out..interp
  0x00000003 regular [0x0000000000000254-0x0000000000000274) 0x00000254 0x00000020 0x00000002 a.out..note.ABI-tag
  0x00000004 regular [0x0000000000000274-0x0000000000000298) 0x00000274 0x00000024 0x00000002 a.out..note.gnu.build-id
  0x00000005 regular [0x0000000000000298-0x0000000000000528) 0x00000298 0x00000290 0x00000002 a.out..hash
  0x00000006 regular [0x0000000000000528-0x0000000000000584) 0x00000528 0x0000005c 0x00000002 a.out..gnu.hash
  0x00000007 regular [0x0000000000000588-0x0000000000000e70) 0x00000588 0x000008e8 0x00000002 a.out..dynsym
  0x00000008 regular [0x0000000000000e70-0x0000000000001973) 0x00000e70 0x00000b03 0x00000002 a.out..dynstr
  0x00000009 regular [0x0000000000001974-0x0000000000001a32) 0x00001974 0x000000be 0x00000002 a.out..gnu.version
  0x0000000a regular [0x0000000000001a38-0x0000000000001ac8) 0x00001a38 0x00000090 0x00000002 a.out..gnu.version_r
  0x0000000b regular [0x0000000000001ac8-0x0000000000001e70) 0x00001ac8 0x000003a8 0x00000002 a.out..rela.dyn
  0x0000000c regular [0x0000000000001e70-0x0000000000002578) 0x00001e70 0x00000708 0x00000002 a.out..rela.plt
  0x0000000d regular [0x0000000000002578-0x0000000000002590) 0x00002578 0x00000018 0x00000006 a.out..init
  0x0000000e regular [0x0000000000002590-0x0000000000002a50) 0x00002590 0x000004c0 0x00000006 a.out..plt
  0x0000000f code [0x0000000000002a50-0x0000000000004988) 0x00002a50 0x00001f38 0x00000006 a.out..text
  0x00000010 regular [0x0000000000004988-0x0000000000004996) 0x00004988 0x0000000e 0x00000006 a.out..fini
  0x00000011 regular [0x00000000000049a0-0x0000000000004b34) 0x000049a0 0x00000194 0x00000002 a.out..rodata
  0x00000012 regular [0x0000000000004b34-0x0000000000004ca8) 0x00004b34 0x00000174 0x00000002 a.out..eh_frame_hdr
  0x00000013 eh-frame [0x0000000000004ca8-0x0000000000005244) 0x00004ca8 0x0000059c 0x00000002 a.out..eh_frame
  0x00000014 regular [0x0000000000005244-0x00000000000058e0) 0x00005244 0x0000069c 0x00000002 a.out..gcc_except_table
  0x00000015 regular [0x0000000000205c48-0x0000000000205c58) 0x00005c48 0x00000010 0x00000003 a.out..ctors
  0x00000016 regular [0x0000000000205c58-0x0000000000205c68) 0x00005c58 0x00000010 0x00000003 a.out..dtors
  0x00000017 regular [0x0000000000205c68-0x0000000000205c70) 0x00005c68 0x00000008 0x00000003 a.out..jcr
  0x00000018 regular [0x0000000000205c70-0x0000000000205d88) 0x00005c70 0x00000118 0x00000003 a.out..data.rel.ro
  0x00000019 regular [0x0000000000205d88-0x0000000000205f88) 0x00005d88 0x00000200 0x00000003 a.out..dynamic
  0x0000001a regular [0x0000000000205f88-0x0000000000205fe0) 0x00005f88 0x00000058 0x00000003 a.out..got
  0x0000001b regular [0x0000000000205fe8-0x0000000000206258) 0x00005fe8 0x00000270 0x00000003 a.out..got.plt
  0x0000001c data [0x0000000000206258-0x0000000000206270) 0x00006258 0x00000018 0x00000003 a.out..data
  0x0000001d zero-fill [0x0000000000206270-0x0000000000207280) 0x00006270 0x00000000 0x00000003 a.out..bss
  0x0000001e regular [0x0000000000000000-0x0000000000000048) 0x00006270 0x00000048 0x00000030 a.out..comment
  0x0000001f dwarf-info [0x0000000000000000-0x000000000000dd7b) 0x000062b8 0x0000dd7b 0x00000000 a.out..debug_info
  0x00000020 dwarf-abbrev [0x0000000000000000-0x0000000000000502) 0x00014033 0x00000502 0x00000000 a.out..debug_abbrev
  0x00000021 dwarf-line [0x0000000000000000-0x0000000000000c33) 0x00014535 0x00000c33 0x00000000 a.out..debug_line
  0x00000022 dwarf-str [0x0000000000000000-0x0000000000013d9f) 0x00015168 0x00013d9f 0x00000030 a.out..debug_str
  0x00000023 dwarf-pubtypes [0x0000000000000000-0x0000000000000ba1) 0x00028f07 0x00000ba1 0x00000000 a.out..debug_pubtypes
  0x00000024 dwarf-ranges [0x0000000000000000-0x00000000000000f0) 0x00029aa8 0x000000f0 0x00000000 a.out..debug_ranges
  0x00000025 regular [0x0000000000000000-0x000000000000016c) 0x00029b98 0x0000016c 0x00000000 a.out..shstrtab
  0x00000026 regular [0x0000000000000000-0x0000000000001410) 0x0002a6c8 0x00001410 0x00000000 a.out..symtab
  0x00000027 regular [0x0000000000000000-0x0000000000001b0c) 0x0002bad8 0x00001b0c 0x00000000 a.out..strtab

After my recent check-in, we correctly prevent any section that doesn't have the SHF_ALLOC bit set in its flags from having a valid VM address. Now the sections look like:

% lldb a.out
Current executable set to 'a.out' (x86_64).
(lldb) image dump sections
Dumping sections for 1 modules.
Sections for '/private/tmp/lookup/a.out' (x86_64):
  SectID Type File Address File Off. File Size Flags Section Name
  ---------- ---------------- --------------------------------------- ---------- ---------- ---------- ----------------------------
  0x00000001 regular 0x00000000 0x00000000 0x00000000 a.out.
  0x00000002 regular [0x0000000000000238-0x0000000000000254) 0x00000238 0x0000001c 0x00000002 a.out..interp
  0x00000003 regular [0x0000000000000254-0x0000000000000274) 0x00000254 0x00000020 0x00000002 a.out..note.ABI-tag
  0x00000004 regular [0x0000000000000274-0x0000000000000298) 0x00000274 0x00000024 0x00000002 a.out..note.gnu.build-id
  0x00000005 regular [0x0000000000000298-0x0000000000000528) 0x00000298 0x00000290 0x00000002 a.out..hash
  0x00000006 regular [0x0000000000000528-0x0000000000000584) 0x00000528 0x0000005c 0x00000002 a.out..gnu.hash
  0x00000007 regular [0x0000000000000588-0x0000000000000e70) 0x00000588 0x000008e8 0x00000002 a.out..dynsym
  0x00000008 regular [0x0000000000000e70-0x0000000000001973) 0x00000e70 0x00000b03 0x00000002 a.out..dynstr
  0x00000009 regular [0x0000000000001974-0x0000000000001a32) 0x00001974 0x000000be 0x00000002 a.out..gnu.version
  0x0000000a regular [0x0000000000001a38-0x0000000000001ac8) 0x00001a38 0x00000090 0x00000002 a.out..gnu.version_r
  0x0000000b regular [0x0000000000001ac8-0x0000000000001e70) 0x00001ac8 0x000003a8 0x00000002 a.out..rela.dyn
  0x0000000c regular [0x0000000000001e70-0x0000000000002578) 0x00001e70 0x00000708 0x00000002 a.out..rela.plt
  0x0000000d regular [0x0000000000002578-0x0000000000002590) 0x00002578 0x00000018 0x00000006 a.out..init
  0x0000000e regular [0x0000000000002590-0x0000000000002a50) 0x00002590 0x000004c0 0x00000006 a.out..plt
  0x0000000f code [0x0000000000002a50-0x0000000000004988) 0x00002a50 0x00001f38 0x00000006 a.out..text
  0x00000010 regular [0x0000000000004988-0x0000000000004996) 0x00004988 0x0000000e 0x00000006 a.out..fini
  0x00000011 regular [0x00000000000049a0-0x0000000000004b34) 0x000049a0 0x00000194 0x00000002 a.out..rodata
  0x00000012 regular [0x0000000000004b34-0x0000000000004ca8) 0x00004b34 0x00000174 0x00000002 a.out..eh_frame_hdr
  0x00000013 eh-frame [0x0000000000004ca8-0x0000000000005244) 0x00004ca8 0x0000059c 0x00000002 a.out..eh_frame
  0x00000014 regular [0x0000000000005244-0x00000000000058e0) 0x00005244 0x0000069c 0x00000002 a.out..gcc_except_table
  0x00000015 regular [0x0000000000205c48-0x0000000000205c58) 0x00005c48 0x00000010 0x00000003 a.out..ctors
  0x00000016 regular [0x0000000000205c58-0x0000000000205c68) 0x00005c58 0x00000010 0x00000003 a.out..dtors
  0x00000017 regular [0x0000000000205c68-0x0000000000205c70) 0x00005c68 0x00000008 0x00000003 a.out..jcr
  0x00000018 regular [0x0000000000205c70-0x0000000000205d88) 0x00005c70 0x00000118 0x00000003 a.out..data.rel.ro
  0x00000019 regular [0x0000000000205d88-0x0000000000205f88) 0x00005d88 0x00000200 0x00000003 a.out..dynamic
  0x0000001a regular [0x0000000000205f88-0x0000000000205fe0) 0x00005f88 0x00000058 0x00000003 a.out..got
  0x0000001b regular [0x0000000000205fe8-0x0000000000206258) 0x00005fe8 0x00000270 0x00000003 a.out..got.plt
  0x0000001c data [0x0000000000206258-0x0000000000206270) 0x00006258 0x00000018 0x00000003 a.out..data
  0x0000001d zero-fill [0x0000000000206270-0x0000000000207280) 0x00006270 0x00000000 0x00000003 a.out..bss
  0x0000001e regular 0x00006270 0x00000048 0x00000030 a.out..comment
  0x0000001f dwarf-info 0x000062b8 0x0000dd7b 0x00000000 a.out..debug_info
  0x00000020 dwarf-abbrev 0x00014033 0x00000502 0x00000000 a.out..debug_abbrev
  0x00000021 dwarf-line 0x00014535 0x00000c33 0x00000000 a.out..debug_line
  0x00000022 dwarf-str 0x00015168 0x00013d9f 0x00000030 a.out..debug_str
  0x00000023 dwarf-pubtypes 0x00028f07 0x00000ba1 0x00000000 a.out..debug_pubtypes
  0x00000024 dwarf-ranges 0x00029aa8 0x000000f0 0x00000000 a.out..debug_ranges
  0x00000025 regular 0x00029b98 0x0000016c 0x00000000 a.out..shstrtab
  0x00000026 regular 0x0002a6c8 0x00001410 0x00000000 a.out..symtab
  0x00000027 regular 0x0002bad8 0x00001b0c 0x00000000 a.out..strtab

Note from the ".comment" section on down, they no longer show file addresses.
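
For reference, the check described above amounts to a single bit test on the section header flags. The sketch below is not the actual LLDB change; SectionHasVMAddress is a hypothetical name, and the SHF_ALLOC value (0x2) is taken from the ELF specification and defined locally so the snippet does not depend on <elf.h>.

```cpp
#include <cstdint>

// SHF_ALLOC from the ELF specification: the section occupies memory
// during process execution. Defined locally to stay self-contained.
constexpr std::uint64_t kShfAlloc = 0x2;

// Only sections with SHF_ALLOC set are mapped into the process, so only
// those should be given a VM (file) address range.
bool SectionHasVMAddress(std::uint64_t sh_flags) {
    return (sh_flags & kShfAlloc) != 0;
}
```

Applied to the Flags column in the dumps, ".text" (0x00000006) and ".data" (0x00000003) pass the test, while ".comment" (0x00000030) and ".debug_info" (0x00000000) do not.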

Now we can try some symbolication with LLDB on the binary:

% lldb a.out
Current executable set to 'a.out' (x86_64).
(lldb) target modules lo
Available completions:
  load
  lookup
(lldb) image lookup -va 0x0000000000002d90
      Address: a.out[0x0000000000002d90] (a.out..text + 832)
      Summary: a.out`main + 32 at main.cpp:176
       Module: file = "/private/tmp/lookup/a.out", arch = "x86_64"
  CompileUnit: id = {0x00000000}, file = "/usr/local/google/home/dvyukov/llvm/tools/lldb/examples/lookup/main.cpp", language = "ISO C++:1998"
     Function: id = {0x000018c2}, name = "main", range = [0x0000000000002d70-0x000000000000332c)
     FuncType: id = {0x000018c2}, decl = main.cpp:155, clang_type = "int (int, const char **)"
       Blocks: id = {0x000018c2}, range = [0x00002d70-0x0000332c)
               id = {0x000018fc}, ranges = [0x00002d90-0x000032e8)[0x000032f4-0x0000332c)
    LineEntry: [0x0000000000002d90-0x0000000000002de6): /usr/local/google/home/dvyukov/llvm/tools/lldb/examples/lookup/main.cpp:176:68
       Symbol: uid={ 167}, range = [0x0000000000002d70-0x000000000000332c), name="main"
     Variable: id = {0x00001901}, name = "ls", type= "StreamSP", location = DW_OP_fbreg(-80), decl = main.cpp:176
     Variable: id = {0x00001910}, name = "la", type= "const char *", location = DW_OP_fbreg(-104), decl = main.cpp:177
     Variable: id = {0x0000191f}, name = "log", type= "LogSP", location = DW_OP_fbreg(-120), decl = main.cpp:178
     Variable: id = {0x0000192e}, name = "exe_file_path", type= "const char *", location = DW_OP_fbreg(-128), decl = main.cpp:185
     Variable: id = {0x0000193d}, name = "file_addr", type= "addr_t", location = DW_OP_fbreg(-136), decl = main.cpp:187
     Variable: id = {0x0000194c}, name = "debugger", type= "SBDebugger", location = DW_OP_fbreg(-168), decl = main.cpp:193
     Variable: id = {0x000018e0}, name = "argc", type= "int", location = DW_OP_fbreg(-56), decl = main.cpp:154
     Variable: id = {0x000018ee}, name = "argv", type= "const char **", location = DW_OP_fbreg(-64), decl = main.cpp:154

Note that this lookup succeeded before we set the load address. In your main.cpp that is attached to the bug, you then slide all of the sections by a given address using:

    target.SetModuleLoadAddress(module, (lldb::addr_t)info->dlpi_addr);

I can simulate sliding the entire image by 0x1000 by using the "target modules load --slide <slide>" command:

(lldb) target modules load --file a.out --slide 0x1000
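
The arithmetic behind a slide is simple enough to spell out. The helper names below are hypothetical; the sketch assumes a single slide applies to the whole module, as with dlpi_addr or --slide above.

```cpp
#include <cstdint>

// A slide shifts every loaded section of a module by the same amount.
std::uint64_t FileToLoadAddress(std::uint64_t file_addr, std::uint64_t slide) {
    return file_addr + slide;
}

// Inverse mapping: what a symbolizer needs before consulting debug info,
// which is expressed in file addresses.
std::uint64_t LoadToFileAddress(std::uint64_t load_addr, std::uint64_t slide) {
    return load_addr - slide;
}
```

With a slide of 0x1000, the file address 0x2d90 looked up earlier corresponds to the load address 0x3d90, and ".text" moves from 0x2a50 to 0x3a50.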

Now if we dump the sections we see them at their "load" address:

(lldb) image dump sections
Dumping sections for 1 modules.
Sections for '/private/tmp/lookup/a.out' (x86_64):
  SectID Type Load Address File Off. File Size Flags Section Name
  ---------- ---------------- --------------------------------------- ---------- ---------- ---------- ----------------------------
  0x00000001 regular 0x00000000 0x00000000 0x00000000 a.out.
  0x00000002 regular [0x0000000000001238-0x0000000000001254) 0x00000238 0x0000001c 0x00000002 a.out..interp
  0x00000003 regular [0x0000000000001254-0x0000000000001274) 0x00000254 0x00000020 0x00000002 a.out..note.ABI-tag
  0x00000004 regular [0x0000000000001274-0x0000000000001298) 0x00000274 0x00000024 0x00000002 a.out..note.gnu.build-id
  0x00000005 regular [0x0000000000001298-0x0000000000001528) 0x00000298 0x00000290 0x00000002 a.out..hash
  0x00000006 regular [0x0000000000001528-0x0000000000001584) 0x00000528 0x0000005c 0x00000002 a.out..gnu.hash
  0x00000007 regular [0x0000000000001588-0x0000000000001e70) 0x00000588 0x000008e8 0x00000002 a.out..dynsym
  0x00000008 regular [0x0000000000001e70-0x0000000000002973) 0x00000e70 0x00000b03 0x00000002 a.out..dynstr
  0x00000009 regular [0x0000000000002974-0x0000000000002a32) 0x00001974 0x000000be 0x00000002 a.out..gnu.version
  0x0000000a regular [0x0000000000002a38-0x0000000000002ac8) 0x00001a38 0x00000090 0x00000002 a.out..gnu.version_r
  0x0000000b regular [0x0000000000002ac8-0x0000000000002e70) 0x00001ac8 0x000003a8 0x00000002 a.out..rela.dyn
  0x0000000c regular [0x0000000000002e70-0x0000000000003578) 0x00001e70 0x00000708 0x00000002 a.out..rela.plt
  0x0000000d regular [0x0000000000003578-0x0000000000003590) 0x00002578 0x00000018 0x00000006 a.out..init
  0x0000000e regular [0x0000000000003590-0x0000000000003a50) 0x00002590 0x000004c0 0x00000006 a.out..plt
  0x0000000f code [0x0000000000003a50-0x0000000000005988) 0x00002a50 0x00001f38 0x00000006 a.out..text
  0x00000010 regular [0x0000000000005988-0x0000000000005996) 0x00004988 0x0000000e 0x00000006 a.out..fini
  0x00000011 regular [0x00000000000059a0-0x0000000000005b34) 0x000049a0 0x00000194 0x00000002 a.out..rodata
  0x00000012 regular [0x0000000000005b34-0x0000000000005ca8) 0x00004b34 0x00000174 0x00000002 a.out..eh_frame_hdr
  0x00000013 eh-frame [0x0000000000005ca8-0x0000000000006244) 0x00004ca8 0x0000059c 0x00000002 a.out..eh_frame
  0x00000014 regular [0x0000000000006244-0x00000000000068e0) 0x00005244 0x0000069c 0x00000002 a.out..gcc_except_table
  0x00000015 regular [0x0000000000206c48-0x0000000000206c58) 0x00005c48 0x00000010 0x00000003 a.out..ctors
  0x00000016 regular [0x0000000000206c58-0x0000000000206c68) 0x00005c58 0x00000010 0x00000003 a.out..dtors
  0x00000017 regular [0x0000000000206c68-0x0000000000206c70) 0x00005c68 0x00000008 0x00000003 a.out..jcr
  0x00000018 regular [0x0000000000206c70-0x0000000000206d88) 0x00005c70 0x00000118 0x00000003 a.out..data.rel.ro
  0x00000019 regular [0x0000000000206d88-0x0000000000206f88) 0x00005d88 0x00000200 0x00000003 a.out..dynamic
  0x0000001a regular [0x0000000000206f88-0x0000000000206fe0) 0x00005f88 0x00000058 0x00000003 a.out..got
  0x0000001b regular [0x0000000000206fe8-0x0000000000207258) 0x00005fe8 0x00000270 0x00000003 a.out..got.plt
  0x0000001c data [0x0000000000207258-0x0000000000207270) 0x00006258 0x00000018 0x00000003 a.out..data
  0x0000001d zero-fill [0x0000000000207270-0x0000000000208280) 0x00006270 0x00000000 0x00000003 a.out..bss
  0x0000001e regular 0x00006270 0x00000048 0x00000030 a.out..comment
  0x0000001f dwarf-info 0x000062b8 0x0000dd7b 0x00000000 a.out..debug_info
  0x00000020 dwarf-abbrev 0x00014033 0x00000502 0x00000000 a.out..debug_abbrev
  0x00000021 dwarf-line 0x00014535 0x00000c33 0x00000000 a.out..debug_line
  0x00000022 dwarf-str 0x00015168 0x00013d9f 0x00000030 a.out..debug_str
  0x00000023 dwarf-pubtypes 0x00028f07 0x00000ba1 0x00000000 a.out..debug_pubtypes
  0x00000024 dwarf-ranges 0x00029aa8 0x000000f0 0x00000000 a.out..debug_ranges
  0x00000025 regular 0x00029b98 0x0000016c 0x00000000 a.out..shstrtab
  0x00000026 regular 0x0002a6c8 0x00001410 0x00000000 a.out..symtab
  0x00000027 regular 0x0002bad8 0x00001b0c 0x00000000 a.out..strtab

And if we look up an address in main again, we see the correct results:

(lldb) image lookup -va 0x0000000000003d90
      Address: a.out[0x0000000000002d90] (a.out..text + 832)
      Summary: a.out`main + 32 at main.cpp:176
       Module: file = "/private/tmp/lookup/a.out", arch = "x86_64"
  CompileUnit: id = {0x00000000}, file = "/usr/local/google/home/dvyukov/llvm/tools/lldb/examples/lookup/main.cpp", language = "ISO C++:1998"
     Function: id = {0x000018c2}, name = "main", range = [0x0000000000003d70-0x000000000000432c)
     FuncType: id = {0x000018c2}, decl = main.cpp:155, clang_type = "int (int, const char **)"
       Blocks: id = {0x000018c2}, range = [0x00003d70-0x0000432c)
               id = {0x000018fc}, ranges = [0x00003d90-0x000042e8)[0x000042f4-0x0000432c)
    LineEntry: [0x0000000000003d90-0x0000000000003de6): /usr/local/google/home/dvyukov/llvm/tools/lldb/examples/lookup/main.cpp:176:68
       Symbol: uid={ 167}, range = [0x0000000000003d70-0x000000000000432c), name="main"
     Variable: id = {0x00001901}, name = "ls", type= "StreamSP", location = DW_OP_fbreg(-80), decl = main.cpp:176
     Variable: id = {0x00001910}, name = "la", type= "const char *", location = DW_OP_fbreg(-104), decl = main.cpp:177
     Variable: id = {0x0000191f}, name = "log", type= "LogSP", location = DW_OP_fbreg(-120), decl = main.cpp:178
     Variable: id = {0x0000192e}, name = "exe_file_path", type= "const char *", location = DW_OP_fbreg(-128), decl = main.cpp:185
     Variable: id = {0x0000193d}, name = "file_addr", type= "addr_t", location = DW_OP_fbreg(-136), decl = main.cpp:187
     Variable: id = {0x0000194c}, name = "debugger", type= "SBDebugger", location = DW_OP_fbreg(-168), decl = main.cpp:193
     Variable: id = {0x000018e0}, name = "argc", type= "int", location = DW_OP_fbreg(-56), decl = main.cpp:154
     Variable: id = {0x000018ee}, name = "argv", type= "const char **", location = DW_OP_fbreg(-64), decl = main.cpp:154

Dmitry: are you seeing an address that does not resolve correctly when you look it up? If you can print out the address at which your PIE executable is loaded by uncommenting:

  //printf("LOADING '%s' at %p\n", info->dlpi_name, (void*)info->dlpi_addr);

And send me the addresses that you are then looking up, I might be able to help. From what I see, though, this is working as expected (after my fixes):

% svn commit
Sending source/Plugins/ObjectFile/ELF/ObjectFileELF.cpp
Sending source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.cpp
Sending source/Symbol/ObjectFile.cpp
Transmitting file data ...
Committed revision 153496.

Let me know if this now works for you.
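
In case it helps when collecting those addresses: dlpi_name and dlpi_addr are fields of struct dl_phdr_info, so I am assuming the commented-out printf sits in a dl_iterate_phdr callback. Below is a rough, Linux/glibc-only sketch of such a callback. PrintLoadAddress, PrintAllLoadAddresses and RuntimeToFileAddress are hypothetical names for illustration, and the subtraction assumes dlpi_addr is the load bias of the module, which is how glibc documents it.

```cpp
#include <link.h>  // dl_iterate_phdr, struct dl_phdr_info (glibc)
#include <cstdint>
#include <cstdio>

// Called once per loaded object. dlpi_name is an empty string for the
// main executable; dlpi_addr is the base address the object was loaded at.
static int PrintLoadAddress(struct dl_phdr_info *info, size_t /*size*/,
                            void *data) {
  int *count = static_cast<int *>(data);
  printf("LOADING '%s' at %p\n", info->dlpi_name, (void *)info->dlpi_addr);
  ++*count;
  return 0;  // returning non-zero would stop the iteration
}

// Prints the load address of every loaded object and returns how many
// objects were visited.
int PrintAllLoadAddresses() {
  int count = 0;
  dl_iterate_phdr(PrintLoadAddress, &count);
  return count;
}

// Assumption: subtracting the load bias from a runtime PC gives the file
// address that can be looked up against the on-disk module.
uint64_t RuntimeToFileAddress(uint64_t pc, uint64_t load_bias) {
  return pc - load_bias;
}
```

For a PIE executable the load bias is normally non-zero, so a runtime PC has to go through something like RuntimeToFileAddress before it matches the file addresses shown in the section dump.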

Greg Clayton