[RFC] Extend ObjectFile interface.

*,

I have been prototyping a DynamicLoader plugin for the Linux platform.
The current state of the effort is available here:

  https://github.com/eightcien/lldb/tree/lldb-linux

On ELF platforms there is a "rendezvous" structure in a process's
address space, populated by the runtime linker for use by debuggers.
This structure provides the address of a function which is called by the
linker each time a shared object is loaded and unloaded (thus a
breakpoint at that address will let a debugger intercept such events), a
list of entries describing the currently loaded shared objects, plus a
few other things.

In order to locate this structure one must interrogate the object file
for the process. The location of this rendezvous is provided by a
DT_DEBUG entry in the process's .dynamic section. The actual address
must be read from the loaded .dynamic section contents of the
process.
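
To make the lookup concrete, here is a minimal sketch of scanning a copy
of the loaded .dynamic contents for DT_DEBUG. It assumes a 64-bit ELF
image in host byte order and that the section bytes have already been
read from the inferior. DynEntry and FindRendezvousAddress are
hypothetical names for illustration, not part of LLDB or the prototype;
only the tag values (DT_NULL = 0, DT_DEBUG = 21) and the two-field entry
layout come from the ELF specification.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Tag values defined by the ELF gABI.
const std::int64_t kDtNull  = 0;   // DT_NULL: marks the end of the array
const std::int64_t kDtDebug = 21;  // DT_DEBUG: value filled in by the runtime linker

// Hypothetical stand-in with the same field layout as Elf64_Dyn.
struct DynEntry {
    std::int64_t  tag;    // d_tag
    std::uint64_t value;  // d_un.d_val / d_un.d_ptr
};

// Scan the entries read from the *loaded* .dynamic section and report the
// DT_DEBUG value, i.e. the address of the rendezvous structure.
// Returns false if there is no DT_DEBUG entry, or if its value is still 0
// (the linker has not populated it yet).
bool FindRendezvousAddress(const DynEntry *entries, std::size_t count,
                           std::uint64_t &rendezvous_addr)
{
    for (std::size_t i = 0; i < count && entries[i].tag != kDtNull; ++i) {
        if (entries[i].tag == kDtDebug) {
            if (entries[i].value == 0)
                return false;
            rendezvous_addr = entries[i].value;
            return true;
        }
    }
    return false;
}
```

The zero check matters because the on-disk .dynamic section carries a
placeholder for DT_DEBUG; only the in-memory copy holds a usable address.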

The first proposal is to add the following method to ObjectFile, which
allows us to locate the needed information:

   //------------------------------------------------------------------
   /// Similar to Process::GetImageInfoAddress().
   ///
   /// Some platforms embed auxiliary structures useful to debuggers in the
   /// address space of the inferior process. This method returns the address
   /// of such a structure if the information can be resolved via entries in
   /// the object file. ELF, for example, provides a means to hook into the
   /// runtime linker so that a debugger may monitor the loading and unloading
   /// of shared libraries.
   ///
   /// @return
   /// The address of any auxiliary tables, or an invalid address if this
   /// object file format does not support or contain such information.
   //------------------------------------------------------------------
   virtual lldb_private::Address
   GetImageInfoAddress () { return Address(); }

In my prototype the above method is implemented by ObjectFileELF, and
Process::GetImageInfoAddress in turn uses this information.

I use an Address object instead of an lldb::addr_t as it is slightly
more generic, and perhaps more useful, to represent the info as a
"section load address + offset".

Yes, this looks fine. The "Address" is the right thing to use since it can point you back to your module.

A quick note to everyone: if you have an Address, you can get the module it comes from, which in turn gets you to your object and symbol files:

Address addr (...);
Module *module = addr.GetModule();
// Check the module in case "addr" was not section/offset (the section might be NULL)
if (module)
{
    ObjectFile *objfile = module->GetObjectFile();
    SymbolVendor *symbols = module->GetSymbolVendor();
    ...
}

Since addresses are stored as section/offset pairs, when shared libraries get loaded the dynamic loader plug-ins tell the target where each section got loaded. We can then resolve addresses as needed.

For example, if you have a PC value 0x100020 that you got from a register in a thread, you can then resolve it:

addr_t pc = 0x100020;
Address pc_addr;
if (thread.GetProcess().GetTarget().GetSectionLoadList().ResolveLoadAddress(pc, pc_addr))
{
    // The address was successfully resolved and now contains a "Section *" that can point
    // you to the Module/ObjectFile/Symbols...
}
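
The section/offset idea can also be modelled outside of LLDB. The sketch below is a toy illustration only: ToySection, ToyAddress and ToySectionLoadList are hypothetical stand-ins, not the real LLDB classes, and the lookup strategy (an ordered map keyed by load address) is an assumption rather than a description of how the target's section load list is implemented.

```cpp
#include <cassert>
#include <cstdint>
#include <map>

// Hypothetical stand-ins; these are NOT the LLDB classes of similar name.
struct ToySection {
    std::uint64_t file_addr;  // address recorded in the object file
    std::uint64_t size;       // size of the section in bytes
};

struct ToyAddress {
    const ToySection *section;  // null while unresolved
    std::uint64_t offset;       // offset from the start of the section
};

class ToySectionLoadList {
public:
    // Called by the "dynamic loader" once it learns where a section landed.
    void SetSectionLoadAddress(const ToySection *section, std::uint64_t load_addr) {
        m_load_to_section[load_addr] = section;
    }

    // Map a raw load address (e.g. a PC) back to section + offset.
    // Fails if no registered section covers the address.
    bool ResolveLoadAddress(std::uint64_t load_addr, ToyAddress &out) const {
        std::map<std::uint64_t, const ToySection *>::const_iterator it =
            m_load_to_section.upper_bound(load_addr);
        if (it == m_load_to_section.begin())
            return false;  // every registered section starts above load_addr
        --it;              // last section starting at or below load_addr
        const std::uint64_t offset = load_addr - it->first;
        if (offset >= it->second->size)
            return false;  // address falls past the end of that section
        out.section = it->second;
        out.offset = offset;
        return true;
    }

private:
    std::map<std::uint64_t, const ToySection *> m_load_to_section;
};
```

In this model a PC of 0x100020 cannot be resolved until some section has been registered at a load address that covers it, which is why the dynamic loader plug-in has to report section load addresses before lookups can succeed.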

The second bit is to provide the DynamicLoader plugin with a method to
synchronize with the runtime linker. A simple strategy is to set a
breakpoint on the entry address for the executable and to parse the
rendezvous structure after the linker has had a chance to populate it.
Hence the following addition:

   //------------------------------------------------------------------
   /// Returns the virtual address of the entry point for this object
   /// file.
   ///
   /// @return
   /// The virtual address of the entry point or
   /// LLDB_INVALID_ADDRESS if an entry point is not defined.
   //------------------------------------------------------------------
   virtual lldb::addr_t
   GetEntryPoint () const { return LLDB_INVALID_ADDRESS; }

The above two methods are enough to get a minimal DynamicLoader plugin
functioning on Linux. I would very much appreciate any
feedback/comments on the above.

I would rather see this added as:

   virtual Address
   GetEntryPoint () const { return Address(); }

So that you know what module the entry point came from. The Linux dynamic loader would then just know that the file and load addresses for a Linux executable (not a shared library, just executable files) are the same (is this true? Do the virtual addresses in the ELF file for an executable never change?), and it could get the entry point as an addr_t by doing:

ModuleSP exe_module_sp (target->GetExecutableModule ());
if (exe_module_sp)
{
  ObjectFile *exe_objfile = exe_module_sp->GetObjectFile();
  if (exe_objfile)
  {
    Address entry_point (exe_objfile->GetEntryPoint());
    addr_t entry_point_load_addr = entry_point.GetFileAddress();
  }
}

The other thing you might want to do in the Linux dynamic loader plug-in, if you know that the executable is always at the same address as the virtual addresses in the file, is to register the load address of the sections automatically as soon as you construct the DynamicLoaderLinux object. When you set a breakpoint, it will eventually turn into a section/offset breakpoint, and the breakpoint won't set itself until the section gets loaded.

Let me know if you have any questions on this.

Greg Clayton

A few quick questions:

Regarding:

  virtual lldb_private::Address
  GetImageInfoAddress () { return Address(); }

Is this value not in any symbol in the symbol table in the ELF file? If it is a symbol, and the ELF plug-in could classify the symbol type as a special symbol type (see Symbol.h), then could we do it this way?

Greg Clayton <gclayton@apple.com> writes:
[snip]

The second bit is to provide the DynamicLoader plugin with a method to
synchronize with the runtime linker. A simple strategy is to set a
breakpoint on the entry address for the executable and to parse the
rendezvous structure after the linker has had a chance to populate it.
Hence the following addition:

   //------------------------------------------------------------------
   /// Returns the virtual address of the entry point for this object
   /// file.
   ///
   /// @return
   /// The virtual address of the entry point or
   /// LLDB_INVALID_ADDRESS if an entry point is not defined.
   //------------------------------------------------------------------
   virtual lldb::addr_t
   GetEntryPoint () const { return LLDB_INVALID_ADDRESS; }

The above two methods are enough to get a minimal DynamicLoader plugin
functioning on Linux. I would very much appreciate any
feedback/comments on the above.

I would rather see this added as:

   virtual Address
   GetEntryPoint () const { return Address(); }

So that you know what module the entry point came from. The Linux
dynamic loader would then just know that the file and load addresses
for a Linux executable (not a shared library, just executable files)
are the same (is this true? Do the virtual addresses in the ELF file
for an executable never change?), and it could get the entry point as
an addr_t by doing:

OK, using an Address does make sense. One issue that I have not sorted
out yet is how to deal with position-independent executables (so yes,
the virtual addresses in the ELF file for an executable can change in
this case). I will try to have initial support for PIEs in the
preliminary DynamicLoader plugin.
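
As a rough illustration of the PIE case: the whole image is shifted by a
single constant (the load bias), so one known file/load address pair is
enough to translate any other file address. The helper names below are
hypothetical, and where the known pair comes from is an assumption (one
possible source is the AT_ENTRY auxiliary vector value compared against
the entry address in the ELF header); the thread does not settle this.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helpers, not part of LLDB.

// Load bias = where something actually is in memory minus where the file
// says it should be. For a non-PIE executable this is zero.
std::uint64_t ComputeLoadBias(std::uint64_t known_load_addr,
                              std::uint64_t known_file_addr)
{
    return known_load_addr - known_file_addr;
}

// Translate any other file address in the same image to its load address.
std::uint64_t FileToLoadAddress(std::uint64_t file_addr,
                                std::uint64_t load_bias)
{
    return file_addr + load_bias;
}
```

With a zero bias the file and load addresses coincide, which is the
non-PIE situation assumed in the snippet that follows.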

ModuleSP exe_module_sp (target->GetExecutableModule ());
if (exe_module_sp)
{
  ObjectFile *exe_objfile = exe_module_sp->GetObjectFile();
  if (exe_objfile)
  {
    Address entry_point (exe_objfile->GetEntryPoint());
    addr_t entry_point_load_addr = entry_point.GetFileAddress();
  }
}

The other thing you might want to do in the Linux dynamic loader
plug-in, if you know that the executable is always at the same address
as the virtual addresses in the file, is to register the load address
of the sections automatically as soon as you construct the
DynamicLoaderLinux object. When you set a breakpoint, it will
eventually turn into a section/offset breakpoint, and the breakpoint
won't set itself until the section gets loaded.

Currently I am waiting until the process is launched or attached before
updating the section load lists -- will adjust.

Thanks so much for your input!

Greg Clayton <gclayton@apple.com> writes:

A few quick questions:

Regarding:

  virtual lldb_private::Address
  GetImageInfoAddress () { return Address(); }

Is this value not in any symbol in the symbol table in the ELF file?
If it is a symbol, and the ELF plug-in could classify the symbol type
as a special symbol type (see Symbol.h), then could we do it this way?

Unfortunately, no, there is no symbol. We do have _DYNAMIC which gives
the start address of the .dynamic section, but we still need to parse
the contents. It seems natural to me to do that parsing/processing in
the ObjectFile class.

However, it may still be useful to export _DYNAMIC as a special symbol
type. I have not thought about it as an option. I will keep it in mind
as I work on the dynamic loader plugin.

Thanks!