Hi,
I've been toying around with loading libraries and what I can do with lldb, but it seems some of the support isn't there:
- I can load a library from a command, but the only thing I get is a "token" (the return of dlopen());
- I can't (as far as I can tell) find the address of the GOT entry for a function (the one the dynamic linker changes on first invocation; these seem to live in the __DATA,__la_symbol_ptr section), but…
On Mach-O you can at least see the stubs (locations that contain the lazy pointer indirections), as they are marked as "Trampoline" symbols:
(lldb) target modules dump symtab a.out
Symtab, file = /Volumes/work/gclayton/Documents/src/attach/a.out, num_symbols = 18:
               Debug symbol
               |Synthetic symbol
               ||Externally Visible
               |||
Index   UserID DSX Type         File Address/Value Load Address       Size               Flags      Name
------- ------ --- ------------ ------------------ ------------------ ------------------ ---------- ----------------------------------
[    0]      0 D   SourceFile   0x0000000000000000                    Sibling -> [    4] 0x00640000 /Volumes/work/gclayton/Documents/src/attach/test.c
[    1]      2 D   ObjectFile   0x000000004e440e1e                    0x0000000000000000 0x00660001 /Volumes/work/gclayton/Documents/src/attach/test.o
[    2]      4 D   Code         0x0000000100000d80                    0x0000000000000070 0x000f0000 sleep_loop
[    3]      8 D   Code         0x0000000100000df0                    0x0000000000000066 0x000f0000 main
[    4]     12     Data         0x0000000100001000                    0x0000000000000000 0x000e0000 pvars
[    5]     13   X Data         0x0000000100001068                    0x0000000000000000 0x000f0000 NXArgc
[    6]     14   X Data         0x0000000100001070                    0x0000000000000000 0x000f0000 NXArgv
[    7]     15   X Data         0x0000000100001080                    0x0000000000000000 0x000f0000 __progname
[    8]     16   X Absolute     0x0000000100000000                    0x0000000000000000 0x00030010 _mh_execute_header
[    9]     17   X Data         0x0000000100001078                    0x0000000000000000 0x000f0000 environ
[   10]     20   X Code         0x0000000100000d40                    0x0000000000000000 0x000f0000 start
[   11]     21     Trampoline   0x0000000100000e56                    0x0000000000000006 0x00010100 exit
[   12]     22     Trampoline   0x0000000100000e5c                    0x0000000000000006 0x00010100 getchar
[   13]     23     Trampoline   0x0000000100000e62                    0x0000000000000006 0x00010100 getpid
[   14]     24     Trampoline   0x0000000100000e68                    0x0000000000000006 0x00010100 printf
[   15]     25     Trampoline   0x0000000100000e6e                    0x0000000000000006 0x00010100 puts
[   16]     26     Trampoline   0x0000000100000e74                    0x0000000000000006 0x00010100 sleep
[   17]     27   X Extern       0x0000000000000000                    0x0000000000000000 0x00010100 dyld_stub_binder
Symbols 11 - 16 above are the stub entries that all calls to "exit", "getchar", etc. go through.
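To make the stub / lazy pointer relationship concrete, here is a toy model in plain C++. It is only a sketch: a function pointer stands in for one slot in __DATA,__la_symbol_ptr and a small wrapper stands in for the 6-byte trampoline. Every name in it (lazy_pointer, stub, rebind, original_impl, replacement_impl) is made up for illustration and is not part of lldb or dyld.

```cpp
#include <cassert>

// Toy model only: all names below are hypothetical.
using IntFn = int (*)(int);

static int original_impl(int x) { return x + 1; }
static int replacement_impl(int x) { return x * 100; }

// Stands in for one lazy pointer slot in __DATA,__la_symbol_ptr.
static IntFn lazy_pointer = original_impl;

// Stands in for the "Trampoline" stub: an indirect jump through the slot.
static int stub(int x) { return lazy_pointer(x); }

// Rewriting the slot redirects every call that goes through the stub.
static void rebind(IntFn new_target) { lazy_pointer = new_target; }
```

Calling stub(1) gives 2 before rebind(replacement_impl) and 100 after it, while a direct call to original_impl(1) still gives 2. That last part is the catch: callers that reach a function without going through a stub are not affected by rewriting the slot.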
- Substituting the address in the GOT wouldn't work. I'll have to turn the original function into a jump to the new one. Nothing is in place for that;
You will need to write memory manually for now, but it should be doable. You could add a new function to the ABI plug-ins, starting with a default implementation in the main ABI.h:
// In lldb/Target/ABI.h, inside the ABI class declaration:
virtual bool
UpdateGOT (const char *func_name, ModuleList *modules, lldb::addr_t new_func_addr)
{
    // Default for ABIs that do not know how to patch the GOT.
    return false;
}
Then override it in the x86_64 ABI plug-in to do the right thing:
lldb/source/Plugins/ABI/SysV-x86_64/ABISysV_x86_64.h
lldb/source/Plugins/ABI/SysV-x86_64/ABISysV_x86_64.cpp
If you don't end up overwriting the original function, the "modules" parameter could be useful: you might be able to take over, say, "printf", but only for "a.out" and not for other shared libraries. So if "modules" is NULL, apply the new function to all modules; otherwise, only try to apply it to the modules in the list. Just an idea...
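The NULL-versus-list behaviour could look roughly like the sketch below. It is a stand-alone toy, not lldb code: ToyModule and UpdateSlots are hypothetical names, and a std::map stands in for each image's lazy pointer slots so the selection logic can be seen on its own.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for one loaded image and its lazy pointer slots.
struct ToyModule {
    std::string name;
    std::map<std::string, uint64_t> slots;  // function name -> bound address
};

// modules == nullptr: patch every image that has a slot for func_name.
// Otherwise: patch only the images whose name is in the list.
// Returns true if at least one slot was updated.
static bool UpdateSlots(std::vector<ToyModule> &images,
                        const std::string &func_name,
                        const std::vector<std::string> *modules,
                        uint64_t new_func_addr) {
    bool updated = false;
    for (ToyModule &image : images) {
        if (modules &&
            std::find(modules->begin(), modules->end(), image.name) == modules->end())
            continue;  // image not selected by the caller
        auto it = image.slots.find(func_name);
        if (it == image.slots.end())
            continue;  // this image never calls func_name
        it->second = new_func_addr;
        updated = true;
    }
    return updated;
}
```

With two images that both have a "printf" slot, passing a list containing only "a.out" rewrites that image's slot and leaves the other one alone; passing nullptr rewrites both.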
- I found one email from Jason Molenda where he explained how they implemented F&C on gdb (Jason Molenda - Re: Howdy from Apple; Fix and Continue implemented Yet Again), and am trying to do something similar. But it seems that the current dyld implementation doesn't have a flag to not run global constructors (or re-register ObjC classes), and NSLinkModule was deprecated, so these cases would not work.
I wanted to continue this work, but I have some doubts…
There are plenty of issues with all ways of doing things, yes...
How could I get a handle (on my CommandObject) to the library loaded with dlopen? (It can have the same file name as an already loaded library, how can I tell which is which?)
If it is impossible, any ideas on how to add that feature?
Why do you need the handle?
After that, the easy way to replace the functions would be to get the symbols (at least for functions) that are defined in the recently loaded image and turn the current functions into jumps to the new functions.
That is a good way if you don't want to call the original function. I have always wanted to "listen" to the malloc/free calls by making my own versions of malloc/free that do a little data gathering and yet still call through to the original functions.
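For the "turn the old function into a jump" approach, the bytes to write could be built along these lines. This is a sketch that assumes x86_64 only; MakeAbsoluteJump is a hypothetical helper name, not something that exists in lldb. It encodes an indirect jump through the 8 bytes that immediately follow the instruction (FF 25 00 00 00 00, then the absolute target), so the destination can be anywhere in the address space.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical helper: builds a 14-byte x86_64 patch consisting of
// `jmp qword ptr [rip+0]` followed by the 64-bit absolute target address.
static std::array<uint8_t, 14> MakeAbsoluteJump(uint64_t target) {
    // FF 25 <disp32 = 0>: jump through the quadword right after this instruction.
    std::array<uint8_t, 14> patch = {0xFF, 0x25, 0x00, 0x00, 0x00, 0x00};
    // Append the target address in little-endian byte order.
    for (int i = 0; i < 8; ++i)
        patch[6 + i] = static_cast<uint8_t>(target >> (8 * i));
    return patch;
}
```

The resulting bytes would then be written over the entry point of the old function, for example with "memory write". That assumes the old function is at least 14 bytes long (sleep_loop and main in the dump above are 0x70 and 0x66 bytes) and that no thread is executing inside the overwritten range. Since the patch destroys the start of the original function, calling through to it afterwards would need the overwritten bytes to be saved somewhere first.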
Hope some of the above hints help.
Greg