This week I have been seeing lldb-server report an Illegal Instruction signal when a shared library is loaded while debugging an ARM Android executable.
I have traced this back to the DYLD Rendezvous breakpoint being inserted as an ARM breakpoint when it should be THUMB.
I have attached a very rough patch that will fix this problem, but I wanted to check with the list to try and find a robust solution that will work in all cases.
Here are the key parts of the problem:
- The rendezvous breakpoint address is not a public symbol, so it is marked as eAddressClassUnknown.
- When the rendezvous breakpoint address is read from the DT_DEBUG structure it has bit 0 set which marks it as THUMB.
- Once the address is found, it calls into Process::CreateBreakpointSite(), which in turn calls Target::GetOpcodeLoadAddress().
- GetOpcodeLoadAddress() checks if the machine type is ARM and will mask out bit 0, at which point all information that it is a THUMB address has been lost.
- Down the line, PlatformLinux::GetSoftwareBreakpointTrapOpcode() will see that bit 0 is not set and so will insert an ARM trap opcode.
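
The sequence above can be modelled in a few lines of C++. This is only a simplified sketch of the behaviour described in the list, not LLDB code: the names AddressClass, TrapKind, OpcodeLoadAddress and ChooseTrap are hypothetical stand-ins, and the real functions take more context than a bare address.

```cpp
#include <cstdint>

// Hypothetical stand-ins for the LLDB concepts named above (not the real API).
enum class AddressClass { Unknown, Code, CodeAlternateISA };
enum class TrapKind { Arm, Thumb };

// Models the described behaviour of Target::GetOpcodeLoadAddress() on ARM:
// bit 0 is cleared unconditionally.
uint64_t OpcodeLoadAddress(uint64_t addr) { return addr & ~uint64_t{1}; }

// Models the described trap selection: with no symbol information
// (AddressClass::Unknown), bit 0 is the only remaining hint for Thumb.
TrapKind ChooseTrap(uint64_t addr, AddressClass cls) {
  if (cls == AddressClass::CodeAlternateISA) return TrapKind::Thumb;
  if (cls == AddressClass::Unknown && (addr & 1)) return TrapKind::Thumb;
  return TrapKind::Arm;
}
```

Feeding the raw DT_DEBUG address straight into ChooseTrap selects Thumb, while feeding it through OpcodeLoadAddress first selects ARM, which is the failure described above.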
My temporary solution was to not strip bit 0 from an address marked as eAddressClassUnknown in GetOpcodeLoadAddress(), and also to make the check in GetSoftwareBreakpointTrapOpcode() more relaxed.
I feel this may have repercussions in other scenarios I'm not considering.
Does anyone have any thoughts or advice for a good solution to this problem?
dyld_thumb_fix.patch (2.83 KB)
The issue you are seeing is caused by a different problem than the one you described. The dynamic linker on Android Lollipop reports the wrong load address for itself: 0 instead of the correct load address (it should work on KitKat, and it is already fixed in AOSP TOT). This means that we don't have the right symbol information for the address where we want to set the rendezvous breakpoint, which is the root cause of that address being classed as eAddressClassUnknown. We are already working on a workaround that fetches the load address of '/system/bin/linker' (and of any other .so for which we don't have a load address) from the /proc/$pid/maps file, as that file is filled in correctly, but it will take at least a few days to reach TOT.
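
To illustrate the kind of lookup the workaround needs, here is a rough sketch that scans text in /proc/$pid/maps format for a given path. FindLoadAddress is a hypothetical helper, not the function that will land in lldb. It assumes the load address is the lowest start address among the mappings backed by that path, and it ignores paths containing spaces.

```cpp
#include <cstdint>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper (not an LLDB function): reports the lowest start address
// of any mapping whose pathname equals `path`. Returns false if none is found.
bool FindLoadAddress(std::istream &maps, const std::string &path,
                     uint64_t &load_addr) {
  bool found = false;
  std::string line;
  while (std::getline(maps, line)) {
    // Each line: start-end perms offset dev inode [pathname]
    std::istringstream fields(line);
    std::string range, perms, offset, dev, inode, name;
    if (!(fields >> range >> perms >> offset >> dev >> inode >> name))
      continue;  // anonymous mapping without a pathname
    if (name != path) continue;
    std::string::size_type dash = range.find('-');
    if (dash == std::string::npos) continue;
    uint64_t start = std::stoull(range.substr(0, dash), nullptr, 16);
    if (!found || start < load_addr) {
      load_addr = start;
      found = true;
    }
  }
  return found;
}
```

On a live system the stream would be a std::ifstream opened on /proc/$pid/maps. A more careful version would probably also check the offset column.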
Once we fix the linker issue your problem should go away. In general I like what you are trying to achieve, although I am not sure your change won't cause issues in some obscure cases (e.g. we end up with two breakpoints at practically the same location but with different load addresses). In the long term I would like the user to be able to specify, when placing a breakpoint, whether it should be an ARM or a Thumb breakpoint (useful if lldb can't decide automatically for any reason). A viable solution for your problem, and for my plan as well, would be to set the LSB of the address for a Thumb breakpoint and mask it out only at the latest possible time (e.g. when writing the breakpoint opcode into memory).
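
A minimal sketch of that last idea, assuming the breakpoint address carries the Thumb bit all the way to the point where the trap is written. TrapPlan and PlanTrapWrite are hypothetical names used only for illustration. The sizes reflect that an ARM trap is a 4-byte instruction and a Thumb trap a 2-byte one; the actual opcode bytes are left out.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical description of what to write and where (not an LLDB type).
struct TrapPlan {
  uint64_t write_addr;  // address with bit 0 cleared, safe to write to
  std::size_t size;     // trap opcode size in bytes
  bool thumb;           // true if a Thumb trap should be used
};

// Decodes the LSB-tagged address only at write time, so the ARM/Thumb
// information survives every earlier step.
TrapPlan PlanTrapWrite(uint64_t tagged_addr) {
  TrapPlan plan;
  plan.thumb = (tagged_addr & 1) != 0;
  plan.write_addr = tagged_addr & ~uint64_t{1};
  plan.size = plan.thumb ? 2 : 4;
  return plan;
}
```

The two-breakpoints concern above still applies with this scheme: 0x1000 and 0x1001 map to the same write address, so breakpoint site lookup would need to compare the masked addresses.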