Using file-defined registers on Android

Hi all,

I’m attempting to follow the platform definition approach that Greg laid out when attempting to attach to a gdbserver running on an Android device. In particular, Android arm v7a devices (Nexus 10 and Nexus 7).

I went ahead and created a python register definition. I generated the definition file based on referencing these:

svn cat http://llvm.org/svn/llvm-project/lldb/trunk/examples/python/x86_64_linux_target_definition.py
svn cat http://llvm.org/svn/llvm-project/lldb/trunk/examples/python/x86_64_target_definition.py

and the output from using one of gdb’s commands when gdb was attached to the gdbserver:

(gdb) maint print raw-registers

Now I’m attempting to do some debugging with lldb.

I created an app, fired it up on the Android device, and attempted to attach to the running process. Since I can debug this app fine remotely with gdb, I believe the basic pipe should be okay.

Here’s what I do on the lldb side. The Android app to be debugged is running at this point.

lldb

# set the platform file
(lldb) settings set plugin.process.gdb-remote.target-definition-file /home/tfiala/work/arm-arch/armv7a_linux_target_definition.py

# note I tried to use armv7-pc-linux, which said the file didn’t match, and there
# doesn’t appear to be an armv7a-pc-linux. Should I be using something else here?
(lldb) target create --arch arm-pc-linux libs/armeabi-v7a/libnative-activity.so

# As above, only arm-pc-linux seemed to accept this file. The .so file
# is an armv7a-built lib in this case and runs fine on Nexus 7 and 10 devices.
(lldb) file --arch arm-pc-linux libs/armeabi-v7a/libnative-activity.so

# Now ready for the connect: the adb redirector to communicate with
# gdbserver is localhost:5039
(lldb) gdb-remote 5039

Here’s what I get:

(lldb) thread list
Process 8176 stopped
* thread #1: tid = 8176, , stop reason = signal SIGTRAP
(lldb) bt
* thread #1: tid = 8176, , stop reason = signal SIGTRAP
  * frame #0:

The app itself is still running on the Android device - at least the main thread is. So the listing of it as stopped appears to be incorrect. If I do “(lldb) exit”, it will kill the main thread fwiw, but not nuke the process. I’m not particularly concerned with that piece yet as it might be related to the dual-heritage java/native aspect.

I’ve got the architecture definition file indicating the triple it provides is arm-*-linux (at least, I think). I have no idea if the file is working since I haven’t (yet) figured out how to get output from the loading process.

I’m attaching my architecture definition file and the maintenance dump in case anybody sees something obviously wrong.

Some questions:

  • Am I running the right commands in the right order to connect to a gdbserver where I’m specifying the register information explicitly? Are the target and file commands needed with the architecture file?

  • Why is LLDB telling me the armv7a object files are not valid armv7 files?

  • Is the “pc” part of the arm-pc-linux part right, wrong, or a don’t care for my scenario?

  • Is it the mere fact that I’m attaching remotely good enough for lldb to be using the architecture definition specified with “settings set plugin.process.gdb-remote.target-definition-file …”, or is it keying off of some of the meta data it has (like me specifying the “target create” and “file --arch” commands)?

  • How do I debug python loaded via lldb or get feedback from the lldb python support (e.g. if there’s a syntax error or something else goofy) when running lldb?

I assume I have something really basic wrong at this point since the arch definition file specified seems to make no difference on the output vs. what I see when I attach with lldb without specifying the architecture file.

Thanks for any suggestions and for helping fill in my understanding!

Sincerely,
Todd Fiala

armv7a_linux_target_definition.py (17.9 KB)

gdb-reg-output-armv7a.out (8.9 KB)

Hi Todd,

You can try reading the registers using the ‘register read’ command after connecting to the gdbserver. If LLDB shows the registers, then you will know that it has parsed the file OK. Otherwise the quickest way to find the problem will be to debug LLDB. The python target definition is parsed in ProcessGDBRemote::ParsePythonTargetDefinition().

On x86_64 linux, I use the following command to connect to gdbserver.

(lldb) file ~/demos/act

Current executable set to ‘~/demos/act’ (x86_64).

(lldb) settings set plugin.process.gdb-remote.target-definition-file /home/abidh/work/llvm/src/tools/lldb/examples/python/x86_64_linux_target_definition.py

(lldb) gdb-remote 10000

I also noted that you have retained the following line from x86_64 file. You may want to update it for ARM.

g_target_definition['breakpoint-pc-offset'] = -1

Are the target and file commands needed with the architecture file?

The python target definition file is not a substitute for the target or file command. I also wonder what happens when you don’t supply an arch in the ‘target create’ command. What arch does LLDB detect from the executable file?

  • Is it the mere fact that I’m attaching remotely good enough for lldb to be using the architecture definition specified with “settings set plugin.process.gdb-remote.target-definition-file …”, or is it keying off of some of the meta data it has (like me specifying the “target create” and “file --arch” commands)?

If your target did not supply qRegisterInfo packet (which I think it did not) then LLDB will end up parsing your target definition file.

Regards,

Abid

Thanks, Abid!

On many targets, breakpoints are implemented using a breakpoint or invalid instruction. The value of ‘pc’ you get after hitting the breakpoint is actually pointing to the instruction after the breakpoint. In this case, you subtract the size of breakpoint instruction from the pc to get the address where the breakpoint was actually placed.
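As a rough illustration of that arithmetic (a sketch only: the helper name and the sample addresses are made up, and it assumes the 'breakpoint-pc-offset' value is simply added to the pc reported at the stop):

```python
def breakpoint_address(reported_pc, breakpoint_pc_offset):
    """Return the address where the breakpoint was placed.

    breakpoint_pc_offset stands in for the 'breakpoint-pc-offset' value
    from the target definition; it is added to the reported pc, so -1
    means "back up one byte".
    """
    return reported_pc + breakpoint_pc_offset


# x86_64: the breakpoint opcode is one byte and the reported pc points
# just past it, hence the -1 in the x86_64 example file.
print(hex(breakpoint_address(0x400511, -1)))

# A server that already reports the pc of the breakpoint itself would
# need an offset of 0 (hypothetical address).
print(hex(breakpoint_address(0x8400, 0)))
```

Which value is right for a given ARM gdbserver has to be checked against the pc it actually reports after a breakpoint hit.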

I see, thanks!

Here's what I get:
(lldb) thread list
Process 8176 stopped
* thread #1: tid = 8176, , stop reason = signal SIGTRAP
(lldb) bt
* thread #1: tid = 8176, , stop reason = signal SIGTRAP
  * frame #0:

It looks like we didn't parse your register definition file correctly. Try a:

(lldb) register read

I am guessing you will see no output. As already suggested, step through ProcessGDBRemote::ParsePythonTargetDefinition() and make sure this succeeds.

The app itself is still running on the Android device - at least the main thread is. So the listing of it as stopped appears to be incorrect.

Any time we attach to a GDB server and tell it to attach to a process, the reply to the "vAttach" packet is a stop reply packet. It tells us why the process is stopped ("TXX", where XX is a signal, SIGTRAP in this case), which thread is stopped, and some expedited register values. After the attach packet, we assume your program is stopped. The documentation seems to back this up:
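To make the shape of such a reply concrete, here is a minimal sketch of pulling a "T" stop reply apart. The sample packet is invented for illustration (it is not taken from this session), and the function name is hypothetical:

```python
def parse_stop_reply(packet):
    """Parse a 'T' stop reply such as 'T05thread:1ff0;0f:10840000;'.

    Returns (signal_number, {key: value}); values are left as hex strings.
    """
    if not packet.startswith("T") or len(packet) < 3:
        raise ValueError("not a T stop reply: %r" % packet)
    signo = int(packet[1:3], 16)  # two hex digits after the 'T'
    pairs = {}
    for field in packet[3:].split(";"):
        if not field:
            continue
        key, _, value = field.partition(":")
        pairs[key] = value
    return signo, pairs


signo, pairs = parse_stop_reply("T05thread:1ff0;0f:10840000;")
print(signo)                     # 5 is SIGTRAP on Linux
print(int(pairs["thread"], 16))  # 0x1ff0 == 8176
```

Keys that are hex register numbers (like "0f" here) carry the expedited register values.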

https://sourceware.org/gdb/onlinedocs/gdb/Stop-Reply-Packets.html

So it sounds like the GDB server might not be doing the right thing here? We will need to look at the packet log to see what is going on.

If I do "(lldb) exit", it will kill the main thread fwiw, but not nuke the process. I'm not particularly concerned with that piece yet as it might be related to the dual-heritage java/native aspect.

LLDB will send a "k" packet which tells the remote GDB server to kill the process. The GDB server needs to make sure the process and all its threads are killed. Sounds like a GDB server issue.
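When reading a packet log (for example one captured with "log enable gdb-remote packets"), it helps to know how requests like "k" are framed on the wire. A small sketch of the standard framing, with a made-up helper name:

```python
def frame_packet(payload):
    """Wrap a payload as $<payload>#<checksum>.

    The checksum is the sum of the payload bytes modulo 256, written as
    two lowercase hex digits.
    """
    checksum = sum(payload.encode("ascii")) % 256
    return "$%s#%02x" % (payload, checksum)


print(frame_packet("k"))  # kill request
print(frame_packet("g"))  # read all general registers
```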

I've got the architecture definition file indicating the triple it provides is arm-*-linux (at least, I think). I have no idea if the file is working since I haven't (yet) figured out how to get output from the loading process.

That might be the problem, try to match the architecture exactly to what it is for now. I don't believe I made the wildcard matching work yet.

I'm attaching my architecture definition file and the maintenance dump in case anybody sees something obviously wrong.

Some questions:

* Am I running the right commands in the right order to connect to a gdbserver where I'm specifying the register information explicitly? Are the target and file commands needed with the architecture file?

You only need to run "target create" _or_ the "file" command. "file" is an alias to "target create", so you have to execute one of these commands. You should be able to specify "armv7-pc-linux". I am guessing the "remote-linux" platform is not liking this? It would be worth stepping through the code to see why "armv7-pc-linux" is being rejected and by whom.

* Why is LLDB telling me the armv7a object files are not valid armv7 files?

We don't currently have "armv7a" in our architecture list. Does LLVM/clang understand "armv7a"? I would try using "armv7-pc-linux". If "armv7a" is recognized by LLVM/Clang, feel free to add it by modifying the code in ArchSpec.cpp. There are a few tables you will need to edit:

g_core_definitions on ArchSpec.cpp:50

g_elf_arch_entries on ArchSpec.cpp:228

The main problem we currently have with ELF is that the ELF file tells us "ARM" and that is it. Is there a note or anything else inside the ELF file that can help us figure out the exact ARM variant contained in an ELF executable? If so we need to modify ObjectFileELF:

bool
ObjectFileELF::GetArchitecture (ArchSpec &arch)
{
    if (!ParseHeader())
        return false;

    arch.SetArchitecture (eArchTypeELF, m_header.e_machine, LLDB_INVALID_CPUTYPE);
    arch.GetTriple().setOSName (Host::GetOSString().GetCString());
    arch.GetTriple().setVendorName(Host::GetVendorString().GetCString());
    return true;
}

Because just "ARM" really isn't enough. The only thing we have to go on with ELF is the e_machine from the ELF header. If there isn't a way to detect the correct arch variant for ELF, we will need to add an LLDB setting that can be used to substitute the correct value when parsing ARM files. We would use it to change all generic "ARM" architectures to the value of the setting, and this would need to be done in ObjectFileELF::GetArchitecture(...). It would be better if there is some data in the ARM ELF object file itself that we can parse though, so we should pursue this angle first.
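To see how little the header gives us, here is a sketch that reads e_machine straight from the first bytes of an ELF file. The function name is made up and the header below is synthetic, built only for the demonstration:

```python
import struct

EM_ARM = 40  # e_machine value for ARM


def read_e_machine(header):
    """Return e_machine from the first 20 bytes of an ELF file."""
    if len(header) < 20 or header[:4] != b"\x7fELF":
        raise ValueError("not an ELF header")
    # e_ident[5] (EI_DATA): 1 = little-endian, 2 = big-endian
    endian = "<" if header[5] == 1 else ">"
    # e_type (2 bytes) sits at offset 16, e_machine (2 bytes) at offset 18
    (machine,) = struct.unpack_from(endian + "H", header, 18)
    return machine


# Synthetic 32-bit little-endian header: 16 bytes of e_ident, then
# e_type = 3 (shared object) and e_machine = EM_ARM.
fake = b"\x7fELF" + bytes([1, 1, 1]) + bytes(9) + struct.pack("<HH", 3, EM_ARM)
print(read_e_machine(fake) == EM_ARM)
```

On a real file the same function can be fed the result of reading the first 20 bytes in binary mode. Every ARM variant reports the same value, which is exactly the limitation described above.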

* Is the "pc" part of the arm-pc-linux part right, wrong, or a don't care for my scenario?

Not sure. The triple should match what LLVM/Clang thinks the standard triple should be for ARM on linux.

* Is it the mere fact that I'm attaching remotely good enough for lldb to be using the architecture definition specified with "settings set plugin.process.gdb-remote.target-definition-file ...", or is it keying off of some of the meta data it has (like me specifying the "target create" and "file --arch" commands)?

It might currently be checking the target arch and trying to match it up to the arch you have in your target definition file. I would remove the wildcard and have it match exactly what you type for now.

* How do I debug python loaded via lldb or get feedback from the lldb python support (e.g. if there's a syntax error or something else goofy) when running lldb?

You really can't debug the python right now as far as I know. print statements are my current choice when things go wrong.

For the target definition file, set a breakpoint in ProcessGDBRemote.cpp in ProcessGDBRemote::ParsePythonTargetDefinition(), which currently is line 334:

bool
ProcessGDBRemote::ParsePythonTargetDefinition(const FileSpec &target_definition_fspec)
{
#ifndef LLDB_DISABLE_PYTHON
    ScriptInterpreter *interpreter = GetTarget().GetDebugger().GetCommandInterpreter().GetScriptInterpreter();
    Error error;
    lldb::ScriptInterpreterObjectSP module_object_sp (interpreter->LoadPluginModule(target_definition_fspec, error));
    if (module_object_sp)
    {
        lldb::ScriptInterpreterObjectSP target_definition_sp (interpreter->GetDynamicSettings(module_object_sp,
                                                                                              &GetTarget(),
                                                                                              "gdb-server-target-definition",
                                                                                              error));
        
        PythonDictionary target_dict(target_definition_sp);

        if (target_dict)
        {

You will want to see that "module_object_sp" is valid and also that "target_definition_sp" and eventually "target_dict" test true. Let me know if they don't.
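The same checks can be roughed out in plain Python before stepping through LLDB. This is only a sketch: it assumes the definition file exposes a get_dynamic_setting(target, setting_name) function like the x86_64 examples do and returns a dictionary with a 'registers' list, and the stand-in module below is hypothetical:

```python
import types


def check_target_definition(module):
    """Mimic the validity checks on an already-imported definition module."""
    getter = getattr(module, "get_dynamic_setting", None)
    if getter is None:
        return "module has no get_dynamic_setting()"
    # The first argument is the target; None is enough for a dry run
    # as long as the definition file ignores it.
    definition = getter(None, "gdb-server-target-definition")
    if not isinstance(definition, dict):
        return "did not get a dictionary back"
    if not definition.get("registers"):
        return "dictionary has no 'registers' list"
    return "ok: %d registers" % len(definition["registers"])


# Stand-in for a real definition file (hypothetical contents).
fake = types.ModuleType("fake_definition")
fake.get_dynamic_setting = lambda target, name: (
    {"registers": [{"name": "r0", "bitsize": 32}]}
    if name == "gdb-server-target-definition" else None)
print(check_target_definition(fake))
```

A real definition file that imports the lldb module can only be imported this way when that module is on the Python path.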

You might want to verify that you can do some rudimentary python first:

% lldb
(lldb) script 2+3

I assume I have something really basic wrong at this point since the arch definition file specified seems to make no difference on the output vs. what I see when I attach with lldb without specifying the architecture file.

I am guessing that the arch file is not getting loaded due to the architecture having a wildcard? Let me know what you find on that end.

Thanks for all the replies, Greg!

I’m going to work through this today.

LLDB will send a “k” packet which tells the remote GDB server to kill the process. The GDB server needs to make sure the process and all its threads are killed. Sounds like a GDB server issue.

If I do indeed hit strangeness with gdbserver, I may switch gears over to working on lldb-server sooner than later (per other threads on that topic).

Make that lldb-gdbserver, rather.

You might want to verify that you can do some rudimentary python first:
% lldb
(lldb) script 2+3

Sure enough that found issue #1 :)

For some reason my install process (autotools-based, standard configure, make, make install) is not getting the lldb.py and site-packages copied over to the install tree. I copied these over manually for now to the lldb -P location and at least the sample python script now runs. I’ve put a TODO on my list to check back on the makefiles and see what’s going on with the state of the python package installation.

I also noticed that make install does not copy the python-related files to the install location on Linux. I was planning to create a patch for it but got distracted. I will probably come back to it soon if you have not already fixed it by then.

Regards,

Abid

I’ll go ahead and take care of it - no time like the present :)

Hi all,

I’m working with Todd on this particular piece (trying to connect to an Android-resident gdbserver with existing lldb), and I have some new information. When lldb sends the ‘g’ command to fetch the register values, gdbserver is apparently sending only the first 328 bytes of the 712 bytes that are expected (based on the armv7a architecture supplied by Todd’s Python settings script (which in turn was derived from the output of gdb’s “maint print raw-registers” command connected to the same gdbserver)). This causes lldb to reject the returned values because the expected length is not correct.

If I modify the Python setting script to declare only the first 328 bytes of registers (everything except the sN and qN registers), then lldb properly accepts the returned packet from gdbserver and apparently is processing it correctly (I did verify that lldb does at least know the correct value in the PC register after doing this).

I checked into what gdb does when it talks to a gdbserver instance. I’m a complete newbie in this code, but it looks like the process_g_packet() function in gdb’s remote.c is tolerant of a returned packet that is smaller than expected; it appears to just mark the registers beyond the set returned as being not “in_g_packet” so that subsequent requests for the values in such registers go through an explicit “p” request to the remote gdbserver.

Assuming this logic is correct, would it be appropriate to do something similar on the lldb side? That is, if the set of register values returned from a ‘g’ query is smaller than expected, would it be appropriate to allow it and somehow mark the following registers as requiring explicit fetches?

Thanks,
Steve

Comments below.

It sounds like you defined the sN and qN registers as actual registers, not as registers whose values are in other registers.

If you take a look at:

svn cat http://llvm.org/svn/llvm-project/lldb/trunk/examples/python/x86_64_target_definition.py

You will see that all registers up to "mxcsr" are actual concrete registers. The others are registers whose value is part of another register: any register that lives inside another register has a 'slice' key/value pair, and a register that is made up of two or more other registers has a 'composite' key/value pair.
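As a sketch of what a slice means, the following parses a slice string and extracts the matching bits from the parent register's value. The 'name[msbit:lsbit]' syntax is modeled on the x86_64 example file; the ARM entries are hypothetical and assume d0 is a concrete 64-bit register, with only the keys needed here:

```python
import re


def parse_slice(spec):
    """Split a slice string like 'rax[31:0]' into ('rax', 31, 0)."""
    m = re.fullmatch(r"(\w+)\[(\d+):(\d+)\]", spec)
    if m is None:
        raise ValueError("bad slice: %r" % spec)
    name, msbit, lsbit = m.group(1), int(m.group(2)), int(m.group(3))
    if msbit < lsbit:
        raise ValueError("msbit must be >= lsbit in %r" % spec)
    return name, msbit, lsbit


def extract_slice(parent_value, msbit, lsbit):
    """Pull bits [msbit:lsbit] out of the parent register's integer value."""
    width = msbit - lsbit + 1
    return (parent_value >> lsbit) & ((1 << width) - 1)


# Hypothetical ARM entries: s0 and s1 are the two halves of d0.
regs = [
    {"name": "s0", "bitsize": 32, "slice": "d0[31:0]"},
    {"name": "s1", "bitsize": 32, "slice": "d0[63:32]"},
]
d0 = 0x1122334455667788
for reg in regs:
    _, msb, lsb = parse_slice(reg["slice"])
    print(reg["name"], hex(extract_slice(d0, msb, lsb)))
```

Registers declared this way take up no extra room in the g/G packet, since their bytes already belong to the parent register.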

If you send me your python file I can take a look. LLDB will expect your g/G packets to take/return all of the concrete registers.

I checked into what gdb does when it talks to a gdbserver instance. I'm a complete newbie in this code, but it looks like the process_g_packet() function in gdb's remote.c is tolerant of a returned packet that is smaller than expected; it appears to just mark the registers beyond the set returned as being not "in_g_packet" so that subsequent requests for the values in such registers go through an explicit "p" request to the remote gdbserver.

If you do have your target definition file set up correctly, then we might need to modify LLDB to deal with smaller than expected g/G packet buffers.
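A rough sketch of what dealing with a short reply could look like: registers that fit inside the bytes actually returned come from the 'g' reply, and the rest are flagged for individual 'p' reads. The function name and the tiny register layout are invented for illustration and are not LLDB code:

```python
def split_by_g_packet(registers, reply_hex):
    """Partition concrete registers by whether the 'g' reply covers them.

    registers: list of (name, byte_size) in g-packet order.
    reply_hex: hex payload of the 'g' reply (two characters per byte).
    """
    available = len(reply_hex) // 2
    in_g, need_p = [], []
    offset = 0
    for name, size in registers:
        if offset + size <= available:
            in_g.append(name)
        else:
            need_p.append(name)
        offset += size
    return in_g, need_p


# Made-up layout: two 4-byte core registers followed by an 8-byte one.
layout = [("r0", 4), ("r1", 4), ("d0", 8)]
short_reply = "00" * 8  # the server sent only the first 8 bytes
print(split_by_g_packet(layout, short_reply))
```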

One thing to watch out for: when running expressions on the current thread, we currently assume that the 'g' and 'G' packets cover every register needed to completely save and restore a thread's context. If this is _not_ the case, then we will need to make sure that both:

GDBRemoteRegisterContext::ReadAllRegisterValues()
GDBRemoteRegisterContext::WriteAllRegisterValues()

"do the right thing".

Assuming this logic is correct, would it be appropriate to do something similar on the lldb side? That is, if the set of register values returned from a 'g' query is smaller than expected, would it be appropriate to allow it and somehow mark the following registers as requiring explicit fetches?

Yes, LLDB should be modified to handle it if your target definition file is set up correctly. Send it over my way and I will take a look and let you know which way to proceed.

Greg

Ah. Brilliant. Yes, we have not set up register aliasing correctly in the .py file. Let me just try to fix that and make sure everything works ok.

Thanks,
Steve

There are no "composite" examples in the checked-in files, but an example key/value pair would be:

'composite' : [ 'rbx', 'rax' ]

The value is an array of register names, starting with the most significant bytes on down to the least significant bytes. 'composite' currently assumes that a register made up of other registers is always the entire value of those registers, and that those values are contiguous in the register context buffer ('rbx' and 'rax' would need to have their bytes be contiguous; you can't have 'rax' at offset 0 and 'rbx' at offset 100).
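The contiguity requirement can be checked mechanically. The sketch below uses hypothetical ARM names and offsets (q0 built from d1 and d0) and assumes little-endian storage, so the least significant part sits at the lowest offset:

```python
def check_composite(composite, layout):
    """Check that the parts of a composite register are contiguous.

    composite: part names, most significant first (e.g. ['d1', 'd0']).
    layout: {name: (byte_offset, byte_size)} for the concrete registers.
    Returns (offset, size) of the combined value.
    """
    # Walk from the least significant part upward (little-endian assumption).
    parts = [layout[name] for name in reversed(composite)]
    start = parts[0][0]
    expected = start
    for offset, size in parts:
        if offset != expected:
            raise ValueError("parts of %r are not contiguous" % composite)
        expected += size
    return start, expected - start


# Hypothetical offsets in the register context buffer.
layout = {"d0": (68, 8), "d1": (76, 8)}
print(check_composite(["d1", "d0"], layout))  # (68, 16)
```

With 'd0' at offset 0 and 'd1' at offset 100 the same call raises ValueError, which is the case the 'composite' key cannot express.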

Thanks, Greg! I didn’t know I needed that until I read it, but I see that I’ll need to supply both slice and composite definitions.

Thanks,
Steve