RFC: Python callback for Target get module

Problem

When we debug an Android target process, LLDB pulls shared library files from the target Android device to resolve the modules in the target process. This is slow.

Detail

First, let’s break down a typical Android debugging setup.

  • A host machine connected to an Android device with a USB cable.
    • The host machine runs LLDB.
    • The Android device runs lldb-server and the ADB daemon.
    • LLDB communicates with lldb-server via TCP over the ADB protocol over USB.
    • LLDB also communicates with the ADB daemon over the ADB protocol over USB.
  • lldb-server on the target Android device
    • It gets the loaded shared libraries from the process memory map.
    • It sends the loaded shared library module info to LLDB.
  • LLDB on the host machine
    • It gets the loaded shared library module info from lldb-server.
    • It pulls all of the shared library files for the target process from the target Android device via the ADB protocol over USB.

This ‘pulling all of the shared library files from the device’ step is slow. How long it takes depends on the target process: how many shared libraries it uses and how large they are. Typical large application processes take several minutes to finish. The pull is not concurrent, and even with recent-generation USB, the transfer is usually slower than reading from the local file system.

Platform module caching works really well to reduce this pulling time. Once the files are in the module cache, LLDB does not need to pull them from the device again. However, the module cache cannot help with the first debugging session; it only helps from the second session onward.

Proposal

I’d like to propose a new feature to address this problem.

A Python callback for Target get module.

This new feature has negligible performance impact when not used.

When it is used, this Python callback works as the implementation for getting modules in Target GetOrCreateModule. If the callback fails, or something goes wrong, GetOrCreateModule falls back to the LLDB implementation for getting modules. If the callback succeeds in returning a module file and/or a symbol file, GetOrCreateModule uses it in the same way as it would use the result of the LLDB implementation.
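The fallback behavior described above can be sketched as a plain-Python model. This is not LLDB's code: `get_or_create_module`, `builtin_locate`, and the `(module_path, symbol_path)` result shape are hypothetical stand-ins, and the way a symbol-only result is combined with the built-in path is an assumption made for illustration.

```python
from typing import Callable, Optional, Tuple

# Hypothetical result shape: (module_path, symbol_path); either may be None.
Located = Tuple[Optional[str], Optional[str]]


def builtin_locate(name: str) -> Located:
    """Stand-in for the built-in path (e.g. pulling the file from the device)."""
    return (f"/pulled-from-device/{name}", None)


def get_or_create_module(
    name: str,
    callback: Optional[Callable[[str], Located]] = None,
) -> Located:
    """Try the user callback first; fall back to the built-in path."""
    module_path: Optional[str] = None
    symbol_path: Optional[str] = None
    if callback is not None:
        try:
            module_path, symbol_path = callback(name)
        except Exception:
            # A failing callback must not break module loading.
            module_path, symbol_path = None, None
    if module_path is None:
        # Assumption: the built-in path supplies whatever the callback did not.
        fallback_module, fallback_symbol = builtin_locate(name)
        module_path = fallback_module
        if symbol_path is None:
            symbol_path = fallback_symbol
    return (module_path, symbol_path)
```

With no callback registered, the model takes only the built-in path, which matches the claim that the feature costs almost nothing when unused.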

This will let users write their own module caching system for LLDB. It is simple but powerful: besides bypassing the pull of shared library files from the device, it enables extra capabilities. For example,

  • When we are debugging an Android application that is built with a build system, this Python callback can leverage the build system artifacts. It can pick up a shared library from the build system artifacts using the module spec UUID of the module that LLDB is looking for. It can provide exactly the same shared library files that the target Android device has, so LLDB does not need to pull those files from the device. The callback can also provide the symbol file along with the module file, whereas the module files on the device are stripped.
  • When we are debugging an Android application and a symbol server retains the symbols of the application, this Python callback can download the symbol files from the symbol server on demand, using the module spec UUID.
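As a rough sketch of the build-artifact case, a callback could consult a local index keyed by UUID. The directory layout used here (`<root>/<uuid>/<name>`, with an optional `<name>.debug` beside it) and the helper name `find_artifacts` are assumptions for illustration only; a real build system will have its own layout.

```python
import os
from typing import Optional, Tuple


def find_artifacts(
    root: str, uuid_hex: str, name: str
) -> Tuple[Optional[str], Optional[str]]:
    """Return (module_path, symbol_path) for a UUID; None marks a miss.

    Assumed layout (hypothetical): <root>/<uuid_hex>/<name> holds the
    module file and <root>/<uuid_hex>/<name>.debug holds its symbols.
    """
    directory = os.path.join(root, uuid_hex.lower())
    module_path = os.path.join(directory, name)
    symbol_path = module_path + ".debug"
    return (
        module_path if os.path.isfile(module_path) else None,
        symbol_path if os.path.isfile(symbol_path) else None,
    )
```

Returning `None` for a miss leaves the decision to the caller, which can then let LLDB fall back to its own lookup.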

Users will be able to use a new SBDebugger API to set the callback function. It will be in SBDebugger rather than SBTarget because the callback needs to be set before a Target is created with a process, or before a Platform attaches to a process, so that it works from the beginning of the Target instance's lifetime.

import lldb

def get_module_callback_function(
    debugger: lldb.SBDebugger,
    module_spec: lldb.SBModuleSpec,
    module_file_spec: lldb.SBFileSpec,
    symbol_file_spec: lldb.SBFileSpec
):
    # The module caching implementation goes here. It resolves module_spec
    # to a module file and/or a symbol file.
    # module_file_dir, module_file_name, symbol_file_dir, and
    # symbol_file_name are placeholders for the results of that lookup.
    module_file_spec.SetDirectory(module_file_dir)
    module_file_spec.SetFilename(module_file_name)
    symbol_file_spec.SetDirectory(symbol_file_dir)
    symbol_file_spec.SetFilename(symbol_file_name)
    return lldb.SBError()

debugger.SetTargetGetModuleCallback(get_module_callback_function)

Performance example: a clear win for a large application

Without the callback: 124 seconds to finish attaching LLDB to the process. The vast majority of that time was spent pulling the shared library files from the device. There were also no symbols for the pulled shared library files, because the application only contained stripped shared library files.

With the callback: 16 seconds to finish attaching LLDB to the process. The callback eliminated the time spent pulling shared libraries. As a bonus, the callback also provided the symbols for the shared library files.
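For reference, the two measurements work out as follows. The numbers are the ones reported in this post; the variable names are only for illustration.

```python
without_callback_s = 124  # attach time without the callback (from this post)
with_callback_s = 16      # attach time with the callback (from this post)

saved_s = without_callback_s - with_callback_s
speedup = without_callback_s / with_callback_s
reduction_pct = 100 * saved_s / without_callback_s

# Prints: saved 108 s, 7.75x faster, 87% less time
print(f"saved {saved_s} s, {speedup:.2f}x faster, {reduction_pct:.0f}% less time")
```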

Implementation

For what it’s worth, on iOS we have a similar situation: the binaries in the process are not available at the same filepaths on the debug host, so lldb would read them out of memory, which is very slow. In our case, we have an external mechanism that (essentially) copies all the libraries up to the debug host, and we have a Platform “SDK” directory in PlatformRemoteiOS. lldb looks in those local copy directories for the full filepath and compares UUIDs to see if they match. (Every different OS build number for the iOS devices gets its own “DeviceSupport” directory, so lldb may have a dozen different binaries that match the FileSpec in the SDK, and it needs to know which is the correct one.)

My first reaction to the idea of adding a Python scripting hook is that I’m not opposed to it, but I wanted to point out how we solved this problem for Apple remote devices: platform SDKs.

It seems wrong that this should be a debugger property. One SBDebugger can have many diverse targets. For instance, I could be simultaneously debugging a local and a remote program (maybe to trace RPC between the two). These two sessions might very well need different “module finders”.

I think it would be better to arrange to pass this in to target creation.

Thank you for the suggestion.
To address the multiple-targets use case, how about using an SBPlatform method instead of SBDebugger?

# platform: lldb.SBPlatform
platform.SetTargetGetModuleCallback(callback)

A Target instance always retains the PlatformSP set by the Target constructor or by SetPlatform. Therefore, if the Platform instance retains the callback set through the SBPlatform method, the Target instance can use it at any time.
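The ownership argument can be modeled in a few lines of plain Python. The `Platform` and `Target` classes below are simplified, hypothetical stand-ins rather than LLDB's classes; they only show why storing the callback on the platform lets two targets in one debugger use different module finders.

```python
from typing import Callable, Optional


class Platform:
    """Hypothetical stand-in for a platform that owns the callback."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.locate_callback: Optional[Callable[[str], str]] = None


class Target:
    """Hypothetical stand-in for a target that keeps its platform."""

    def __init__(self, platform: Platform) -> None:
        self.platform = platform

    def locate(self, module_name: str) -> str:
        # The target reads the callback through its platform at lookup time.
        callback = self.platform.locate_callback
        if callback is None:
            return f"default:{module_name}"
        return callback(module_name)


remote = Platform("remote-android")
host = Platform("host")
remote.locate_callback = lambda name: f"cache:{name}"

remote_target = Target(remote)  # uses the remote platform's callback
host_target = Target(host)      # no callback set, so default lookup
```

Because each target looks the callback up through its own platform, a local and a remote debug session in the same debugger do not interfere with each other.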

Finding the files associated with something you’re going to debug seems like an appropriate job for the Platform. That sounds fine to me.

Jim

On Jun 26, 2023, at 10:17 AM, Kazuki Sakamoto via LLVM Discussion Forums notifications@llvm.discoursemail.com wrote:

splhack
June 26




The diffs are committed. Users can now use SBPlatform SetLocateModuleCallback to set the ‘locate module callback’ (renamed from ‘get module callback’).

import ctypes

import lldb

def locate_module(
    module_spec: lldb.SBModuleSpec,
    module_file_spec: lldb.SBFileSpec,
    symbol_file_spec: lldb.SBFileSpec,
) -> lldb.SBError:
    # File name and UUID (as a hex string) of the module LLDB is looking for.
    name = module_spec.GetFileSpec().GetFilename()
    uuid = bytes(
        ctypes.pointer(
            (ctypes.c_char * module_spec.GetUUIDLength()).from_address(
                int(module_spec.GetUUIDBytes())
            )
        ).contents
    ).hex()

    # Implement module locating logic with module_spec arg.
    # e.g. find module file and symbol file from cache with UUID.
    # Placeholders: replace these with the results of your own lookup.
    module_dir = module_file = symbol_dir = symbol_file = None

    # Fill in only what was found; anything left unset falls back to LLDB.
    if module_dir and module_file:
        module_file_spec.SetDirectory(module_dir)
        module_file_spec.SetFilename(module_file)

    if symbol_dir and symbol_file:
        symbol_file_spec.SetDirectory(symbol_dir)
        symbol_file_spec.SetFilename(symbol_file)

    return lldb.SBError()

# platform: lldb.SBPlatform
platform.SetLocateModuleCallback(locate_module)
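The UUID conversion is the densest part of the example, so here is the same pointer-to-hex step in isolation, runnable without LLDB. It assumes, as the example does, that the UUID is available as a raw address plus a length. `uuid_hex_from_address` is a hypothetical helper name, and `ctypes.string_at` is used here as a shorter alternative to the `from_address` form.

```python
import ctypes


def uuid_hex_from_address(address: int, length: int) -> str:
    """Copy `length` bytes starting at `address` and hex-encode them."""
    if not address or length <= 0:
        return ""
    return ctypes.string_at(address, length).hex()


# Stand-in for the memory that a module spec's UUID pointer would refer to.
raw = bytes.fromhex("0123456789abcdef0123456789abcdef01234567")
buffer = ctypes.create_string_buffer(raw, len(raw))

# Prints: 0123456789abcdef0123456789abcdef01234567
print(uuid_hex_from_address(ctypes.addressof(buffer), len(raw)))
```

The guard for a null address or zero length covers modules that carry no UUID, in which case a cache keyed by UUID has nothing to look up.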