postmortem debugging (core/minidump) & modules

I’m looking at how the LLDB minidump reader creates the list of modules:

void ProcessMinidump::ReadModuleList() {
  const auto file_spec = FileSpec(name.getValue(), true);
  ModuleSpec module_spec = file_spec;
  Status error;
  lldb::ModuleSP module_sp = GetTarget().GetSharedModule(module_spec, &error);
  if (!module_sp || error.Fail()) {
    continue;
  }
}

LLDB currently insists on finding a local image for each module, which is usually not available when doing postmortem debugging on a machine other than the one where the minidump was created.

I don’t see an obvious way to model modules that have no local image (which is still different from the remote scenario, where a remote module image exists). Am I missing anything?

Thanks!
Lemo.

The lldb_private::Platform is responsible for digging up any binaries for a given target, so this code should be grabbing the platform from the target and using that to get the shared module. That way if you create a target that is a minidump on a different system (macOS, linux, etc), the platform would be remote-windows.

That being said, “ModuleSpec” should be filled in with more than just the path. It should carry the UUID info from the minidump that identifies exactly which version of the file you want. That way, if a file with the same path exists on the current system, it won’t be returned unless its UUID also matches. I assume the minidump has each module’s UUID information? If so, set it. If not, the file format assumes you will be debugging on the same machine that produced the dump, and it should be updated to include this information. The platform code can then use this UUID info to go and fetch the right version from a UUID database, or however the platform wants to provide access to certain binaries. At Apple, we call "dsymForUUID", which, if global defaults were set to point at Apple’s build servers, would go out and download the correct file for us and store it locally in a cache for future use.

So filling in the UUID in ModuleSpec and modifying Platform::GetSharedModule() for your platform to do the right thing is the correct way to go. ProcessMinidump should switch over to using Platform::GetSharedModule() instead of the target one, or use it after the target one if the target returns an invalid module.
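
As a rough illustration of the lookup order described here, the following self-contained sketch tries a path-based lookup first and falls back to a UUID-keyed store. All names (SpecSketch, LookupResult, FindLocal, FindByUuid, FindModule) are hypothetical stand-ins invented for this example; they are not the real lldb_private::ModuleSpec, Target, or Platform APIs.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for a module spec: a path plus an optional UUID.
struct SpecSketch {
  std::string path;
  std::string uuid; // empty means "no UUID requested"
};

// Hypothetical lookup result; `found` is false when nothing matched.
struct LookupResult {
  bool found;
  std::string path;
  std::string uuid;
};

typedef std::map<std::string, std::string> LocalFiles; // path -> UUID on disk
typedef std::map<std::string, std::string> UuidStore;  // UUID -> cached path

// Path-based lookup. A file at the right path is rejected when the spec
// carries a UUID that differs from the UUID of the file on disk.
LookupResult FindLocal(const LocalFiles &files, const SpecSketch &spec) {
  const LookupResult miss = {false, "", ""};
  LocalFiles::const_iterator it = files.find(spec.path);
  if (it == files.end())
    return miss;
  if (!spec.uuid.empty() && spec.uuid != it->second)
    return miss; // same path, different build
  const LookupResult hit = {true, it->first, it->second};
  return hit;
}

// UUID-based lookup, standing in for a UUID database or symbol server.
LookupResult FindByUuid(const UuidStore &store, const SpecSketch &spec) {
  const LookupResult miss = {false, "", ""};
  if (spec.uuid.empty())
    return miss;
  UuidStore::const_iterator it = store.find(spec.uuid);
  if (it == store.end())
    return miss;
  const LookupResult hit = {true, it->second, it->first};
  return hit;
}

// Try the path-based lookup first, then fall back to the UUID store.
LookupResult FindModule(const LocalFiles &files, const UuidStore &store,
                        const SpecSketch &spec) {
  LookupResult result = FindLocal(files, spec);
  if (result.found)
    return result;
  return FindByUuid(store, spec);
}
```

With a path-only spec, this sketch returns whatever file happens to sit at that path; with a UUID set, a mismatched local file is skipped and the UUID store is consulted instead.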

Let me know if you have any more questions,

Greg

Thanks Greg! It makes sense and looking at the code it’s already implemented along those lines: Target::GetSharedModule() defaults to Platform::GetSharedModule() if the initial attempt to get the module fails.

The part I’d like to understand is whether there’s a precedent for modules which don’t have any accessible file image (local or remote). Is everything expected to work if we create placeholder Module & ModuleSpecs?
(It seems that the current implementation assumes that we have a file somewhere; e.g., even creating a Module from a ModuleSpec will still try to map the source ModuleSpec to some files.)

At Apple, we call "dsymForUUID " which, if global defaults were set to point at Apple’s build servers, would go out and download the correct file for us and store it locally in a cache for future use.

Just curious, what happens if the download fails? Is the corresponding module skipped? (Is this strictly about the dSYMs, or does the same mechanism work for the Mach-O binaries?)

That way if you create a target that is a minidump on a different system (macOS, linux, etc), the platform would be remote-windows.

Not sure if I understand this one: core files & minidumps are currently not using any of the remote debugging machinery, right? Are you suggesting changing that?

Thanks Greg! It makes sense and looking at the code it’s already implemented along those lines: Target::GetSharedModule() defaults to Platform::GetSharedModule() if the initial attempt to get the module fails.

The part I’d like to understand is whether there’s a precedent for modules which don’t have any accessible file image (local or remote). Is everything expected to work if we create placeholder Module & ModuleSpecs?

No, it is up to the platform to be able to track down files that don’t exist locally. Most platforms do nothing and will return an empty module shared pointer. We need a file to use or we will just not have any info.

(It seems that the current implementation assumes that we have a file somewhere; e.g., even creating a Module from a ModuleSpec will still try to map the source ModuleSpec to some files.)

Yes. Right now, with only the path, we will load the file if it exists on disk, since no UUID was specified in the ModuleSpec. That is really bad and can lead to incorrect info being displayed.

At Apple, we call "dsymForUUID " which, if global defaults were set to point at Apple’s build servers, would go out and download the correct file for us and store it locally in a cache for future use.

Just curious, what happens if the download fails? Is the corresponding module skipped? (Is this strictly about the dSYMs, or does the same mechanism work for the Mach-O binaries?)

It will block until the module is downloaded. The download can fail, and often does, in which case it returns an error that can be displayed. When we need to download large debug info files, it creates delays with no user interaction and often leads to people wondering what is going on. Not optimal, but it does work if you wait for it.

That way if you create a target that is a minidump on a different system (macOS, linux, etc), the platform would be remote-windows.

Not sure if I understand this one: core files & minidumps are currently not using any of the remote debugging machinery, right? Are you suggesting changing that?

No. Each binary knows how to tell LLDB what target triple it is. PECOFF files will always map to the host windows platform, or to remote-windows when not on a Windows host computer. If you say “file a.out” and give it a PECOFF file, just do “target list” and see which platform was selected for you. Since the Minidump is specific to Windows, it should select the right platform for you. If it doesn’t, we will need to fix that.
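
The selection rule described here can be sketched as follows. This is a simplified illustration, not LLDB's actual platform selection code: OsFromTriple and SelectPlatformSketch are hypothetical helpers, the triple is assumed to have the plain "arch-vendor-os[-env]" shape with no version suffix on the OS component, and the host OS is passed in as a plain string.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: returns the OS component of a triple shaped like
// "arch-vendor-os" or "arch-vendor-os-env", or "" if the shape is wrong.
std::string OsFromTriple(const std::string &triple) {
  const std::string::size_type npos = std::string::npos;
  std::string::size_type first = triple.find('-');
  if (first == npos)
    return "";
  std::string::size_type second = triple.find('-', first + 1);
  if (second == npos)
    return "";
  std::string::size_type third = triple.find('-', second + 1);
  std::string::size_type len = (third == npos) ? npos : third - second - 1;
  return triple.substr(second + 1, len);
}

// Hypothetical selection rule: same OS as the host maps to "host",
// anything else maps to "remote-<os>" (for example "remote-windows").
std::string SelectPlatformSketch(const std::string &target_triple,
                                 const std::string &host_os) {
  std::string os = OsFromTriple(target_triple);
  if (os.empty())
    return "";
  if (os == host_os)
    return "host";
  return "remote-" + os;
}
```

Under these assumptions, a Windows triple examined on a Linux host yields "remote-windows", which matches the behaviour described for PECOFF files.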

Does the mini dump format have the UUID or some sort of checksum of the file in it?

No. Each binary knows how to tell LLDB what target triple it is. PECOFF files will always map to the host windows platform, or to remote-windows when not on a Windows host computer. If you say “file a.out” and give it a PECOFF file, just do “target list” and see which platform was selected for you. Since the Minidump is specific to Windows, it should select the right platform for you. If it doesn’t, we will need to fix that.

Thanks for the clarification. A small side note: yes, the minidump format originates on Windows, but Breakpad/Crashpad use it across all supported platforms (including Linux and macOS).

Does the mini dump format have the UUID or some sort of checksum of the file in it?

Yes, the minidump has both the checksum for modules and UUID for the debug information.
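
For minidumps written on Windows, the debug-information identifier is typically stored per module as a CodeView PDB 7.0 record: the 4-byte signature "RSDS", a 16-byte GUID, a 4-byte little-endian age, and a NUL-terminated PDB file name. The sketch below decodes such a record from a raw byte buffer. Pdb70Info and ParsePdb70 are hypothetical names for this example, other record signatures (such as those written by Breakpad on non-Windows platforms) are simply rejected, and the GUID bytes are printed in stored order rather than in canonical GUID text form.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical decoded form of a CodeView PDB 7.0 ("RSDS") record.
struct Pdb70Info {
  bool valid;
  std::string uuid_hex; // 16 GUID bytes as hex, in stored byte order
  uint32_t age;
  std::string pdb_name;
};

// Reads a 32-bit little-endian value from four bytes.
static uint32_t ReadLE32(const uint8_t *p) {
  return uint32_t(p[0]) | (uint32_t(p[1]) << 8) | (uint32_t(p[2]) << 16) |
         (uint32_t(p[3]) << 24);
}

// Parses signature + GUID + age + name; returns valid == false on any
// record that is too short or does not start with "RSDS".
Pdb70Info ParsePdb70(const std::vector<uint8_t> &rec) {
  Pdb70Info info = {false, "", 0, ""};
  const std::size_t kHeaderSize = 4 + 16 + 4;
  if (rec.size() < kHeaderSize)
    return info;
  if (std::memcmp(rec.data(), "RSDS", 4) != 0)
    return info;
  char hex[3];
  for (std::size_t i = 0; i < 16; ++i) {
    std::snprintf(hex, sizeof(hex), "%02X", unsigned(rec[4 + i]));
    info.uuid_hex += hex;
  }
  info.age = ReadLE32(rec.data() + 20);
  // The PDB file name follows the header and ends at the first NUL byte.
  for (std::size_t i = kHeaderSize; i < rec.size() && rec[i] != 0; ++i)
    info.pdb_name.push_back(char(rec[i]));
  info.valid = true;
  return info;
}
```

The resulting identifier is the kind of value that could be placed in a module spec so that a lookup can reject a file with the right path but the wrong build.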

Thanks for the clarification. A small side note: yes, the minidump format originates on Windows, but Breakpad/Crashpad use it across all supported platforms (including Linux and macOS).

Ahh. So then hopefully it extracts the triple from the minidump file and sets it correctly, which gets us the right platform?

Does the mini dump format have the UUID or some sort of checksum of the file in it?

Yes, the minidump has both the checksum for modules and UUID for the debug information.

Good to hear.

No, it is up to the platform to be able to track down files that don't
exist locally. Most platforms do nothing and will return an empty module
shared pointer. We need a file to use or we will just not have any info.

Just to make sure I understand: we need a file with the debug info (e.g.,
a PDB), but we shouldn't need the actual executable/shared library/DLL file
except on platforms where the debug info is embedded in the binary. Right?

Ahh. So then hopefully it extracts the triple from the minidump file and sets it correctly, which gets us the right platform?

Yes, this part seems to be working fine.

Just to make sure I understand: we need a file with the debug info (e.g., a PDB), but we shouldn’t need the actual executable/shared library/DLL file except on platforms where the debug info is embedded in the binary. Right?

As far as I can tell, LLDB today does need the binary image (executable/shared library).

The minidump has a list of modules and memory ranges, but it’s a bit tricky to map that to LLDB modules and sections. I did a quick experiment with placeholder modules, but the easy patch doesn’t seem to get us very far (e.g., without an image file to parse, we don’t get the sections).
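
To make the placeholder idea concrete, here is a minimal sketch that builds one synthetic section per module from the base address and size recorded in the minidump module list, which is enough to map an address back to a module name. Everything here is hypothetical (DumpModuleEntry, PlaceholderModule, MakePlaceholder, FindModuleForAddress, and the ".module_image" section name); it is not how LLDB models modules, and it provides no symbols or real section layout.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical view of one entry in the minidump module list.
struct DumpModuleEntry {
  std::string name;
  uint64_t base; // load address of the image
  uint32_t size; // size of the image in memory
};

// A single synthetic section covering the half-open range [start, end).
struct PlaceholderSection {
  std::string name;
  uint64_t start;
  uint64_t end;
};

// A module with no backing file, described only by its address range.
struct PlaceholderModule {
  std::string name;
  PlaceholderSection section;
};

// Builds a placeholder whose only section spans the whole image.
PlaceholderModule MakePlaceholder(const DumpModuleEntry &entry) {
  PlaceholderModule module;
  module.name = entry.name;
  module.section.name = ".module_image"; // made-up name for this sketch
  module.section.start = entry.base;
  module.section.end = entry.base + entry.size;
  return module;
}

// Returns the module whose range contains addr, or nullptr if none does.
const PlaceholderModule *
FindModuleForAddress(const std::vector<PlaceholderModule> &modules,
                     uint64_t addr) {
  for (std::size_t i = 0; i < modules.size(); ++i) {
    const PlaceholderSection &s = modules[i].section;
    if (addr >= s.start && addr < s.end)
      return &modules[i];
  }
  return nullptr;
}
```

This would at least let a backtrace show "module + offset" for frames in modules that have no image file, though anything that depends on parsed sections or symbols would still be missing.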