[RFC] lldb integration with (user mode) qemu

Hello everyone,

I'd like to propose a new plugin for better lldb+qemu integration.

As you're probably aware qemu has an integrated gdb stub. Lldb is able
to communicate with it, but currently this process is somewhat tedious.
One has to manually start qemu, giving it a port number, and then
separately start lldb, and have it connect to that port.

The chief purpose of this feature would be to automate this behavior,
ideally to the point where one can just point lldb to an executable,
type "run", and everything would just work. It would take the form of a
platform plugin (PlatformQemuUser, perhaps). This would be a non-host,
always-connected plugin, and its heart would be the DebugProcess
method, which would ensure the emulator gets started when the user wants
to start debugging. It would operate the same way as our host platforms
do, except that it would start qemu instead of debug/lldb-server. Most
of the other methods would be implemented by delegating to the host
platform (as the process will be running on the host), possibly with
some minor adjustments like prepending sysroot to the paths, etc. (My
initial proof-of-concept implementation was 200 LOC.)

The plugin would be configured via multiple settings, which would let
the user specify the path to the emulator, the kind of cpu it should
emulate, the path to the system libraries, and any other arguments
that the user wishes to pass to the emulator. The user could then
configure it in their lldbinit file to match their system setup.
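
As a rough illustration of how such settings could translate into an emulator invocation, the sketch below assembles a user-mode qemu command line. The `QemuSettings` class and its field names are hypothetical and not part of lldb; `-g` (wait for a gdb connection on a port), `-L` (prefix for the interpreter and libraries) and `-cpu` are existing qemu user-mode options. This is only a sketch under those assumptions, not the plugin's implementation.

```python
from dataclasses import dataclass, field
from typing import List, Sequence


@dataclass
class QemuSettings:
    # Hypothetical setting names, used for illustration only.
    emulator_path: str                       # e.g. /usr/bin/qemu-aarch64
    cpu: str = ""                            # passed via -cpu when non-empty
    sysroot: str = ""                        # passed via -L when non-empty
    extra_args: List[str] = field(default_factory=list)


def build_qemu_argv(settings: QemuSettings, port: int, program: str,
                    program_args: Sequence[str] = ()) -> List[str]:
    """Return the argv that starts the emulator with its gdb stub on `port`."""
    argv = [settings.emulator_path, "-g", str(port)]
    if settings.cpu:
        argv += ["-cpu", settings.cpu]
    if settings.sysroot:
        argv += ["-L", settings.sysroot]
    argv += settings.extra_args
    # The program and its own arguments always come last.
    argv.append(program)
    argv += list(program_args)
    return argv
```

Starting the resulting command by hand and then connecting lldb to the chosen port is the manual procedure described above; the plugin would perform both steps itself.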

The needs of this plugin should match the existing Platform abstraction
fairly well, so I don't anticipate (*) the need to add new entry points
or modify existing ones. There is one tricky aspect which I see, and it
relates to platform selection. Our current platform selection code gives
each platform instance (while preferring the current platform) a chance
to "claim" an executable, and aborts if the choice is ambiguous. The
introduction of a qemu platform would introduce such an ambiguity, since
(when running on a linux host) a linux executable would be claimed by
both the qemu plugin and the existing remote-linux platform. This would
prevent "target create arm-linux.exe" from working out-of-the-box.

To resolve this, I'd like to create some kind of a mechanism to give
preference to some plugin. This could either be something internal,
where a plugin indicates "strong" preference for an executable (the qemu
platform could e.g. do this when the user sets the emulator path, the
remote platform when it is connected), or some external mechanism like a
global setting giving the preferred platform order. I'd very much like
to hear your thoughts on this.
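
To make the "external mechanism" variant more concrete, here is a small sketch of the selection logic as described above, with an optional preferred-order list used only to break ties. All names (`select_platform`, `AmbiguousPlatform`, the `claims` callback) are hypothetical; the real logic lives in lldb's C++ code and may differ in detail.

```python
class AmbiguousPlatform(Exception):
    """Raised when several platforms claim an executable and none is preferred."""


def select_platform(candidates, current, claims, preferred_order=()):
    """Pick a platform name for an executable.

    candidates:      all known platform names
    current:         the currently selected platform, consulted first
    claims:          callable(name) -> bool, True if the platform claims the file
    preferred_order: hypothetical global setting used only as a tie-breaker
    """
    if claims(current):
        return current
    claimants = [p for p in candidates if p != current and claims(p)]
    if not claimants:
        return None
    if len(claimants) == 1:
        return claimants[0]
    # More than one claimant: consult the preference list before giving up.
    for name in preferred_order:
        if name in claimants:
            return name
    raise AmbiguousPlatform(", ".join(claimants))
```

With `preferred_order=["qemu-user"]`, an arm linux executable opened on a linux host would go to the qemu platform instead of triggering the ambiguity.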

I'm also not sure how to handle the case of multiple emulated
architectures. Qemu can emulate any processor architecture (of those
that lldb supports, anyway), but the path to the emulator, sysroot, and
probably other settings as well are going to be different. I see two
possible ways to go about this:

a) have just a single set of settings, effectively limiting the user to
emulating just a single architecture per session. While it would most
likely be enough for most use cases, this kind of limitation seems
artificial. It would also likely require the introduction of another
setting, which would specify which architecture the plugin should
actually emulate (and return from GetSupportedArchitectureAtIndex,
etc.). On the flip side, this would be consistent with how our
remote-plugins work, although there it is given by the need to connect
to something, and the supported architecture is then determined by the
remote machine.

b) have a separate platform instance for each architecture. This would
be a more general solution, but it would mean that our "platform list"
output would double, and half of it would consist of qemu platforms.

As far as testing is concerned I'm planning to reuse parts of our
gdb-client test suite for this. Namely, I want to write a small python
script which would act as a fake emulator. It would be sending out
pre-programmed gdb-remote responses, much like our client test suite
does. Since the main purpose of this is to validate that the emulator
was started with the correct arguments, I don't expect the need for
emulating any complex behavior -- the existing client classes should
completely suffice.
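
For illustration, a fake emulator along these lines could be as small as the sketch below. It records the arguments it was started with and answers gdb-remote packets from a canned table. The packet framing (`$payload#checksum`, with the checksum being the payload bytes summed modulo 256 as two hex digits) follows the gdb remote serial protocol; the function names and the contents of the canned table are assumptions made for this example, and socket handling is left out.

```python
import json

# Canned replies keyed by request payload; anything else gets an empty
# reply, which the protocol defines as "packet not supported".
CANNED_REPLIES = {"?": "S05"}


def frame_packet(payload: str) -> bytes:
    """Wrap a payload as $payload#xx, xx being the modulo-256 checksum."""
    checksum = sum(payload.encode("ascii")) % 256
    return f"${payload}#{checksum:02x}".encode("ascii")


def parse_packet(data: bytes) -> str:
    """Extract and verify the payload of a single framed packet."""
    if len(data) < 4 or not data.startswith(b"$") or data[-3:-2] != b"#":
        raise ValueError("malformed packet")
    payload = data[1:-3]
    if int(data[-2:], 16) != sum(payload) % 256:
        raise ValueError("bad checksum")
    return payload.decode("ascii")


def respond(data: bytes) -> bytes:
    """Return the framed canned reply for one incoming packet."""
    return frame_packet(CANNED_REPLIES.get(parse_packet(data), ""))


def describe_invocation(argv) -> str:
    """Serialize the emulator's argv so a test can assert on it later."""
    return json.dumps({"argv": list(argv)})
```

A test would then start lldb with the emulator path pointing at such a script and check the recorded arguments.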

If you got all the way here, I want to thank you for taking your time to
read this, and urge you to let me know what you think.

regards,
pavel

(*) There is one refactor of the Platform class implementations that I'd like to do first, but this (a) is not strictly necessary for this; and (b) is valuable independently of this RFC; so I am leaving that for a separate discussion.

Glad to hear the gdb server in qemu plays nicely with lldb. Perhaps
some of that is the compatibility work that has been going on.

The introduction of a qemu platform would introduce such an ambiguity, since (when running on a linux host) a linux executable would be claimed by both the qemu plugin and the existing remote-linux platform. This would prevent "target create arm-linux.exe" from working out-of-the-box.

I assume you wouldn't get a 3 way tie here because in connecting to a
remote-linux you've "disconnected" the host platform, right?

To resolve this, I'd like to create some kind of a mechanism to give preference to some plugin.

This choosing of plugin, does it mostly take place automatically at
the moment or is there a good spot where we could say "X and Y could
load this file, please choose one/resolve the tie"?

My first thought for automatic resolve is a native/emulator/remote
sort of hierarchy if you were going to order them. (with some nice
message "preferring X to Y because..." when it starts up)

a) have just a single set of settings, effectively limiting the user to emulating just a single architecture per session. While it would most likely be enough for most use cases, this kind of limitation seems artificial.

One aspect here is the way you configure them if you want to use many
architectures of qemu-user.

If I have only one platform, I set qemu-user.foo to some Arm focused
value. Then if I want to work on AArch64 I edit my lldbinit to switch
it. (or have many init files)
If there's one platform per arch I can set qemu-arm.foo and qemu-aarch64.foo.

Not much between them without having a specific use case for it. You
could work around either in various ways.

Wouldn't most of the platform entries just be subclasses of some
generic qemu-user-platform? So code wise it wouldn't be that much
extra to add them.
You could say it's bad to list qemu-xyz-platform when that isn't
installed, but then again, lldb lists a "local Mac OSX user platform
plug in" even on Linux. So not a big deal.
(and an apt install of qemu-user gives me every arch so easy to fix)

And you have to handle the ambiguity issue either way.

Thanks for reading this. Responses inline.

Glad to hear the gdb server in qemu plays nicely with lldb. Perhaps
some of that is the compatibility work that has been going on.

The introduction of a qemu platform would introduce such an ambiguity, since (when running on a linux host) a linux executable would be claimed by both the qemu plugin and the existing remote-linux platform. This would prevent "target create arm-linux.exe" from working out-of-the-box.

I assume you wouldn't get a 3 way tie here because in connecting to a
remote-linux you've "disconnected" the host platform, right?

IIUC, the host platform is not consulted at this step. It can only claim an executable when it is selected as the "current" platform, because the current platform is consulted first. (And this is what happens in most "normal" debug sessions.)

So there wouldn't be a three-way tie, but if you actually wanted to debug a native executable under qemu, you would have to explicitly select the qemu platform. This is the same thing that already happens when you want to debug a native executable remotely, but there it's kind of expected because you need to connect to the remote machine anyway.

To resolve this, I'd like to create some kind of a mechanism to give preference to some plugin.

This choosing of plugin, does it mostly take place automatically at
the moment or is there a good spot where we could say "X and Y could
load this file, please choose one/resolve the tie"?

This currently happens in TargetList::CreateTargetInternal, and one cannot create a prompt there, as that code is also used by the non-interactive paths (SBDebugger::CreateTarget, for instance). But I like the idea, and it may not be too difficult to refactor this to make that work. (I am imagining changing this code to use llvm::Error, and then creating a special AmbiguousPlatformError type, which could get caught by the command line code and transformed into a prompt.)
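
The shape of that refactor can be sketched in a language-neutral way. The snippet below uses Python exceptions as a stand-in for llvm::Error: the shared, non-interactive code path only reports the ambiguity, and a separate command-line wrapper turns it into a prompt. `AmbiguousPlatformError` is the name imagined above; everything else here is hypothetical.

```python
class AmbiguousPlatformError(Exception):
    """Carries the platforms that all claimed the executable."""

    def __init__(self, candidates):
        super().__init__("ambiguous platform: " + ", ".join(candidates))
        self.candidates = list(candidates)


def create_target(claimants):
    """Shared core (also used by API callers): never prompts."""
    if not claimants:
        raise LookupError("no platform claims this executable")
    if len(claimants) == 1:
        return claimants[0]
    raise AmbiguousPlatformError(claimants)


def create_target_interactive(claimants, ask):
    """Command-line wrapper: converts the ambiguity into a prompt.

    `ask` is a callable that receives the candidate list and returns the
    user's choice; in a real session it would read from the terminal.
    """
    try:
        return create_target(claimants)
    except AmbiguousPlatformError as err:
        choice = ask(err.candidates)
        if choice not in err.candidates:
            raise
        return choice
```

Non-interactive callers keep getting an error they can inspect, while the interactive path gets a chance to resolve the tie.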

My first thought for automatic resolve is a native/emulator/remote
sort of hierarchy if you were going to order them. (with some nice
message "preferring X to Y because..." when it starts up)

Do you mean like, each platform would advertise its kind (host/emulator/remote), and the relative kind priorities would be hardcoded in lldb?

a) have just a single set of settings, effectively limiting the user to emulating just a single architecture per session. While it would most likely be enough for most use cases, this kind of limitation seems artificial.

One aspect here is the way you configure them if you want to use many
architectures of qemu-user.

If I have only one platform, I set qemu-user.foo to some Arm focused
value. Then if I want to work on AArch64 I edit my lldbinit to switch
it. (or have many init files)
If there's one platform per arch I can set qemu-arm.foo and qemu-aarch64.foo.

Yes, those were my thoughts as well, but I am unsure how often would that occur in practice (I'm pretty sure I'll need to care for only one arch for my use case).

Not much between them without having a specific use case for it. You
could work around either in various ways.

Wouldn't most of the platform entries just be subclasses of some
generic qemu-user-platform? So code wise it wouldn't be that much
extra to add them.

Yeah, it's possible they wouldn't even be actual classes, just different instances of the same class.

You could say it's bad to list qemu-xyz-platform when that isn't
installed, but then again, lldb lists a "local Mac OSX user platform
plug in" even on Linux. So not a big deal.

Yeah, I don't think it's a big deal either. The reason I'm asking this is to try to create a consistent experience. For example, we have a bunch of PlatformApple{Watch,TV,...}{Remote,Simulator} platforms (only available on apple hosts). These don't differ in architectures, but they do differ in the environment part of the triples, so you (almost) have a one-to-one mapping between triples and architectures.

However, they're also automatically configured (based on the xcode installation), and they don't create ambiguities (simulators have separate triples), so I'm not sure what kind of parallels to draw from that.

pl

So there wouldn't be a three-way tie, but if you actually wanted to debug a native executable under qemu, you would have to explicitly select the qemu platform. This is the same thing that already happens when you want to debug a native executable remotely, but there it's kind of expected because you need to connect to the remote machine anyway.

Since we already have the host vs remote with native arch situation,
is it any different to ask users to do "platform select qemu-user" if
they really want qemu-user? Preferring host to qemu-user seems
logical.
For non native it would come up when you're currently connected to a
remote but want qemu-user on the host. So again you explicitly select
qemu-user.

Does that solve all the ambiguous situations?

Do you mean like, each platform would advertise its kind (host/emulator/remote), and the relative kind priorities would be hardcoded in lldb?

Yes. Though I think that opens more issues than it solves. Host being
higher priority than everything else seems ok. Then you have to think
about how many emulation/connection hops each one has, but sometimes
that's not the metric that matters. E.g. an armv7 file on a Mac would
make more sense going to an Apple Watch simulator than qemu-user.

Yes, those were my thoughts as well, but I am unsure how often would that occur in practice (I'm pretty sure I'll need to care for only one arch for my use case).

Seems like starting with a single "qemu-user" platform is the way to
go for now. When it's not configured it just won't be able to claim
anything.

The hypothetical I had was shipping a development kit that included
qemu-arch1 and qemu-arch2. Would you rather ship one init file that
can set all those settings at once (since each one has its own
namespace) or symlink lldb-arch1 to be "lldb -s <init with settings
for arch1>". However anyone who's looking at shipping lldb has control
of the sources so they could make their own platform entries. Or
choose a command line based on an IDE setting.

So there wouldn't be a three-way tie, but if you actually wanted to debug a native executable under qemu, you would have to explicitly select the qemu platform. This is the same thing that already happens when you want to debug a native executable remotely, but there it's kind of expected because you need to connect to the remote machine anyway.

Since we already have the host vs remote with native arch situation,
is it any different to ask users to do "platform select qemu-user" if
they really want qemu-user? Preferring host to qemu-user seems
logical.

It does. I am perfectly fine with preferring host over qemu-user.

For non native it would come up when you're currently connected to a
remote but want qemu-user on the host. So again you explicitly select
qemu-user.

Does that solve all the ambiguous situations?

I don't think it does. Or at least I'm not sure how do you propose to solve them (who is "you" in the paragraph above?).

What currently happens is that when you open a non-native (say, linux) executable, the appropriate remote platform gets selected automatically.
$ lldb aarch64/bin/lldb
(lldb) target create "aarch64/bin/lldb"
Current executable set to 'aarch64/bin/lldb' (aarch64).
(lldb) platform status
   Platform: remote-linux
  Connected: no

That happens because the remote-linux platform unconditionally claims the non-native executables (well.. it claims all of them, but it is overridden by the host platform for native ones). It does not check whether it is connected or anything like that.

And I think that behavior is fine, because for a lot of actions you don't actually need to connect to anything. For example, you usually don't connect anywhere when inspecting core files (though you can do that, and it would mean lldb can download relevant shared libraries). And you can always connect at a later time, if needed.

Now the question is what should the new platform do. If it followed the remote-linux pattern, it would also claim those executables unconditionally, and we would always have a conflict (*).

Or, it can try to be a bit less greedy and claim an executable only when it is configured. That would mean that in a clean state, everything would behave as it does now. However, the conflict would reappear as soon as the platform is configured (which will be always, for our users). The idea behind this (sub)feature was that there would be a way to configure lldb so that the qemu plugin comes out on top (of remote-linux, not host).

If we do have a prompt, then this may not be so critical, though I expect that most users would still prefer it if we automatically selected qemu.

Do you mean like, each platform would advertise its kind (host/emulator/remote), and the relative kind priorities would be hardcoded in lldb?

Yes. Though I think that opens more issues than it solves. Host being
higher priority than everything else seems ok. Then you have to think
about how many emulation/connection hops each one has, but sometimes
that's not the metric that matters. E.g. an armv7 file on a Mac would
make more sense going to an Apple Watch simulator than qemu-user.

Yes, those were my thoughts as well, but I am unsure how often would that occur in practice (I'm pretty sure I'll need to care for only one arch for my use case).

Seems like starting with a single "qemu-user" platform is the way to
go for now. When it's not configured it just won't be able to claim
anything.

The hypothetical I had was shipping a development kit that included
qemu-arch1 and qemu-arch2. Would you rather ship one init file that
can set all those settings at once (since each one has its own
namespace) or symlink lldb-arch1 to be "lldb -s <init with settings
for arch1>". However anyone who's looking at shipping lldb has control
of the sources so they could make their own platform entries. Or
choose a command line based on an IDE setting.

Yes, that's the hypothetical I had in mind too. I don't think we will be doing it, but I can imagine _somebody_ wanting to do it.

pl

So there wouldn't be a three-way tie, but if you actually wanted to debug a native executable under qemu, you would have to explicitly select the qemu platform. This is the same thing that already happens when you want to debug a native executable remotely, but there it's kind of expected because you need to connect to the remote machine anyway.

Since we already have the host vs remote with native arch situation,
is it any different to ask users to do "platform select qemu-user" if
they really want qemu-user? Preferring host to qemu-user seems
logical.

It does. I am perfectly fine with preferring host over qemu-user.

For non native it would come up when you're currently connected to a
remote but want qemu-user on the host. So again you explicitly select
qemu-user.

Does that solve all the ambiguous situations?

I don't think it does. Or at least I'm not sure how do you propose to solve them (who is "you" in the paragraph above?).

What currently happens is that when you open a non-native (say, linux) executable, the appropriate remote platform gets selected automatically.
$ lldb aarch64/bin/lldb
(lldb) target create "aarch64/bin/lldb"
Current executable set to 'aarch64/bin/lldb' (aarch64).
(lldb) platform status
Platform: remote-linux
Connected: no

That happens because the remote-linux platform unconditionally claims the non-native executables (well.. it claims all of them, but it is overridden by the host platform for native ones). It does not check whether it is connected or anything like that.

And I think that behavior is fine, because for a lot of actions you don't actually need to connect to anything. For example, you usually don't connect anywhere when inspecting core files (though you can do that, and it would mean lldb can download relevant shared libraries). And you can always connect at a later time, if needed.

Now the question is what should the new platform do. If it followed the remote-linux pattern, it would also claim those executables unconditionally, and we would always have a conflict (*).

I meant to add an explanation for this asterisk. I was going to say that in the current setup, I believe we would just choose whichever platform comes first (which is the first platform to get initialized), but that is not that great -- ideally, our behavior should not depend on the initialization order.

Or, it can try to be a bit less greedy and claim an executable only when it is configured. That would mean that in a clean state, everything would behave as it does now. However, the conflict would reappear as soon as the platform is configured (which will be always, for our users). The idea behind this (sub)feature was that there would be a way to configure lldb so that the qemu plugin comes out on top (of remote-linux, not host).

If we do have a prompt, then this may not be so critical, though I expect that most users would still prefer it if we automatically selected qemu.

I also realized that implementing the prompt for the case where the executable is specified on the command line will be a bit tricky, because at that point lldb hasn't gone interactive yet. I don't think there's any reason why it shouldn't prompt a user in this case, but doing it may require refactoring some of our startup code.

I don't think it does. Or at least I'm not sure how do you propose to solve them (who is "you" in the paragraph above?).

I tend to use "you" meaning "you or I" in hypotheticals. Same thing as
"if I had" but for whatever reason I phrase it like that to include
the other person, and it does have its ambiguities.

What I was proposing is, if I was correct (which I wasn't) then having
the user "platform select qemu-user" would solve things. (which it
doesn't)

What currently happens is that when you open a non-native (say, linux) executable, the appropriate remote platform gets selected automatically.

...because of this. I see where the blocker is now. I thought remote
platforms had to be selected before they could claim.

If we do have a prompt, then this may not be so critical, though I expect that most users would still prefer it if we automatically selected qemu.

Seems reasonable to put qemu-user above remote-linux. Only claiming if
qemu-user has been configured sufficiently. I guess architecture would
be the minimum setting, given we can't find the qemu binary without
it.

Is this similar in any way to how the different OS remote platforms
work? For example there is a remote-linux and a remote-netbsd, is
there enough information in the program file itself to pick just one
or is there an implicit default there too?
(I see that platform CreateInstance gets an ArchSpec but having
trouble finding where that comes from)

I don't think it does. Or at least I'm not sure how do you propose to solve them (who is "you" in the paragraph above?).

I tend to use "you" meaning "you or I" in hypotheticals. Same thing as
"if I had" but for whatever reason I phrase it like that to include
the other person, and it does have its ambiguities.

What I was proposing is, if I was correct (which I wasn't) then having
the user "platform select qemu-user" would solve things. (which it
doesn't)

Great, thanks for clarifying.

If we do have a prompt, then this may not be so critical, though I expect that most users would still prefer it if we automatically selected qemu.

Seems reasonable to put qemu-user above remote-linux. Only claiming if
qemu-user has been configured sufficiently. I guess architecture would
be the minimum setting, given we can't find the qemu binary without
it.

Yeah, I think we can start with that.

Is this similar in any way to how the different OS remote platforms
work? For example there is a remote-linux and a remote-netbsd, is
there enough information in the program file itself to pick just one
or is there an implicit default there too?

This is actually one of the pain points in lldb. The overall design assumes that you can precisely identify the platform(triple) that the file is meant to be run on by looking at the object file. This is definitely true on Apple platforms (where lldb originated) as even the "simulator" binaries have their own triples.

The situation is more fuzzy in the elf world. The *bsd OSes have (and use) an ELFOSABI_ constant to identify the binary. Linux uses ELFOSABI_NONE even though there is a dedicated constant it could use (there's probably an interesting story in there). This makes it hard to positively identify a file as a linux binary, but we can mostly get away with it because there's just one OS like that. Having some mechanism to resolve ambiguities might also help with that.
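
A small sketch of what this identification looks like at the byte level: the OS/ABI tag is a single byte at offset 7 (EI_OSABI) of the ELF identification block. The numeric values below come from the ELF specification; the helper names and the "fall back to linux" policy are only an illustration of the reasoning above, not lldb's actual code (which also consults ELF notes).

```python
EI_OSABI = 7  # index of the OS/ABI byte within e_ident

ELFOSABI_NONE = 0
ELFOSABI_NETBSD = 2
ELFOSABI_LINUX = 3      # the dedicated constant (also known as ELFOSABI_GNU)
ELFOSABI_FREEBSD = 9
ELFOSABI_OPENBSD = 12

OSABI_TO_OS = {
    ELFOSABI_NETBSD: "netbsd",
    ELFOSABI_LINUX: "linux",
    ELFOSABI_FREEBSD: "freebsd",
    ELFOSABI_OPENBSD: "openbsd",
}


def elf_osabi(header: bytes) -> int:
    """Return the EI_OSABI byte from the first 16 bytes of an ELF file."""
    if len(header) < 16 or header[:4] != b"\x7fELF":
        raise ValueError("not an ELF header")
    return header[EI_OSABI]


def guess_os(header: bytes) -> str:
    """Guess the target OS; ELFOSABI_NONE is assumed to mean linux."""
    osabi = elf_osabi(header)
    if osabi == ELFOSABI_NONE:
        # Not a positive identification, just the only common OS that
        # leaves the field unset.
        return "linux"
    return OSABI_TO_OS.get(osabi, "unknown")
```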

I'm also not sure how much do the OSes actually validate the contents of the elf headers. I wouldn't be surprised if one could create "polyglot" elf binaries that can run on multiple operating systems.

(I see that platform CreateInstance gets an ArchSpec but having
trouble finding where that comes from)

It gets called from TargetList::CreateTargetInternal->Platform::CreateTargetForArchitecture->Platform::Create. There may be other callers, but I think this is the relevant one.

pl

Yeah, I think we can start with that.

No need to consider this now but it could easily be adapted to
qemu-system as well. Spinning up qemu-system for Cortex-M debug might
be a future use case. Once you've got a "run this program and connect
to this port" platform you can sub in almost anything that talks GDB.

Having some mechanism to resolve ambiguities might also help with that.

Cool, I figured someone would have thought about it on the ELF side.
So as long as Linux remains the standout things work ok.

Most importantly, the way it's currently handled doesn't contradict
anything you want to do here.

Please make sure you don't forget that bsd-user also exists (and after
living in a fork for many years for various boring reasons is in the
middle of being upstreamed), so don't tie it entirely to remote-linux.

Jess

I actually did consider this, but it was not clear to me how this would tie in to the rest of lldb. The "run qemu and connect to it" part could be reused, of course, but what else? What would be the "executable" that we "run" in system mode? Is it the kernel image? Disk image?

I have a feeling there wouldn't be much added value in this "platform" over, say, a python command which implements the start-up dance. OTOH, a proper user-mode platform enables one to hook in to all the usual lldb goodies like specifying the application's command line arguments and environment variables, and can help with locating shared libraries, etc.

pl

I am. In fact, one of the reasons I haven't started putting up patches yet is because I'm trying to figure out the best way to handle this. :)

My understanding (let me know if I'm wrong) is that user-mode qemu can emulate a different architecture, but not a different os. So, the idea is that the "qemu" platform would forward all operations that don't need special handling to the "host" platform. That would mean you get freebsd behavior when running on freebsd, etc.

pl

I actually did consider this, but it was not clear to me how this would tie in to the rest of lldb.
The "run qemu and connect to it" part could be reused, of course, but what else?

That part seems like a good start. I'm sure a lot of other things
would break/not work like you said but if I was shipping a modified
lldb anyway maybe I'd put the effort in to make it work nicely.

Again not something this work needs to consider. Just me relating the
idea to something I have more experience with and has some parallels
with the qemu-user idea.

For anyone following along, I have now posted the first patch for this feature.

pl

Hello everyone,

I'd like to propose a new plugin for better lldb+qemu integration.

As you're probably aware qemu has an integrated gdb stub. Lldb is able
to communicate with it, but currently this process is somewhat tedious.
One has to manually start qemu, giving it a port number, and then
separately start lldb, and have it connect to that port.

The chief purpose of this feature would be to automate this behavior,
ideally to the point where one can just point lldb to an executable,
type "run", and everything would just work. It would take the form of a
platform plugin (PlatformQemuUser, perhaps). This would be a non-host,
always-connected plugin, and its heart would be the DebugProcess
method, which would ensure the emulator gets started when the user wants
to start debugging. It would operate the same way as our host platforms
do, except that it would start qemu instead of debug/lldb-server. Most
of the other methods would be implemented by delegating to the host
platform (as the process will be running on the host), possibly with
some minor adjustments like prepending sysroot to the paths, etc. (My
initial proof-of-concept implementation was 200 LOC.)

The plugin would be configured via multiple settings, which would let
the user specify the path to the emulator, the kind of cpu it should
emulate, the path to the system libraries, and any other arguments
that the user wishes to pass to the emulator. The user could then
configure it in their lldbinit file to match their system setup.

Yeah, I would create a "PlatformQemuEmulator" and allow multiple instances of this to be created. The setup for the architecture would then happen during the "platform connect" command. The "platform connect" command has different options for each platform, so you can customize the platform connect options to make sense for QEMU. Something like:

(lldb) platform select qemu-emulator
(lldb) platform connect --arch arm64 --sysroot /path/to/arm64/qemu/sysroot --emulator-path /path/to/arm64/emulator ...

The needs of this plugin should match the existing Platform abstraction
fairly well, so I don't anticipate (*) the need to add new entry points
or modify existing ones.

Totally fine to add new virtual functions if necessary.

There is one tricky aspect which I see, and it
relates to platform selection. Our current platform selection code gives
each platform instance (while preferring the current platform) a chance
to "claim" an executable, and aborts if the choice is ambiguous. The
introduction of a qemu platform would introduce such an ambiguity, since
(when running on a linux host) a linux executable would be claimed by
both the qemu plugin and the existing remote-linux platform. This would
prevent "target create arm-linux.exe" from working out-of-the-box.

To resolve this, I'd like to create some kind of a mechanism to give
preference to some plugin. This could either be something internal,
where a plugin indicates "strong" preference for an executable (the qemu
platform could e.g. do this when the user sets the emulator path, the
remote platform when it is connected), or some external mechanism like a
global setting giving the preferred platform order. I'd very much like
to hear your thoughts on this.

Seems like selecting the platform first and then connecting to it, specifying the architecture via "platform connect --arch <triple>", would allow the currently selected QEMU platform to either accept the next command:

(lldb) file arm-linux.exe

if the arch matches, or reject it if the architecture is wrong.

I'm also not sure how to handle the case of multiple emulated
architectures. Qemu can emulate any processor architecture (of those
that lldb supports, anyway), but the path to the emulator, sysroot, and
probably other settings as well are going to be different. I see two
possible ways to go about this:

a) have just a single set of settings, effectively limiting the user to
emulating just a single architecture per session. While it would most
likely be enough for most use cases, this kind of limitation seems
artificial. It would also likely require the introduction of another
setting, which would specify which architecture the plugin should
actually emulate (and return from GetSupportedArchitectureAtIndex,
etc.). On the flip side, this would be consistent with how our
remote-plugins work, although there it is given by the need to connect
to something, and the supported architecture is then determined by the
remote machine.

b) have a separate platform instance for each architecture. This would
be a more general solution, but it would mean that our "platform list"
output would double, and half of it would consist of qemu platforms.

I would vote for one platform plug-in that can have multiple instances created where each instance is configured by a call to "platform connect". Then you can do:

(lldb) platform select qemu-emulator
(lldb) platform connect --arch arm64 --sysroot /path/to/arm64/qemu/sysroot --emulator-path /path/to/arm64/emulator ...
(lldb) platform select qemu-emulator
(lldb) platform connect --arch x86_64 --sysroot /path/to/x86_64/qemu/sysroot --emulator-path /path/to/x86_64/emulator ...
(lldb) file arm-linux-arm64.exe
(lldb) file arm-linux-x86_64.exe

Each new target would then find the platform instances that were created, although some notion of precedence would need to be added to LLDB for this. So maybe we keep a stack of the most recently selected platform instances and always check them in order when a new target is created. In the above case, after the two platform select calls, we would have a stack like:

[0] qemu-emulator (x86_64)
[1] qemu-emulator (arm64)
[2] host

And each new target would run through the platforms in order of most recently selected first. And we would need to be able to check all instances of a given platform as each one may or may not be compatible with each new target that is created.
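
The stack idea can be sketched in a few lines. The classes below are purely illustrative (lldb has no `PlatformStack`); they only show the "most recently selected first, check every instance" lookup order described above.

```python
class PlatformInstance:
    """One configured platform instance, e.g. qemu-emulator for arm64."""

    def __init__(self, name, archs):
        self.name = name
        self.archs = set(archs)

    def is_compatible(self, arch):
        return arch in self.archs


class PlatformStack:
    """Platform instances ordered from most to least recently selected."""

    def __init__(self, host):
        self._stack = [host]

    def select(self, platform):
        # Selecting a platform (new or existing) moves it to the top.
        if platform in self._stack:
            self._stack.remove(platform)
        self._stack.insert(0, platform)

    def platform_for(self, arch):
        # Every instance gets a chance, most recently selected first.
        for platform in self._stack:
            if platform.is_compatible(arch):
                return platform
        return None
```

With a host of x86_64 and the two qemu-emulator instances from the example selected in that order, an arm64 target resolves to the arm64 instance and an x86_64 target resolves to the x86_64 emulator rather than the host, because the emulator was selected more recently.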

I would also vote for such binaries to be marked in a way that would allow LLDB to auto select the right platform. So if the arm-linux.exe binary had some ELF notes inside of it that could specify the details of the QEMU needed, then we can just do:

$ lldb
(lldb) file arm-linux.exe

The details in the ELF file would help us to know that we wanted to use the QEMU platform, by making sure that the triple that we extract from the ObjectFileELF has QEMU specified in the environment. So the triple could be something like:

arm64-gnu-linux-qemu

ObjectFileELF knows how to dig through ELF files and look at the OSABI in the ELF header + ELF notes to help refine the triple. Darwin binaries do this really well for macOS, iOS, watchOS and tvOS. Each mach-o file has enough info to create a unique triple that allows us to just create a target and we will select the right platform.

Sorry about the delay. I am still working on this, but I’ve been approaching it from the “multiple platforms” angle/thread.

Given the direction that the “Multiple platforms with the same name” thread is going, I think this direction makes perfect sense.

I am not sure about the usage of platform connect though. It kinda makes sense, but I think it is stretching the concept of “connecting”. Given that we already have the platform settings command, I think it would be more natural to configure these, well… settings using that. The only setting/argument we support right now is --working-dir, but we could add a way for a platform to expose arbitrary settings using this command.

How does that sound?

Something like this might make sense, but I’ve actually found that the current platform selection behavior is quite sufficient for my use case. Or, to put it another way: I currently have bigger problems than platform selection. I may revisit this idea if it gets to the top of my stack.

I’m afraid this wouldn’t work (for us). The entire concept of qemu emulation is based on the idea that you can just take a random binary that can run natively on a foreign architecture, and then run it inside the emulator.

Internally, we could recognise an emulation scenario by the presence of a qemu binary at some location relative to the target executable, but this is a very custom thing, and I don’t know how I would go about expressing this in a way that makes sense upstream.