Simultaneous multiple target debugging

Has anybody done any work on integrating features into LLDB to allow for 'meaningful' simultaneous multiple target debugging? There are various scenarios in which this is a very valuable feature:

1) coprocessor debugging, in single-process systems (e.g. an embedded DSP alongside, say, a host CPU core)
2) graphical debugging, e.g. games: ideally you want to be able to debug the CPU code alongside any GPU workgroups, and have a single interface to any shared resources such as memory.

We've done work like this on LLDB in the past; it hasn't been contributed back because we couldn't do so for commercial reasons (and it's not in a state to contribute back, either). However, in the future I think this will become a 'killer app' feature for LLDB and we should be planning to support it.

At the moment we can have multiple targets, processes etc. running in an LLDB session. However, I am failing to see any system for communication between, and interpretation of, multiple targets as a whole. If we take the DSP/CPU situation, I may be watching a CPU memory location whilst at the same time single-stepping through the DSP. How this situation would behave in LLDB as it stands is currently undefined. From what I can see, it's quite hard to use the current independent target framework to achieve a meaningful debugging session.

It's as though we'd want some sort of session object that can take multiple targets together and understand how they interact, so as to achieve well-defined behaviour for the debug session as a whole. For example, in the DSP/CPU scenario, the session object would understand that the DSP has access to the CPU memory; so if we're currently single-stepping on the DSP, it would let a CPU watchpoint event through to the DSP session, with the ability to switch target.
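To make the DSP/CPU example concrete, here is a minimal sketch in plain Python of what such a session object might decide. Everything in it (`DebugSession`, `StopEvent`, the `shares_memory_with` table) is a hypothetical name invented for illustration, not an existing LLDB class or API.

```python
from dataclasses import dataclass, field


@dataclass
class StopEvent:
    target: str  # name of the target that stopped
    reason: str  # e.g. "watchpoint", "breakpoint", "step"


@dataclass
class DebugSession:
    """Hypothetical coordinator; not an existing LLDB class."""

    # Maps a target to the set of targets whose memory it can access.
    shares_memory_with: dict = field(default_factory=dict)
    active: str = ""

    def should_deliver(self, event: StopEvent) -> bool:
        # Events on the active target are always delivered.
        if event.target == self.active:
            return True
        # A watchpoint on another target matters only if the active
        # target can touch that target's memory.
        visible = self.shares_memory_with.get(self.active, set())
        return event.reason == "watchpoint" and event.target in visible

    def handle(self, event: StopEvent, switch: bool = False) -> str:
        if not self.should_deliver(event):
            return "ignored"
        if switch:
            self.active = event.target
        return f"{event.target} stopped ({event.reason})"


# The DSP can see CPU memory, and we are single-stepping on the DSP.
session = DebugSession(shares_memory_with={"dsp": {"cpu"}}, active="dsp")
report = session.handle(StopEvent("cpu", "watchpoint"))
```

Here the CPU watchpoint is let through while the DSP stays selected; passing `switch=True` would model the "ability to switch target" part.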

There are many more things we'd need to allow communication between. A quick example: we have an LLDB version here that supports non-stop mode debugging (see https://sourceware.org/gdb/current/onlinedocs/gdb/Non_002dStop-Mode.html - and we _will_ contribute this back). At the moment, when you are stepping through one thread and a breakpoint hits in another, the result is a bit nasty: LLDB simply switches to whichever thread ID is greater. Since this sort of usability issue exists even with a single target, we may need to extract this into some sort of policy system that targets (and these theoretical session objects) can use to decide how to handle certain event situations.
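A policy system of this kind could be as small as a pluggable function that picks the selection when several stops arrive together. The sketch below is plain Python with hypothetical names (`Stop`, `highest_thread_id`, `stay_on_current`); the first policy only mirrors the behaviour as described in this post, which I have not checked against LLDB's actual selection logic.

```python
from typing import Callable, List, NamedTuple


class Stop(NamedTuple):
    thread_id: int
    reason: str  # "step" or "breakpoint" in this sketch


# A policy receives the currently selected thread ID and the simultaneous
# stops, and returns the stop that should become the selection.
Policy = Callable[[int, List[Stop]], Stop]


def highest_thread_id(current: int, stops: List[Stop]) -> Stop:
    # The behaviour described above: the greater thread ID wins.
    return max(stops, key=lambda s: s.thread_id)


def stay_on_current(current: int, stops: List[Stop]) -> Stop:
    # Prefer the thread the user is already stepping, if it stopped too.
    for stop in stops:
        if stop.thread_id == current:
            return stop
    return highest_thread_id(current, stops)


# Thread 3 finished a step at the same time as thread 7 hit a breakpoint.
stops = [Stop(3, "step"), Stop(7, "breakpoint")]
selected = stay_on_current(3, stops)
```

The same `Policy` signature could be handed to a target or to a session object, which is the point of pulling the decision out of the debugger core.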

Apologies if this is a bit of a brain dump. It's quite a complex concept, which is why I think the dialogue needs to start now: as I've mentioned, it's something we are actively doing at Codeplay, and when the time comes to push upstream we want to do so in a way the community thinks is valuable. There may be other viewpoints, like 'super debugservers' that can manage multiple targets and present them to LLDB as a single target, for example.

Any other opinions or thoughts out there? :)

Colin

In fact, would any folk be interested in a BOF session on this topic at the next meeting?

Hey Colin!

In fact, would any folk be interested in a BOF session on this topic at the next meeting?

I would certainly attend if given the opportunity :)

-Todd

We certainly designed lldb with this possibility in mind, but haven't had any need for it yet, so the support remains sketchy. Maybe the "Platform" is the right agent to know how to cons up a conjoint debug session of the sort you describe. But we lack the entity that would manage such a composite debug session once the platform produced it. That would be a great addition.

It would be interesting to see if we could use the same idea to do things like debug client/server IPC, so you could step across an IPC call from the client to the server. Note that this sort of thing would require more OS semantic knowledge (and probably more user configuration) than we tend to want to stick in debugserver, so I think this is properly done in lldb.

Jim

As Jim stated we have the architecture to support multiple target debugging, but we haven't done much with it right now.

The current state is that you can switch between targets and do stuff:

(lldb) file /bin/ls
(lldb) b malloc
(lldb) r
(lldb) file /bin/cat
(lldb) b free
(lldb) r

Now two targets are available:

(lldb) target list
(lldb) target select 0
(selects /bin/ls target)
(lldb) .... (do commands for /bin/ls target)
(lldb) target select 1
(selects /bin/cat target)
(lldb) .... (do commands for /bin/cat target)

We currently don't have anything that allows automatic switching of targets based on any criteria. If target 0 stops asynchronously while target 1 is selected, we print that something happened on target 0, but we don't switch to that target automatically.

So there is pleeeeennnnty of room for improvement and polish. The platform is a good place to manage two related targets and we should probably expand the platform to do a lot of this management.

I look forward to hearing what you come up with.

And I do believe that a BOF on future directions for LLDB would be a good idea. We could easily cover:
1 - lldb-gdbserver along with NativeProcess and NativeThread
2 - multiple target debugging
3 - MI interface
4 - ???

Greg

I think the platform should be able to provide the components you could hook up to make a "composite debug session", but then we will need some entity to manage the session once instantiated. I don't think the platform is the right guy to do this latter task. It's more of a singleton that describes the things that are available in the platform. It shouldn't do "I've initiated an RPC, and now I'm waiting for the other end to act on it, so I can stop it there" type stuff. That's what a "session manager" or something like that would be for.

Jim

Hi Colin,

Multiple target debugging is of massive interest to us at CSR. We design chips with various processor types (e.g. kalimba, XAP, 8051, ARM etc.) and on several of our chips we have multiple processors. There are lots of combinations of setups that we have either already done or are actively experimenting on. Generally, we have heterogeneous setups (e.g. XAP+8051, or 4*XAP+kalimba+8051).

I see that lldb already supports the concept of a target list, an active target and manual switching between current targets. However, as Colin alludes to, there are several features associated with multiple targets which require control from a higher level.

What we currently have in our existing debuggers is options of the form, "I'm debugging targets A and B; if A stops, do I want B to stop as well?". The answer to that question is very much specific to that user's current debug scenario. Of course, getting B to stop if A does is best implemented in the hardware, and typically a register will be available as a mechanism to configure this feature. In our (CSR's) world, probably one of the processors will have access to the associated hardware block, and our debugger will talk to this target to access the feature.
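As a rough illustration of the "if A stops, do I want B to stop as well?" option, here is a plain-Python sketch of a per-session link table. `StopLinks` and its methods are hypothetical names, and following links transitively is my assumption rather than something CSR's debuggers are stated to do; in practice the links would more likely be programmed into the hardware register mentioned above.

```python
from collections import defaultdict


class StopLinks:
    """Hypothetical table of 'if A stops, stop B as well' rules."""

    def __init__(self):
        self._links = defaultdict(set)

    def link(self, source: str, follower: str) -> None:
        # Record that `follower` should stop whenever `source` stops.
        self._links[source].add(follower)

    def targets_to_stop(self, stopped: str) -> set:
        # Follow links transitively (an assumption of this sketch), so
        # A -> B -> C stops both B and C when A stops.
        pending, result = [stopped], set()
        while pending:
            for follower in self._links[pending.pop()]:
                if follower != stopped and follower not in result:
                    result.add(follower)
                    pending.append(follower)
        return result


links = StopLinks()
links.link("xap0", "kalimba")
links.link("kalimba", "8051")
to_stop = links.targets_to_stop("xap0")
```

Because the table is just data, the user can change it per debug scenario without the debugger core knowing anything about the processors involved.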

So, of course, if a non-active target stops whilst the active one is being stepped or run, some notification needs to be passed up to inform the debug session controller, which then determines whether or not to switch the active target.

Greg and Jim both mentioned using the Platform class as the place to implement this kind of thing. However, does the Platform not deal only in homogeneous entities? Is it correct to use this concept to control different processor families? With my limited knowledge of lldb's architecture, I would have thought that the most likely candidate to control this is the Debugger object itself.

Matt

Greg and Jim both mentioned using the Platform class as the place to implement this kind of thing.

I think Jim later mentioned a higher-level concept is needed to do some of the orchestration that we’d want to enable, IIRC.

Right, it seems to me clear that you need two entities.

One knows what targets can be created in a given debugging scenario, and how to hook up to them. Then you need another to manage picking some subset of these targets, and coordinating the processes running in each of them.

The Platform seemed the logical place to do the first job. However, Matthew is right that at present the Platforms are homogeneous, and deal more with OS details. So maybe it would be better to keep the Platform more about OS details; then we could add a "device" abstraction that represents composite entities with multiple debuggable opportunities, and each of these "debuggable opportunities" would have a Platform to represent the OS-like features of this opportunity (we need some good word for this). That might be a better way to go.

Note that the "debuggable opportunities" are more general than just different devices on a board. For instance, you could imagine debugging the kernel and a user-space process running on that OS, and coordinating those just as you would a main processor and a co-processor... To make matters a little confusing, a "device" might represent all the processes running on a single OS, since that's not formally different from the more straightforward device scenario. So in some ways a platform IS also a device in this sense. Maybe the abstraction is more a target provider, and the Platform is a homogeneous target provider, in addition to its OS duties, and a device is a heterogeneous target provider?
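The "target provider" idea can be sketched as a small interface. This is plain Python with hypothetical classes; the `Platform` here is a toy stand-in and not lldb's real Platform class, and the core names are only example data.

```python
from abc import ABC, abstractmethod
from typing import List, NamedTuple


class TargetInfo(NamedTuple):
    name: str
    arch: str


class TargetProvider(ABC):
    """Hypothetical: knows what can be debugged, not how to coordinate it."""

    @abstractmethod
    def available_targets(self) -> List[TargetInfo]:
        ...

    def is_homogeneous(self) -> bool:
        # Homogeneous means every offered target shares one architecture.
        return len({t.arch for t in self.available_targets()}) <= 1


class Platform(TargetProvider):
    """Toy stand-in: all processes on one OS, hence one architecture."""

    def __init__(self, arch: str, process_names: List[str]):
        self.arch = arch
        self.process_names = process_names

    def available_targets(self) -> List[TargetInfo]:
        return [TargetInfo(name, self.arch) for name in self.process_names]


class Device(TargetProvider):
    """Toy stand-in: a board whose cores may differ in architecture."""

    def __init__(self, cores: List[TargetInfo]):
        self.cores = cores

    def available_targets(self) -> List[TargetInfo]:
        return list(self.cores)


host = Platform("x86_64", ["client", "server"])
board = Device([TargetInfo("core0", "xap"), TargetInfo("dsp0", "kalimba")])
```

Nothing in this interface says how the chosen targets are run together; that stays with the separate coordinating entity.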

But in either case, once you've chosen to attach to several debug sockets, there's the whole business that Matthew mentions of coordinating the sessions. That is clearly a whole different kettle of fish from just "what can I attach to".

BTW, this coordinating entity should not be restricted to different devices. At that level, of course, it is really about coordinating targets and their processes regardless of where they come from. For instance, you'd want to be able to use the same structure to coordinate debugging message passing or socket traffic, etc. on two user-space processes on the same or different systems. It would also be interesting to model this coordination in a way that could be extended to threads in a single process. Right now, each thread's behavior is programmed using the ThreadPlans, which work only on a per-thread basis and don't make any attempt to coordinate threads. But it would be useful (and more so when you start doing keep alive debugging) to have some way to program "when thread A does X, wait for thread B to do Y..." That isn't formally different from two processes or several co-processors. It would be interesting to see how much of the coordination we could make very general.
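A rule of the form "when A does X, wait for B to do Y" does not care whether A and B are threads, processes or co-processors, which a small plain-Python sketch can show. `WaitRule`, `Coordinator` and the event names are hypothetical and unrelated to the existing ThreadPlan machinery.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

# (actor, action); the actor may be a thread, a process or a co-processor.
Event = Tuple[str, str]


@dataclass
class WaitRule:
    trigger: Event  # "when A does X ..."
    awaited: Event  # "... wait for B to do Y"
    armed: bool = False


@dataclass
class Coordinator:
    rules: List[WaitRule] = field(default_factory=list)

    def observe(self, event: Event) -> List[str]:
        """Return the actors that should be stopped because of this event."""
        to_stop = []
        for rule in self.rules:
            if rule.armed and event == rule.awaited:
                rule.armed = False
                to_stop.append(event[0])
            elif event == rule.trigger:
                rule.armed = True
        return to_stop


coord = Coordinator([WaitRule(("client", "send_rpc"), ("server", "recv_rpc"))])
results = [
    coord.observe(("server", "recv_rpc")),  # rule not armed yet
    coord.observe(("client", "send_rpc")),  # arms the rule
    coord.observe(("server", "recv_rpc")),  # awaited event: stop the server
]
```

The first `recv_rpc` is ignored because the client has not sent anything yet; only the one that follows the trigger stops the server.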

Jim

Yes, a "device" abstraction seems to be the correct controlling entity. In fact, from an embedded debugging perspective it is _the_ logical entity which groups "debuggable opportunities" together. However, when Jim mentions "this coordinating entity should not be restricted to different devices" and alludes to control over different targets (which are in _some way_ associated in the debugging user's mind, but may be running on different machines etc.), I think that conceptually we are still talking about the same thing, but the name "device" then becomes questionable. I can only really think of something a tad woolly like "DebugScenario", "DebugSession" or "DeploymentScenario"... :(

Yes, the Platform should remain just being the Platform.

Regarding "debuggable opportunities" - surely these are just the "Target" objects that we already have? (Colin's original post does in fact just state "that can take multiple targets together and understand how they operate...".)

What's really tricky, I think, is how to make the device/scenario controlling entity look very generic on the outside, but within be able to coordinate very target-specific activities. It seems that the debuggable opportunity/target would require some way of communicating the kind of multi-target features it can support, e.g.

CanStopOthers
CanBeUnselectedAsActive
...and so on...
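One way a target might advertise such features is a simple flag set, sketched here in plain Python. The flag names follow the two examples above; `MultiTargetCaps` and `can_switch_away` are hypothetical, as is the idea that a controller would query them this way.

```python
from enum import Flag, auto


class MultiTargetCaps(Flag):
    """Hypothetical capability flags a target could report."""

    NONE = 0
    CAN_STOP_OTHERS = auto()
    CAN_BE_UNSELECTED_AS_ACTIVE = auto()


def can_switch_away(active_caps: MultiTargetCaps) -> bool:
    # The controller may only change the active target if the current
    # one allows being unselected.
    return bool(active_caps & MultiTargetCaps.CAN_BE_UNSELECTED_AS_ACTIVE)


# Example data: one core that supports both features, one that supports none.
xap_caps = (
    MultiTargetCaps.CAN_STOP_OTHERS
    | MultiTargetCaps.CAN_BE_UNSELECTED_AS_ACTIVE
)
dsp_caps = MultiTargetCaps.NONE
```

The controller stays generic because it only ever asks about flags, while each target decides which flags it can honour.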

Interested to see how things pan out.

Matt

jingham@apple.com wrote:

It might be nice to mock up just the debugger command streams we think are needed/wanted to handle several common usages of the heterogeneous processor debugging scenarios on this thread before putting any code behind it. That way we can talk through it a bit with concrete examples to further illuminate the kinds of changes/support we’ll need.

Yeah, I’ve been thinking about the different streams you’d need. A difficult one is a call stack on a platform, with frames on different ABIs/Targets representing cross-architecture calls.

If anyone here has ever used the Cell PPU/SPU gdb, which had this feature, I think you’ll agree it’s absolute gold in terms of value.
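For the cross-architecture call stack, the presentation problem is roughly "stitch two per-target backtraces at the call site". The plain-Python sketch below uses hypothetical names (`Frame`, `stitch`, the function names in the example data) and is not a description of how the Cell gdb implemented it.

```python
from typing import List, NamedTuple


class Frame(NamedTuple):
    target: str    # which target/ABI the frame belongs to
    function: str


def stitch(callee: List[Frame], caller: List[Frame], call_site: str) -> List[Frame]:
    """Join two innermost-first backtraces across a cross-target call.

    Caller frames are appended from the frame named `call_site` outward;
    anything inside that frame (e.g. wait loops) is dropped.
    """
    for index, frame in enumerate(caller):
        if frame.function == call_site:
            return list(callee) + list(caller[index:])
    # No matching call site: show the callee stack on its own.
    return list(callee)


spu = [Frame("spu", "filter_block"), Frame("spu", "spu_main")]
ppu = [
    Frame("ppu", "wait_for_mailbox"),
    Frame("ppu", "run_on_spu"),
    Frame("ppu", "main"),
]
combined = stitch(spu, ppu, "run_on_spu")
```

The combined list reads like one backtrace, with each frame still tagged by the target it came from so that the right ABI can be used to unwind it.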

Sadly I can’t attend this year’s meeting, but I’ll write a BOF proposal and send it in anyway, with one of the other Codeplayers there to host.

Colin

Yeah, I've been thinking about the different streams you'd need. A
difficult one is a call stack on a platform, with frames on different
ABIs/Targets representing cross-architecture calls.

If anyone here has ever used the Cell PPU/SPU gdb, which had this feature,
I think you'll agree it's absolute gold in terms of value.

That was one thing I was thinking of!

Sadly I can't attend this year's meeting, but I'll write a BOF proposal and
send it in anyway, with one of the other Codeplayers there to host.

Drats! Sorry we'll miss you there, Colin. I'm sure we'll appreciate
whoever does come from Codeplay!