Synchronous execution with process plugin

I ran into an issue earlier where I tried to make a .lldbinit file with some lines like this:

file a.out
run

When this happens the process runs, the breakpoint gets hit and I see the source listing, it returns to the lldb prompt, but then I can’t type anything. It appears LLDB is deadlocked inside of Target::Launch() at the following location:

    if (!synchronous_execution)
        m_process_sp->RestoreProcessEvents ();

    error = m_process_sp->PrivateResume();

    if (error.Success())
    {
        // there is a race condition where this thread will return up the call stack to the main command
        // handler and show an (lldb) prompt before HandlePrivateEvent (from PrivateStateThread) has
        // a chance to call PushProcessIOHandler()
        m_process_sp->SyncIOHandler(2000);

        if (synchronous_execution)
        {
            state = m_process_sp->WaitForProcessToStop (NULL, NULL, true, hijack_listener_sp.get(), stream);
            const bool must_be_alive = false; // eStateExited is ok, so this must be false
            if (!StateIsStoppedState(state, must_be_alive))
            {
                error.SetErrorStringWithFormat("process isn't stopped: %s", StateAsCString(state));
            }
        }
    }

Normally when I’m using LLDB and entering the commands myself, this synchronous_execution value is not set, and everything works as expected. How is this supposed to work? What does my plugin need to do differently in order to handle this case? The process has already stopped once and resumed, so I’m not sure why it would need to stop again? I see that it’s not restoring process events in the case of synchronous execution, so maybe it should have never resumed in the first place?

In synchronous execution, the "Launch" command won't return till the process has stopped. The point of synchronous execution is that you can do:

break set -n foo
run
bt

So "run" can't return till the breakpoint has been hit. That is why it waits for the process to stop. I'm not quite sure why this is done in Target::Launch; in other cases (e.g. for "step" and "continue") the command object is the one that takes care of waiting for the stop. Launch is a little funny, however, because it can't use the normal process wait mechanism to do its job since the real process isn't alive when it has to start waiting...

I think the reason you are hanging here is that the code that reads in all the init statements runs an event loop temporarily while it is reading them in, then kills that and hands off to the real command execution loop, and this continuation gets lost in the handoff. I thought Greg had already fixed that, but maybe it's still sitting in his queue.

Jim

I’m a little confused. You said that in synchronous execution, Launch won’t return until the process has stopped. That makes sense, but it already checks that the process has stopped once regardless of whether synchronous execution is set. Then, it calls PrivateResume() (even if synchronous_execution is set), and then waits for the process to stop again? What would trigger this second stop? Target::Launch already asked it to resume, so now it’s happily running while Target::Launch is waiting for it to stop a second time.

That's the stop at entry stop. The code you quoted is in a block that starts with:

        if (launch_info.GetFlags().Test(eLaunchFlagStopAtEntry) == false)
        {

So we've stopped at the entry point, but the user didn't want to know about that, so we resume and wait for a "real" stop.

Jim
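The launch sequence described above can be sketched as a toy model. This is not LLDB code: ToyProcess, LaunchModel, Outcome, and the scripted stop list are hypothetical names made up for illustration, and the model only assumes the three steps named in the thread (wait for the entry stop, resume if the user did not ask to stop at entry, then wait for a "real" stop when running synchronously).

```cpp
#include <deque>
#include <optional>

// Toy model, not LLDB code: a "process" whose future stops are scripted.
enum class Stop { Entry, Breakpoint, Exited };

struct ToyProcess {
    std::deque<Stop> scripted_stops;  // stops the process will report, in order

    // Returns the next stop, or nothing if the process would just keep running.
    std::optional<Stop> WaitForStop() {
        if (scripted_stops.empty()) return std::nullopt;
        Stop s = scripted_stops.front();
        scripted_stops.pop_front();
        return s;
    }
};

enum class Outcome {
    StoppedAtEntry,        // user asked for the entry stop and got it
    StoppedAtRealStop,     // synchronous launch saw a breakpoint, crash, or exit
    ReturnedWhileRunning,  // asynchronous launch: prompt returns, process runs
    WouldWaitForever       // synchronous launch with no further stop coming
};

// Mirrors the shape of the logic discussed in the thread (an assumption,
// not a copy of Target::Launch).
Outcome LaunchModel(ToyProcess &process, bool stop_at_entry, bool synchronous) {
    // Step 1: every launch first waits for the stop at the entry point.
    if (!process.WaitForStop()) return Outcome::WouldWaitForever;
    if (stop_at_entry) return Outcome::StoppedAtEntry;

    // Step 2: the user did not want the entry stop, so the process is resumed.
    if (!synchronous) return Outcome::ReturnedWhileRunning;

    // Step 3: synchronous mode waits again, this time for a "real" stop.
    return process.WaitForStop() ? Outcome::StoppedAtRealStop
                                 : Outcome::WouldWaitForever;
}
```

With a script of only { Stop::Entry } and synchronous set to true, the model reports WouldWaitForever, which corresponds to a synchronous "run" with no breakpoint set on a program that neither exits nor crashes. Adding Stop::Breakpoint or Stop::Exited to the script turns that into StoppedAtRealStop.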

If that’s the case, then a .lldbinit file like this:

file a.out
run

will deadlock the debugger, because the real stop never comes?

I'm not sure it is deadlocking the debugger. lldb is just waiting for a stop. For instance ^C should interrupt it, or sending a signal externally to the process, or triggering a breakpoint or crash, etc.

Actually, Greg must have fixed the bug I was remembering, because this works correctly for me with TOT lldb.

What happens for you if your .lldbinit has:

file a.out
break set -n main
run

For me this stops at the breakpoint at main. We still have a little clean up to do here, because I don't see the stop notification in this case. I see:

> lldb -S cmds.lldb
(lldb) command source -s 1 'cmds.lldb'
4 locations added to breakpoint 1
(lldb)

but if I then do "bt" I'm sitting at main:

(lldb) bt
* thread #1: tid = 0x1bf8a4, function: main , stop reason = breakpoint 1.1
  * frame #0: 0x0000000100018e87 Sketch`main at SKTMain.m:17
    frame #1: 0x00007fff94f445ad libdyld.dylib`start
    frame #2: 0x00007fff94f445ad libdyld.dylib`start

Not sure what's up with the stutter at start either. But that's a different rabbit to chase...

Jim

If I do that my process does stop at main, but I think I'm hit with a race condition. 3 times out of 5 I saw the same results as you, where it doesn't print the backtrace but if I do bt it works fine. The other 2 times out of 5 it does print the backtrace, but then I'm deadlocked. I added a bunch of tracepoints so we can see the order of events. Here's what it looks like when it deadlocks.

Target::Launch about to call ProcessWindows::Launch
ProcessWindows::DoLaunch entering
ProcessWindows::DoLaunch about to call DebugLaunch
ProcessWindows::DoLaunch succeeded, waiting for initial stop
Application "\??\D:\src\llvm\tools\lldb\test\expression_command\formatters\a.out" found in cache
ProcessWindows::OnDebuggerConnected
ProcessWindows::OnDebugException
ProcessWindows got EXCEPTION_BREAKPOINT
ProcessWindows got initial stop, setting initial stop event
ProcessWindows::DoLaunch received initial stop, returning
ProcessWindows::DoLaunch finished launching process, exiting
Target::Launch returned from ProcessWindows::DoLaunch, eLaunchFlagStopAtEntry == false, about to call WaitForProcessToStop
Target::Launch, WaitForProcessToStop returned, new state = eStateStopped
Target::Launch about to call ProcessWindows::PrivateResume
ProcessWindows::DoResume resuming from active exception
ProcessWindows::DoResume couldn't find an active exception, setting state to eStateRunning
ProcessWindows::OnDebugException
ProcessWindows got EXCEPTION_BREAKPOINT
Target::Launch synchronous_execution is true, calling WaitForProcessToStop again

The interesting stuff is the last 2 lines. It hits the breakpoint in main *before* calling WaitForProcessToStop, so WaitForProcessToStop never sees a "change".

Here's what it looks like when it works.

Target::Launch about to call ProcessWindows::Launch
ProcessWindows::DoLaunch entering
ProcessWindows::DoLaunch about to call DebugLaunch
ProcessWindows::DoLaunch succeeded, waiting for initial stop
Application "\??\D:\src\llvm\tools\lldb\test\expression_command\formatters\a.out" found in cache
ProcessWindows::OnDebuggerConnected
ProcessWindows::OnDebugException
ProcessWindows got EXCEPTION_BREAKPOINT
ProcessWindows got initial stop, setting initial stop event
ProcessWindows::DoLaunch received initial stop, returning
ProcessWindows::DoLaunch finished launching process, exiting
ProcessWindows breakpoint handler set private state to eStateStopped
Target::Launch returned from ProcessWindows::DoLaunch, eLaunchFlagStopAtEntry == false, about to call WaitForProcessToStop
Target::Launch, WaitForProcessToStop returned, new state = eStateStopped
Target::Launch about to call ProcessWindows::PrivateResume
ProcessWindows::DoResume resuming from active exception
ProcessWindows::DoResume couldn't find an active exception, nothing to resume
ProcessWindows::OnDebugException
ProcessWindows got EXCEPTION_BREAKPOINT
Target::Launch synchronous_execution is true, calling WaitForProcessToStop again
ProcessWindows breakpoint handler set private state to eStateStopped
Target::Launch WaitForProcessToStop returned, yay!

The difference here is that "calling WaitForProcessToStop again" happens before the process plugin sets the private state. Can you try repeating your test multiple times and seeing if you can get the deadlock to occur?

I can't get this to fail on OS X but with timing related things I'm not sure that's all that significant...

Anyway, what you are describing doesn't make sense to me yet. After calling PrivateResume, we call WaitForProcessToStop, passing "wait_always" set to true. When wait_always is true, WaitForProcessToStop won't return till it receives a stop event. It isn't comparing current state against desired state, it is actually waiting for an event. So it should not matter when the breakpoint hit arrives relative to calling "WaitForProcessToStop" since the breakpoint hit is just going to generate a stop event, which will sit in the event queue till somebody fetches it. It's fine for that to happen before you call WaitForProcessToStop, because we aren't comparing current to desired state, we are waiting for that event.

Maybe somebody else is grabbing the breakpoint stop event and taking it off of the queue? That would explain why the second WaitForProcessToStop - the one with wait_always set to true - is missing the event?

Jim
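The two points in this message (a queued event survives until someone fetches it, and a second consumer can take it first) can be illustrated with a small stand-alone queue. ToyEventQueue and both helper functions are hypothetical; this is a sketch of the general queueing behavior, not of lldb's actual Listener class, and the short timeout stands in for what would be an indefinite wait.

```cpp
#include <chrono>
#include <condition_variable>
#include <deque>
#include <mutex>
#include <optional>
#include <string>
#include <utility>

// Toy stand-in for a listener's event queue (hypothetical, not lldb's Listener).
class ToyEventQueue {
public:
    void Post(std::string event) {
        {
            std::lock_guard<std::mutex> lock(m_mutex);
            m_events.push_back(std::move(event));
        }
        m_cv.notify_one();
    }

    // Blocks until an event is available or the timeout expires.
    std::optional<std::string> WaitForEvent(std::chrono::milliseconds timeout) {
        std::unique_lock<std::mutex> lock(m_mutex);
        if (!m_cv.wait_for(lock, timeout, [this] { return !m_events.empty(); }))
            return std::nullopt;  // timed out with an empty queue
        std::string event = std::move(m_events.front());
        m_events.pop_front();
        return event;
    }

private:
    std::mutex m_mutex;
    std::condition_variable m_cv;
    std::deque<std::string> m_events;
};

// Case 1: the stop is posted before anyone waits. The waiter still receives
// it, because the event sits in the queue until it is fetched.
inline bool EarlyEventIsStillDelivered() {
    ToyEventQueue queue;
    queue.Post("stopped");
    return queue.WaitForEvent(std::chrono::milliseconds(50)).has_value();
}

// Case 2: some other consumer of the same queue fetches the stop first.
// The second waiter then finds nothing and times out.
inline bool StolenEventIsMissed() {
    ToyEventQueue queue;
    queue.Post("stopped");
    auto taken = queue.WaitForEvent(std::chrono::milliseconds(50));   // other consumer
    auto missed = queue.WaitForEvent(std::chrono::milliseconds(50));  // our waiter
    return taken.has_value() && !missed.has_value();
}
```

Both helpers return true. In the second case a waiter without a timeout would block indefinitely, which matches the symptom reported earlier in the thread if something else is indeed draining the queue.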

Hmm, on Monday I’ll try to see if I can figure out where to put a breakpoint to get hit when someone removes something from the queue. Thanks.

This looks very much like the issue I was experiencing on Linux, where
lldb would just lock up if I executed the process in synchronous mode.
This was fixed with <http://reviews.llvm.org/D8079>, and I suspect you
need to do something similar. The trick here is that you need to set
up process event hijacking in your host launch code. Otherwise,
Target::Launch will attempt to use the global process listener to wait
for the process to stop. However, this will race with the event
handler thread, since it uses the same listener to process events. So,
depending on which thread processes the event, it will either work
fine, or you will end up locked in WaitForProcessToStop forever,
because the event will never come.

Also, note that there is another issue with launching in synchronous
mode, where stdio is not forwarded to the target process
<http://lists.cs.uiuc.edu/pipermail/lldb-dev/2015-March/006853.html>.
I was planning on dealing with this, but I don't think I will get
around to it for a while yet...

pl
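The idea behind event hijacking can be sketched with a toy broadcaster that routes events to the most recently installed hijacking listener instead of the shared one. ToyListener, ToyBroadcaster, and their methods are hypothetical names for illustration; this is not the code from the review linked above, only a minimal model of the routing it relies on.

```cpp
#include <deque>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Toy listener: just a queue of event names (hypothetical, not lldb's Listener).
struct ToyListener {
    std::deque<std::string> events;
};

// Toy broadcaster: delivers each event to the newest hijacker if there is one,
// otherwise to the shared primary listener.
class ToyBroadcaster {
public:
    explicit ToyBroadcaster(std::shared_ptr<ToyListener> primary)
        : m_primary(std::move(primary)) {}

    void Hijack(std::shared_ptr<ToyListener> listener) {
        m_hijackers.push_back(std::move(listener));
    }

    void RestoreEvents() {
        if (!m_hijackers.empty()) m_hijackers.pop_back();
    }

    void Broadcast(const std::string &event) {
        ToyListener &target =
            m_hijackers.empty() ? *m_primary : *m_hijackers.back();
        target.events.push_back(event);
    }

private:
    std::shared_ptr<ToyListener> m_primary;
    std::vector<std::shared_ptr<ToyListener>> m_hijackers;
};

// While the launcher's private listener is installed, the stop event cannot
// be consumed by whoever reads the shared listener. After restoring, events
// flow to the shared listener again.
inline bool HijackedStopReachesLauncher() {
    auto shared = std::make_shared<ToyListener>();    // read by the event handler
    auto launcher = std::make_shared<ToyListener>();  // private to the launch code
    ToyBroadcaster process(shared);

    process.Hijack(launcher);
    process.Broadcast("stopped");  // lands on the launcher's queue only
    process.RestoreEvents();
    process.Broadcast("exited");   // lands on the shared queue

    return launcher->events.size() == 1 && launcher->events.front() == "stopped" &&
           shared->events.size() == 1 && shared->events.front() == "exited";
}
```

Because the launching code waits on a queue that no other thread reads, it no longer matters which thread happens to run first, which is the property the race described above is missing.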

We've seen the same behavior when running an lldb command file at startup.

Foo contains:
b main
run

lldb -s foo test.exe
(this hangs)

Take out the run from foo, and
lldb -s foo -o run test.exe
(this works)

Hello,

Take a look at this: http://reviews.llvm.org/D8541
Perhaps it will fix your test cases too.

Thanks,
Ilia