[PATCH] Unexpected group-stop with threading under Linux

Andrew_MacPherson · January 27, 2014, 10:47am

This patch addresses a bug where in a multi-threaded program a new signal from the inferior may be received before all group-stop messages from an earlier signal have been handled. The attached .cpp file exposes the issue:

clang++ -o thread thread.cpp -lpthread -g -O0
lldb ./thread
r
hit Ctrl-C
br set -f thread.cpp -l 11
c

A breakpoint is usually hit before all the group-stop messages involved in restarting the process are handled and the internal state then gets out of sync.

The attached patch solves the issue and all unit tests pass, I’m not a ptrace expert though so there may be a more appropriate fix. The group-stop handling was taken from the existing ProcessMonitor::MonitorCallback.

Thanks,

Andrew

thread.cpp (443 Bytes)

extra-group-stop.patch (1.53 KB)

Todd_Fiala · January 27, 2014, 5:00pm

I’m giving this a shot now locally. The change looks reasonable to me - any ptrace experts disagree?

The one bit I’m not entirely sure about is: if assertions are disabled, and for some reason the WSTOPSIG(status) is not SIGSTOP (since we’re not explicitly checking before the Resume(wait_pid, SIGSTOP) call), how badly can that gunk things up? An if check before the Resume call would at least stop the wrong signal from getting sent to the inferior if I’m reading it right.

Todd_Fiala · January 27, 2014, 5:06pm

Verified this fixes it. I’ll go ahead and check it in.

Andrew_MacPherson · January 27, 2014, 5:09pm

Hey Todd, great! I think if we end up in there and it’s not a SIGSTOP then we have no idea what’s going on and it’s not clear what the correct behaviour would be. I’m happy to wrap the whole thing in an if statement checking for SIGSTOP though if you think that’s best, though it’s not clear whether the thing to do is to continue or return at that point.

Thanks!

Todd_Fiala · January 27, 2014, 5:09pm

Sending source/Plugins/Process/Linux/ProcessMonitor.cpp
Transmitting file data .
Committed revision 200226.

Todd_Fiala · January 27, 2014, 5:16pm

I think if we end up in there and it’s not a SIGSTOP then we have no idea what’s going on and it’s not clear what the correct behaviour would be.

After looking at the whole path, it really isn’t clear to me what the right thing would be to do in that case as you said, so I was fine leaving it as is. I think we’d be hosed either way at that point. I’ll be working deeper in this code soon anyway, so if other reasonable causes for the error surface, I’ll revisit the code at that time.

Topic		Replies	Views
[Bug 23574] New: lldb is unable to pass SIGSTOP to the inferior correctly LLDB	0	88	May 19, 2015
[PATCH] Fix crash on Linux if sem_wait() is interrupted LLDB	6	104	January 24, 2014
Limbo LLDB	15	76	December 20, 2012
Thread state LLDB	9	67	May 8, 2013
[Bug 43959] New: 'process signal ...' should probably NOT stop the process LLDB	0	85	November 11, 2019

[PATCH] Unexpected group-stop with threading under Linux

Related Topics