Why is storing SBTarget in a private field causing random crashes?

Hi,

I have spent a long time troubleshooting a race condition in our lldb Python test. I finally narrowed it down to one code change that causes the race: whenever I store the SBTarget in DebuggerTestCase’s self.target field, lldb randomly crashes during destruction (see below).

In the code, if I change the two “self.target” references to a local variable “target”, the random crash disappears.

I am not a Python expert. Why does holding an SBTarget cause the test to crash randomly? Do I have to set every SBXXX field to None before calling SBDebugger.Destroy()?

==========================Crash Stack==========================

Crashed Thread: 0 Dispatch queue: com.apple.main-thread

Exception Type: EXC_BAD_ACCESS (SIGSEGV)
Exception Codes: KERN_INVALID_ADDRESS at 0x0000000000000010

VM Regions Near 0x10:

__TEXT 000000010d145000-000000010d146000 [ 4K] r-x/rwx SM=COW /System/Library/Frameworks/Python.framework/Versions/2.7/Resources/Python.app/Contents/MacOS/Python

Thread 0 Crashed:: Dispatch queue: com.apple.main-thread
0 com.apple.LLDB.framework 0x00000001101c037d lldb_private::Listener::BroadcasterWillDestruct(lldb_private::Broadcaster*) + 95
1 com.apple.LLDB.framework 0x00000001101a0da2 lldb_private::Broadcaster::Clear() + 50
2 com.apple.LLDB.framework 0x00000001101a0ced lldb_private::Broadcaster::~Broadcaster() + 75
3 com.apple.LLDB.framework 0x00000001103d6879 lldb_private::Target::~Target() + 741
4 com.apple.LLDB.framework 0x00000001103d6c20 lldb_private::Target::~Target() + 14
5 libc++.1.dylib 0x00007fff896448a6 std::__1::__shared_weak_count::__release_shared() + 44
6 com.apple.LLDB.framework 0x000000010e560664 _wrap_delete_SBTarget(_object*, _object*) + 123
7 org.python.python 0x000000010d15a50a PyObject_Call + 99

==========================Code==========================

from find_lldb import lldb
from simplest_event_thread import LLDBListenerThread
import unittest
import threading

# Events set by the listener thread when the process starts running / stops.
running_signal = threading.Event()
stopped_signal = threading.Event()

def launch_debugging(debugger, stop_at_entry):
    error = lldb.SBError()
    listener = lldb.SBListener('Chrome Dev Tools Listener')
    target = debugger.GetSelectedTarget()
    process = target.Launch(listener,
                            None,           # argv
                            None,           # envp
                            None,           # stdin_path
                            None,           # stdout_path
                            None,           # stderr_path
                            None,           # working directory
                            0,              # launch flags
                            stop_at_entry,  # Stop at entry
                            error)          # error
    print('Launch result: %s' % str(error))

    event_thread = LLDBListenerThread(debugger, running_signal, stopped_signal)
    event_thread.start()

    running_signal.set()
    return event_thread

class DebuggerTestCase:

    def wait_for_process_stop(self):
        running_signal.wait()
        running_signal.clear()
        stopped_signal.wait()
        stopped_signal.clear()

    def test_breakpoint_at_line(self):
        debugger = lldb.SBDebugger.Create()
        debugger.SetAsync(True)
        executable_path = '~/Personal/compiler/CompilerConstruction/code/compiler'
        # Storing the SBTarget on self (instead of in a local) triggers the crash.
        self.target = debugger.CreateTargetWithFileAndArch(executable_path, lldb.LLDB_ARCH_DEFAULT)

        event_thread = launch_debugging(debugger, stop_at_entry=True)

        process = debugger.GetSelectedTarget().process
        self.wait_for_process_stop()  # wait for entry breakpoint.
        self.target.BreakpointCreateByName('main')
        process.Continue()
        self.wait_for_process_stop()  # wait for main breakpoint.

        event_thread.should_quit = True
        event_thread.join()
        lldb.SBDebugger.Destroy(debugger)

if __name__ == '__main__':
    test = DebuggerTestCase()
    for i in range(20):
        test.test_breakpoint_at_line()

This is a clear bug in LLDB. If you have a repro case, please file a bug and attach the instructions on how to make this happen. Our API must be able to handle things like this.

SBTarget has a shared pointer to a lldb_private::Target. If you have a reference to a target, it should keep that target alive and it shouldn't crash later when it is actually freed.

So please file a bug and we'll get it fixed. As a workaround for now, yes, setting all SB references to None should help you avoid the issue.

Greg
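
The workaround Greg describes could be sketched roughly as below. This is only an illustration, not part of the LLDB API: the helper name clear_sb_references is made up, and it assumes that every LLDB wrapper object stored on the test case belongs to a class whose name starts with "SB".

```python
def clear_sb_references(obj):
    """Set every instance attribute of obj that holds an SB* object to None.

    Hypothetical helper (not an LLDB API). Returns the sorted names of the
    attributes that were cleared.
    """
    cleared = []
    # Iterate over a copy so that reassigning attributes is safe.
    for name, value in list(vars(obj).items()):
        # Assumption: LLDB's Python wrapper classes are all named SB<Something>.
        if type(value).__name__.startswith("SB"):
            setattr(obj, name, None)
            cleared.append(name)
    return sorted(cleared)
```

In the test above, this would be called as clear_sb_references(self) just before lldb.SBDebugger.Destroy(debugger). It only covers fields stored on the object; local variables such as process still hold their own references until the method returns.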

The broadcasters and listeners depend mutually on one another. The listener keeps a list of the broadcasters it is listening to, and the broadcaster a list of the listeners it is broadcasting to. When the broadcaster goes down it removes the listeners from its list and ditto for the listeners and their broadcaster list. And these lists are protected by mutexes, so they should keep each other out of trouble. But it looks like there's some way that we can get a listener destroyed but not removed from the broadcaster's list by the point where the broadcaster goes to remove it. I've seen reports of this but not reproducible ones. If you have a case that reproduces easily, I would love to take a look at it.

Jim
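
The mutual bookkeeping Jim describes can be modelled with a small toy example. This is a simplified sketch in Python, not LLDB's actual C++ implementation: the class and method names are invented for illustration, and each object simply releases its own lock before calling into the other side.

```python
import threading

class Broadcaster(object):
    """Toy broadcaster: keeps a mutex-protected list of its listeners."""
    def __init__(self):
        self._lock = threading.Lock()
        self._listeners = []

    def add_listener(self, listener):
        with self._lock:
            self._listeners.append(listener)

    def remove_listener(self, listener):
        with self._lock:
            if listener in self._listeners:
                self._listeners.remove(listener)

    def listener_count(self):
        with self._lock:
            return len(self._listeners)

    def clear(self):
        # Going down: detach the list first, then tell each listener to forget us.
        with self._lock:
            listeners, self._listeners = self._listeners, []
        for listener in listeners:
            listener.broadcaster_will_destruct(self)

class Listener(object):
    """Toy listener: keeps a mutex-protected list of its broadcasters."""
    def __init__(self):
        self._lock = threading.Lock()
        self._broadcasters = []

    def start_listening(self, broadcaster):
        with self._lock:
            self._broadcasters.append(broadcaster)
        broadcaster.add_listener(self)

    def broadcaster_will_destruct(self, broadcaster):
        with self._lock:
            if broadcaster in self._broadcasters:
                self._broadcasters.remove(broadcaster)

    def broadcaster_count(self):
        with self._lock:
            return len(self._broadcasters)

    def clear(self):
        # Going down: detach the list first, then unregister from each broadcaster.
        with self._lock:
            broadcasters, self._broadcasters = self._broadcasters, []
        for broadcaster in broadcasters:
            broadcaster.remove_listener(self)
```

As long as whichever side goes away first calls its clear(), neither list is left holding a stale entry. The crash in the report is consistent with that invariant being broken: the broadcaster still had a listener in its list that had already been destroyed.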