Potential race conditions identified by ThreadSanitizer

Hi,

I have recently built lldb with an automated tool “ThreadSanitizer( or tsan)” designed to detect race conditions and similar concurrency problems. It turned up a number of potential issues. Some may be real, others are probably not an issue in practice. I submitted a patch which fixed a few, either using a mutex, or std::atomic< >. The process of manually analyzing each warning involves a lot of time consuming legwork. In many cases, the work to quiet the warning would add code complexity and runtime hit that may not be worth it.

I need to back-burner this for a bit and move on tow work on something different. I thought I would post some of my findings here as there may still be some real bugs in there.

  1. ==========================

Timer.cpp: g_depth : this is a class static variable used in constructor and destructor. Use of a timer object from different threads will have a race.

CommandInterpreter::HandleCommand() uses Timer scopedTimer; // called from main

Disassembler::FindPlugin() uses scoped_timer, called from event handler thread

  1. ===========================

Editline.cpp/.h : Editline::Hide(), and other functions use m_getting_line from eventhandler thread

EditLine::GetLine() called from main thread via IOHandlerEditLine::Run()

  1. ===========================

IOHandler::Activate(), Deactivate() accesses m_active, called in Debugger::ExecuteIOHandlers() in main thread,

Debugger::PopIOHandler() can also call those functions from event handler thread

  1. ===================

IOHandlerProcessSTDIO::OpenPipes() will read/write to m_pipe from main thread

IOHandlerProcessSTDIO::Cancel() will call m_pipe.Write() from event handler thread

  1. =======================

Process.h: m_private_state_thread - accessed from multiple places from main thread

Process::RunThreadPlan(),

Process::StartPrivateStateThread(),

Process::ControlPrivateStateThread()

Process::PrivateStateThreadIsValid()

and from internal state thread:

Process::RunPrivateStateThread() clears this value when exiting. I don’t think this is safe in all possible cases

  1. ===========================

There is a whole class of problems related to resources created in Process, and used by multiple threads. TSAN complains when the ~Process() destructor is called and destroys those resources. From what I can tell, the threads are typically shut down and not using those resources at that point, but if a thread were to shut itself down asynchronously, I think it is possible for the resources to get destroyed without a formal interlock on those events.

Things like:

m_public_run_lock.SetStopped();

m_private_state_control_wait.SetValue (true, eBroadcastAlways);

m_private_state_thread.Reset();

called in private state thread

In ~Process(), if m_private_state_thread null, it will not call ControlPrivateStateThread() to shut the thread down.

  1. ============================

Similar to point 5:

ProcessGDBRemote::AsyncThread(), can clear m_async_thread when shutting down, while it is access many places from the main thread.

  1. ===========================

Communication::m_read_thread_enabled is read from ReadThread(), but modified from main()

  1. ============================

ProcessGDBRemote::DoDestroy() will call a Socket::Close() from main thread.

An error condition in reading a socket in the async thread will call Disconnect(), accessing the same socket variable

Hi Shawn,

This is really great that you've committed your time to fix some these
issues. But our experience shows that without regular testing new bugs
will appear in the codebase soon.
Perhaps a good idea is to add a buildbot running lldb tests under
ThreadSanitizer, if the lldb team is interested in investing into
that.

Regarding the race reports that are "not an issue in practice", such
cases still violate the C++ Standard, so nobody knows when they're
going to break.

HTH,
Alex.

Hey Alexander,

We totally agree. We’re in the process of getting internal build bots that look for this. We’ve been working with Chandler on getting an externally-visible buildbot infrastructure up.

For the time being, there is a lot of noise in the tsan reports. Since we won’t be able to just focus on squelching all of those at once, we’re likely going to try to filter the tsan output to have it indicate new occurrences, and get to fixing the more important ones as we find them.

That’s the long-term plan, at least.

-Todd

Hey Alexander,

We totally agree. We're in the process of getting internal build bots
that look for this. We've been working with Chandler on getting an
externally-visible buildbot infrastructure up.

For the time being, there is a lot of noise in the tsan reports. Since we
won't be able to just focus on squelching all of those at once, we're
likely going to try to filter the tsan output to have it indicate new
occurrences, and get to fixing the more important ones as we find them.

You can possibly make use of ThreadSanitizer's suppressions mechanism, see

https://code.google.com/p/thread-sanitizer/wiki/Suppressions

That's the long-term plan, at least.

Thanks, Alexander.