Thread Safety Analysis: negative capabilities and visibility

TL;DR: ongoing discussion about thread safety analysis and deadlock
prevention. Posted to the list for the benefit of interested readers.

The new negative capabilities patch (r214789) solves a long-standing
problem whereby LOCKS_EXCLUDED was not transitive. For example,
assume you have the following:

Mutex mu;
void a() { mu.lock(); b(); mu.unlock(); }
void b() { c(); }
void c() LOCKS_EXCLUDED(mu);

There is no warning in this code, because a() does not call c()
directly, and even though b() calls a function that excludes mu, it
does not have to exclude mu itself (although it should). Negative
capabilities solve this problem by propagating a negative requirement
(a requirement that mu is not held) in the same way as an ordinary
requirement, so if we change c() to:

void c() REQUIRES(!mu);

Then b() must also declare REQUIRES(!mu) to avoid a warning, and then
we get a warning in a(), which is what we want, because a() calls b()
with mu held.

However, there is no way to acquire a negative capability. In other
words, you cannot prove to the analysis that mu is *not* held, except
by putting a REQUIRES(!mu) on the calling function. Thus, an
attribute like REQUIRES(!mu) will simply propagate outwards, up the
call graph. At some point, we want to stop the propagation; otherwise
every function will have to exclude all of the mutexes that might be
acquired by anything it calls, which is not feasible.

Right now, I stop the propagation at the boundary of the enclosing
class. Thus, every class method must exclude mutexes that are local
to the class, but a method does not have to exclude mutexes that are
declared in other classes. This is overly conservative because it
does not consider (1) public or protected mutexes inherited from base
classes (2) mutexes in friend classes, or (3) global mutexes.

Aaron has written a patch that uses the normal C++ rules for
visibility -- e.g. a function must exclude any mutex that is publicly
visible from the current scope. At first, that made sense to me, but
now I'm afraid it might extend visibility too far, because public and
global mutexes would still propagate everywhere.

Global mutexes in particular are a problem, because C++ does not have
any way to restrict the visibility of a global variable to a single
"module." Global mutexes are frequently declared within a header file,
and then used within multiple translation units. Moreover, deadlock
prevention in such cases is important. Because visibility of the
mutex extends so far, it is easy to acquire the same mutex twice,
which is what the negative capability is supposed to prevent. On the
other hand, we don't want the mutex to be visible outside of the scope
in which it is used, even if the "scope" is poorly-defined.

The same holds true for a protected mutex in a base class that is
inherited by many derived classes. Should the mutex be regarded as
visible, or not?

I'm looking for any ideas. Should I add a new attribute to control
visibility of mutexes?

  -DeLesley

Once there's a requirement that mu not be held, anywhere mu is visible
would need to be analyzed, which could require cross-TU analysis and
be a rather hard problem to solve with our current architecture. So I
agree that propagation is problematic here.

I think one thing we could do is when seeing REQUIRES(!mu), we test
mu's visibility and if it can escape the TU, diagnose on that usage.
Eg) static file-scope mu we can possibly track. class private mu we
can track. protected, public, global mu... I don't see that being
feasible to track without false positives, so we should tell the user
"sorry, we can't support this requirement fully, but we'll try our
best." That at least sets the user's expectations, though it doesn't
help them too much.

I'm not certain how a new visibility attribute would help; could you
describe what you were thinking of a bit more?

Thanks!

~Aaron

It doesn't require cross-TU analysis, because we're not inferring
anything. Instead, we're using explicit attributes. However, that
means that the programmer must put an explicit REQUIRES(!mu) on every
function where mu is visible, so restricting visibility is important.

Easy solutions are:

(*) For class members, we can restrict visiblity to the current class.
(*) For global mutexes, which are declared only in a .cpp file, we can
restrict visibility to that file.

For all other mutexes, we suppress warnings. However, that leads to
false negatives, in the case where a mutex's visiblity should
legitimately cross class or TU boundaries.

My idea for an attribute would be declare, on a per-mutex basis, what
the visibility of that mutex should be. E.g. "class-local" (the
default), "name-space local", "TU-local", "module-local" etc. Given
the current lack of a good module structure for C++, though, I'm not
sure how effective that would be, or whether it might merely
complicate an already bad situation.

  -DeLesley

It doesn't require cross-TU analysis, because we're not inferring
anything.

Yes, we're not inferring anything, but we're also unable to check
situations that would matter. Eg) extern function declared in a.cpp
with a mismatched annotation defined in b.cpp. But I suppose that's no
different than where we're at without negative capabilities either.

Instead, we're using explicit attributes. However, that
means that the programmer must put an explicit REQUIRES(!mu) on every
function where mu is visible, so restricting visibility is important.

Agreed.

Easy solutions are:

(*) For class members, we can restrict visiblity to the current class.

Public mutexes are hopefully not something you find in the wild, but a
protected mutex may be more prevalent. Could we diagnose anything in
these cases (on the mutex itself), suggesting to restrict access
through annotated functions?

(*) For global mutexes, which are declared only in a .cpp file, we can
restrict visibility to that file.

I wonder how often you'd see extern'ed mutexes in the real world? But
if it's declarated static or is in a nonglobal namespace, this would
certainly work.

For all other mutexes, we suppress warnings. However, that leads to
false negatives, in the case where a mutex's visiblity should
legitimately cross class or TU boundaries.

My idea for an attribute would be declare, on a per-mutex basis, what
the visibility of that mutex should be. E.g. "class-local" (the
default), "name-space local", "TU-local", "module-local" etc. Given
the current lack of a good module structure for C++, though, I'm not
sure how effective that would be, or whether it might merely
complicate an already bad situation.

I agree that the effectiveness is questionable (whether we'd get
enough benefit to justify the annotation itself).

~Aaron

Public mutexes are hopefully not something you find in the wild,

Every variety of mutex can be found in the wild. :slight_smile:

I wonder how often you'd see extern'ed mutexes in the real world?

Frequently. One common case is to have a small library that uses a
single global mutex to protect library operations. The mutex is
declared in a .h file, and then used in several .cpp files. Since
many libraries do not have separate internal vs. external headers, the
mutex ends up being visible outside of its intended translation units.
This could be fixed by moving the mutex declaration out of the header
file, but that could require a lot of refactoring.

What I'm thinking of doing right now is to write a whole-program
inference pass for the negative capabilities, and run that on our code
base. The inference results would give me better statistical data on
what common usage patterns actually are. :slight_smile:

  -DeLesley