[RFC] Adding Thread Role Analysis

As Chandler mentioned, clang already has an existing thread safety analysis, which is based on type systems for static race detection, which I believe you are familiar with:

http://users.soe.ucsc.edu/~cormac/papers/pldi00.pdf

http://users.soe.ucsc.edu/~cormac/papers/toplas06-safe-locking.pdf

The clang annotations can be found at: http://clang.llvm.org/docs/LanguageExtensions.html

In a nutshell, the existing analysis allows a data member to be annotated as being guarded by a particular lock. Similarly, a function can be annotated as requiring a particular lock to be held. The analysis then tracks the set of locks that are known to be held at every program point, and issues warnings when a data member is accessed or a function is called without holding the appropriate lock.

The existing analysis tracks locks, not threads or roles, so concepts like the “GUI thread” cannot be directly expressed. However, it’s not hard to mimic this functionality – the GUI thread acquires a lock named “GUI_lock” when it starts, and releases it when its done, and all code and data that should be GUI-only is annotated as requiring that lock. (A lock in clang is simply a named C++ object; it does not need to implement an actual lock.) Switching of roles can similarly be modeled as releasing one lock and acquiring another. So at first glance, it seems to me that lock-based analysis is roughly similar to the analysis that you propose, and may in fact be significantly stronger; the “related work” section in your paper does not discuss the differences in much detail.

The existing clang annotations are used extensively on production code; we have many hundreds of thousands of lines that are currently annotated and checked. It is not a research project.

In order to move forward, I would like to see a detailed explanation of the differences between your proposal and what is currently implemented, and why you feel the current annotations are insufficient. Ideally, I would prefer to extend the existing annotations and/or analysis to handle any deficiencies, rather than invent a wholly new and incompatible analysis.

-DeLesley

As Chandler mentioned, clang already has an existing thread safety analysis,
which is based on type systems for static race detection, which I believe
you are familiar with:

http://users.soe.ucsc.edu/~cormac/papers/pldi00.pdf
http://users.soe.ucsc.edu/~cormac/papers/toplas06-safe-locking.pdf

The clang annotations can be found at:
http://clang.llvm.org/docs/LanguageExtensions.html

In a nutshell, the existing analysis allows a data member to be annotated as
being guarded by a particular lock. Similarly, a function can be annotated
as requiring a particular lock to be held. The analysis then tracks the set
of locks that are known to be held at every program point, and issues
warnings when a data member is accessed or a function is called without
holding the appropriate lock.

We are familiar with these techniques, which are fundamentally similar to the
lock analysis developed by our Fluid-group colleague Aaron Greenhouse (see the
PPoPP paper's bibliography for references). That said, we feel that TRA has
additional benefits over the existing functionality. The key difference is
this: the Clang annotations, along with the vast majority of other work on
static analysis for thread safety, are focused on lock-based concurrency.
Thread role analysis is focused on NON-lock concurrency, which is a rather
different thing. However, we realize that there can be overlap between the two
techniques, for example with regards to enforcing reader-writer behavior.

Thread role analysis doesn't concern itself with locks, which provides
benefits over the various lock-based analyses which don't do anything
when there's no locking involved. The issues addressed are rather
different. And, it turns out, many widely-used frameworks impose
thread usage policies on their clients.

The easiest example to explain involves the typical threading model
used by most modern GUIs (including at least AWT/Swing and SWT in the
Java world, the X windowing system, and many others). Specifically:

* There is at most one GUI event thread at a time. In some
implementations the identity of this thread is fixed. In others, the
identity of the GUI event thread may change from time to time.

* The GUI does not use locks to protect its data structures. (Note:
this is from the *client* perspective. There may be interesting
locking *inside* the GUI implementation, but this is typically hidden
from clients.) This choice was originally made for performance
reasons; it has persisted due to the clash between the need for
lock-ordering (to avoid deadlock) and the fact that events flow
bi-directionally between the GUI(and the user) on one side, and the
program and system on the other. This makes it extremely challenging
(and perhaps impossible) to create a single consistent lock-ordering.

* The GUI must avoid race conditions, even though its data is
unprotected by locks. The typical solution is thread confinement, to
wit: *only* the GUI event thread may touch the GUI's data structures.
That's a thread usage policy, but not one that's easily expressable
using lock-based annotations. The statements below are consequences
of that policy.

* Code executing on threads other than the GUI event thread must never
invoke GUI methods (actually, most GUI methods; see below). Getting
this wrong breaks the necessary thread confinement. Note that this is
a transitive property which must be true for all code executing on
non-GUI threads.

* GUI callback methods are invoked by the GUI on the GUI event thread,
even though it is client-written code. Consequences include:
** Code invoked from GUI callbacks must transitively invoke only
methods that can legitimately execute on the GUI event thread.
** Data structures that are part of the client's UI implementation
inherit the thread-confinement policy from the GUI.
** GUI callback methods should never be invoked from any thread
*other* than the GUI event thread.
** All of these sub-items must be transitively true for all code
reachable from GUI callback methods.

* Client code should avoid blocking the GUI event thread, because that
hangs the UI. Implications of this include:
** Don't risk waiting on locks that may be held by another
thread. File and network I/O are probably a bad idea. Etc.
** Don't kick off lengthy computations while executing on the event
thread.
** All of these require the ability to "know" which code may be
executed by the event thread.
** Once again, all of these sub-items must be transitively true for
all code reachable from GUI callback methods.

* There are a few GUI functions that may be legitimately invoked from
any thread. This is typically a short list, at least when compared
with the total set of GUI framework methods. Typical examples include
adding and removing event listeners, requesting that some action be
taken on the GUI event thread (Swing calls this method
"invokeLater(...)", other GUIs have other names for it), etc.

All of these properties are entirely policy-based. The typical
lock-based concurrency analysis systems lack the concepts needed to
express or enforce them. The concepts of thread roles and constraints
regarding which roles may execute which code (or touch which specific
data items, if data analysis is supported) enable simple expression
and enforcement of policies like these.

The existing analysis tracks locks, not threads or roles, so
concepts like the "GUI thread" cannot be directly
expressed. However, it's not hard to mimic this functionality -- the
GUI thread acquires a lock named "GUI_lock" when it starts, and
releases it when its done, and all code and data that should be
GUI-only is annotated as requiring that lock. (A lock in clang is
simply a named C++ object; it does not need to implement an actual
lock.) Switching of roles can similarly be modeled as releasing one
lock and acquiring another. So at first glance, it seems to me that
lock-based analysis is roughly similar to the analysis that you
propose, and may in fact be significantly stronger; the "related
work" section in your paper does not discuss the differences in much
detail.

It may be possible to squeeze TRA into the current thread safety
analysis by using "notional" locks in place of thread roles as you
suggest. In what follows, a lock name (e.g., "A") means that the lock
is required; the negation of a lock name (e.g., "!A") means that it is
excluded. The existing Clang annotations support requiring a set of
locks (e.g., (A & B & C) ) and also forbidding a set of locks
(e.g.,(!B & !C)).

However, when you say:

and all code and data that should be GUI-only is annotated as
requiring that lock.

If this statement requires user-written annotations for all such
methods, the required annotation effort will make adoption exceedingly
difficult for most practicing programmers. One of the benefits of TRA
is the application of well-founded techniques to avoid the need for
most user-written annotations. Perhaps some of these techniques could
be brought to bear on the existing thread safety analysis.

TRA offers some additional capabilities:

* We support more complex boolean expressions over thread roles. The
existing Clang annotations cannot express multiple combinations of
such properties. For example, ((A & B & !C) | (D & !E)). Before you
ask, although such combinations would be nonsensical for locks, they
really do show up for thread roles in production code.

* TRA allows statements that certain thread roles are mutually
exclusive (a GUI thread can't be a Compute thread, and vice
versa). The Clang annotations lack support for this property.

* TRA (for code) plus a sufficient effects and/or alias analysis (for
data) can combine to support discovery of data that *should* be locked
(but currently is not), as well as demonstrating that notionally
thread-confined data is, in fact, thread confined. If your existing
effects or alias analysis is strong enough, that's a really exciting
capability. By comparison, the existing thread safety analysis
*starts* when you decide that you need to protect some data with a
lock.

* TRA is strongly focused on reducing the user effort required to get
results. Thus, multiple tricks to reduce the number of annotations to
be written (see the paper). For example, our inference-based cross-TU
analysis avoids the need to annotate each method in a chain of calls
with the equivalent of a shared_locks_required() annotation. See also
the discussion in the paper of analysis of partially-annotated code,
"scoped promises" (NOT related to your scoped_lockable annotation),
and so on.

Our experience with extensive field trials showed that pragmatic
adoptability requires all of:
* Sound results. The existing thread safety analysis is certainly
capably of providing this.
* A very smooth incremental adoption path. I believe that the existing
thread safety analysis provides this, for the properties it
addresses. At a minimum, you can work through an existing code-base
one lock at a time.

* Very low programmer effort required to produce useful results. My
past experience with lock-based analyses suggests that this is mixed.
The purely lock-based part is mostly there. O-O effects and/or alias
analysis, however, mostly fails this test. Particular challenges
arise due to the occasional need to prove properties like exclusive
ownership of heap data (for some cases), or more-precise effects than
"touches the heap" (for other cases).

An important issue to be aware of is that the portion of a program
that needs to know about any particular lock (and the data item
protected by it) is typically a very small fraction of the total
program -- perhaps only a handful of methods. Writing an annotation
for each method in that handful is rarely a substantial
burden. However, interesting programs may contain large numbers of
locks.

Thread roles have very different properties. First, most programs --
even extremely large programs -- tend to involve only a handful of
interesting thread roles, regardless of the number of actual threads
at runtime. The multi-million line programs we analyzed had fewer
than a dozen thread roles, but had thousands of threads. During our
field testing, the largest number of distinct thread roles we
encountered in any one program was under 30.

Secondly, interesting thread roles tend to involve very large swathes
of code. For example, the GUI model described above essentially
partitions the entire program into two bins: GUI code and not-GUI
code. It should be obvious that although:

and all code and data that should be GUI-only is annotated as
requiring that lock

is exactly what we want conceptually, asking programmers to annotate
every method in the entire program is a guaranteed non-starter due to
the amount of effort required. Hence our focus on extremely
light-weight annotations and extremely light-weight analysis (our goal
was less than double the compilation time of whatever we were
analyzing). I believe that this tendency to be relevant to very large
fractions of an application is very different from the typical
use-case for the thread safety analysis, at least in terms of the
annotations to be written (or inferred).

It may, in fact, be possible to combine the capabilities of TRA and
the current thread safety checking to produce something very
interesting. Let's talk about it! But we believe that TRA offers
sufficient benefits to warrant its inclusion as a feature in addition
to the existing lock-based analysis. They're similiar in that they
both deal with thread safety, but solve the problem in different ways.

The existing clang annotations are used extensively on production
code; we have many hundreds of thousands of lines that are currently
annotated and checked. It is not a research project.

While the previous Java system described in the paper was a research
project, it has progressed far past that point. It (along with
the other static analyses produced by members of the Fluid research
group) was used sucessfully on millions of lines of production Java
code (warts and all). The Fluid group's analyses (including TRA) were
subsequently tech-transferred out into industry. In any case, the
Java side is definitely not research at this point.

We're proposing to bring the same concepts over to the C-based family
of languages because the problems TRA solves are mostly
language-agnostic. The extent of the research involved at this point
is in finding the most obvious ways to express the capabilities; the
underlying concept has already been proven useful and effective.

Thank you for the long explanation. However, based on what you’ve said, I do not think that your approach warrants a completely separate analysis, along with a completely separate set of annotations.

In particular, I believe that “lock” and “role” are essentially the same thing. In both cases, the analysis is marking functions and data structures as being “protected” by a particular lock/role, and ensuring that a thread must hold/have the lock/role before calling protected functions, or accessing protected data. Most of the analysis logic (tracking lock/role sets, dealing with pointer aliasing, etc.) will thus be exactly the same. It makes no sense to duplicate both annotations and logic.

In particular, I foresee a time when someone might wish to use both locks and roles in the same program. In that case, it would be much better to have a single analysis that understood both.

That said, you did point out some useful missing features, which are:

We support more complex boolean expressions over thread roles. The
existing Clang annotations cannot express multiple combinations of
such properties. For example, ((A & B & !C) | (D & !E)).

As you mention, the existing analysis does not support this case. However, this functionality would not be hard to add – it’s simply a boolean constraint on allowable lock sets. If you feel it’s valuable, then let’s do that!

TRA (for code) plus a sufficient effects and/or alias analysis (for
data) can combine to support discovery of data that should be locked
(but currently is not). … Thus, multiple tricks to reduce the number of
annotations to be written (see the paper). …

I would classify this under the heading of “annotation inference”. It would, indeed, be extremely useful to have a reliable way to infer the appropriate thread annotations. There is an existing body of literature on thread inference for lock annotations, but it is currently completely unimplemented in clang; that would be a useful contribution.

My general feeling is that role inference and lock inference will end up being very similar algorithms. As you point out, however, there is a difference in scale – there are at most a handful of roles in a program, while there may be many locks. Thus, I am willing to be convinced that your inference algorithm for roles is faster and/or more complete than the corresponding version for locks, which would be reason for a separate implementation.

The thing to be careful of here is that we will need to separate inference from checking, because there are three separate use cases:

  • Case 1: all annotations in source files. (current system)
  • Case 2: inference tool will add annotations to source files.
  • Case 3: inference tool will write annotations to sidecar files, to be used by separate analysis tool.

I will freely admit that I've not read all of this thread (it is *very*
long, where possible brevity would be helpful to get a wider audience), I
wanted to expand on the first email I sent to Aaron in response to this
comment.

I think it is fundamentally important to approach contributing this type of
work to clang as *incremental* improvements to the existing functionality.
It would be very disruptive to the project to have a second analysis system
surrounding thread safety. I'm not saying that the existing functionality
is perfect or must be exactly preserved, I'm just saying that you should
propose new functionality via an incremental path from where we are today.

This may include refactoring what we have today to make it more generic and
usable in both the static analyzer and the analysis based warnings, it may
include finding ways to incrementally add the necessary complexity and
features to the existing logic, and it may even include concerns over
deployment and how users can start benefitting from it.

I think these issues will (to a certain extent) be just as important as the
concerns of basing this on sound academic research, and having a good
theoretical model behind the analysis and diagnostics produced. As an
example, I think one thing that is actively hurting the community in
understanding your proposal is trying to first get an existing community to
shift terminology to that of specific research papers, and then describing
what you want in those terms. Maybe it would be possible to instead use the
existing terminology, or where it isn't good terminology correct the
terminology of the existing system before trying to describe new things in
terms of it?

<Continued, after accidental send>

Having the annotations in source files increases the annotation burden, which makes it hard to apply the analysis to existing code. On the other hand, many of our users rely on explicit annotations as a form of machine-checked documentation, making the thread policy explicit in the code. Case 2 is thus a good middle ground between the current system and what you have proposed. Moreover, once code has been annotated, further checking can be done as part of the normal compile cycle, without having to run a separate tool that does whole-program analysis. I think all three use cases should be supported.

-DeLesley

Annotations can also break portability for those who use multiple
compilers. For example, MSVC cannot understand GCC extensions; and GCC
cannot understand Microsoft's SAL annotations
(http://msdn.microsoft.com/en-us/library/ms235402(v=vs.80).aspx).

Portability is important because it helps us write correct code in
practice. If a program is accepted by Clang, GCC, ICC/ICPC and MSVC,
then its usually correct and not dependent on compiler personalities.

If annotations are required, please provide an option to read them
from an external file.

Jeff

In practice, the annotations are always wrapped in macros, for exactly this reason. However, I agree that having support for separate files would be a useful feature.

In particular, I believe that "lock" and "role" are essentially the same
thing. In both cases, the analysis is marking functions and data structures
as being "protected" by a particular lock/role, and ensuring that a thread
must hold/have the lock/role before calling protected functions, or
accessing protected data.

Dean, perhaps you could expand on how your "roles" fundamentally
differ from "locks"; e.g. how is the statement "Thread A has the GUI
role" fundamentally different from "Thread A notionally holds the GUI
write-lock"; how is "Thread B has the Compute role" fundamentally
different from "Thread B notionally holds a Compute read-lock"?

That said, you did point out some useful missing features, which are:

We support more complex boolean expressions over thread roles. The
existing Clang annotations cannot express multiple combinations of
such properties. For example, ((A & B & !C) | (D & !E)).

As you mention, the existing analysis does not support this case. However,
this functionality would not be hard to add -- it's simply a boolean
constraint on allowable lock sets. If you feel it's valuable, then let's do
that!

For reference, Dean gave an example of an (!A | !B) constraint that
seems very valuable:

Dean Sutherland wrote:

* TRA allows statements that certain thread roles are mutually
exclusive (a GUI thread can't be a Compute thread, and vice
versa). The Clang annotations lack support for this property.

Delesley, does Clang's current lock-based analysis really have no
support for the annotation "No thread can hold both the GUI lock and
the Compute lock at the same time"? You could sort of simulate this by
annotating the invariants "Whenever you call the function lock_GUI(),
you must not hold the Compute lock" and "Whenever you call the
function lock_Compute(), you must not hold the GUI lock", but that's
quite ugly, and also supposes that functions analogous to lock_GUI()
and lock_Compute() exist for every role, which is probably false.

Dean Sutherland wrote:

If this statement requires user-written annotations for all such
methods, the required annotation effort will make adoption exceedingly
difficult for most practicing programmers. One of the benefits of TRA
is the application of well-founded techniques to avoid the need for
most user-written annotations. Perhaps some of these techniques could
be brought to bear on the existing thread safety analysis.

Dean, how does this work? Don't you need annotations somewhere to tell
the analyzer what counts as a "GUI method", for example? You're saying
that Clang's current analyzer would require a ton of annotation to
mark all the GUI methods, which IMHO is true; but how would your
solution avoid that burden?

my $.02,
–Arthur

Delesley, does Clang’s current lock-based analysis really have no
support for the annotation “No thread can hold both the GUI lock and
the Compute lock at the same time”? You could sort of simulate this by

You might be able to hack it, but it would be ugly. For simple cases like this,
though, we could even reuse the existing attributes, e.g. by allowing
LOCKS_EXCLUDED on locks as well as functions:

Mutex gui_lock LOCKS_EXCLUDED(compute_lock);
Mutex compute_lock LOCKS_EXCLUDED(gui_lock);

I haven’t implemented the checks to enforce this, but it would be pretty trivial to add.
More complicated boolean expressions would be hard to express with the existing
attributes, though.

-DeLesley

In particular, I believe that "lock" and "role" are essentially the same
thing. In both cases, the analysis is marking functions and data structures
as being "protected" by a particular lock/role, and ensuring that a thread
must hold/have the lock/role before calling protected functions, or
accessing protected data.

Dean, perhaps you could expand on how your "roles" fundamentally
differ from "locks"; e.g. how is the statement "Thread A has the GUI
role" fundamentally different from "Thread A notionally holds the GUI
write-lock"; how is "Thread B has the Compute role" fundamentally
different from "Thread B notionally holds a Compute read-lock"?

Ostensibly: locks are data-driven, and roles are code-driven. That
being said, you can emulate one using the other. You can gin up a
piece of data to associate the lock with for each given role. So when
you thread_role_grant, you are acquiring that lock, and
thread_role_revoke would release that lock. In this model, requiring a role
for one or more methods would be equivalent to requiring that the lock be
already held for those methods.

One difference between roles and locks is cognitive. Programmers are
taught to lock minimally -- grab the lock only when you need it, and
release it as soon as you're done with it. Keeping your locks tight
reduces contention, etc. Roles go to the other extreme: you define a
role that a thread performs, and stick with it for as long as the role is
relevant. For roles like GUI (e.g., the event thread), that's likely to be a
very long time. It's slightly counter-intuitive for you to grab a notional lock
and hold it for the lifetime of a thread. This really boils down to the idea
that when you lock something, you are protecting a resource (data, or a code
section, etc). Defining roles is more contractual -- it doesn't focus on the
'how' like locking does. You are saying "this method should only ever be used
from this role". Consequently, if that method is ever invoked from another role
(or, equivalently, from code that may be executed by a thread performing
another role), you've broken the contract. Contrast this with locks where it
is expected that the data is going to be touched from two different contexts
(hence the need to protect in the first place).

The underlying analysis technique is useful for specifying and enforcing a
variety of contracts. Thread roles happen to be a high-impact choice.

That said, you did point out some useful missing features, which are:

We support more complex boolean expressions over thread roles. The
existing Clang annotations cannot express multiple combinations of
such properties. For example, ((A & B & !C) | (D & !E)).

As you mention, the existing analysis does not support this case. However,
this functionality would not be hard to add -- it's simply a boolean
constraint on allowable lock sets. If you feel it's valuable, then let's do
that!

For reference, Dean gave an example of an (!A | !B) constraint that
seems very valuable:

Dean Sutherland wrote:

* TRA allows statements that certain thread roles are mutually
exclusive (a GUI thread can't be a Compute thread, and vice
versa). The Clang annotations lack support for this property.

Delesley, does Clang's current lock-based analysis really have no
support for the annotation "No thread can hold both the GUI lock and
the Compute lock at the same time"? You could sort of simulate this by
annotating the invariants "Whenever you call the function lock_GUI(),
you must not hold the Compute lock" and "Whenever you call the
function lock_Compute(), you must not hold the GUI lock", but that's
quite ugly, and also supposes that functions analogous to lock_GUI()
and lock_Compute() exist for every role, which is probably false.

Dean Sutherland wrote:

If this statement requires user-written annotations for all such
methods, the required annotation effort will make adoption exceedingly
difficult for most practicing programmers. One of the benefits of TRA
is the application of well-founded techniques to avoid the need for
most user-written annotations. Perhaps some of these techniques could
be brought to bear on the existing thread safety analysis.

Dean, how does this work? Don't you need annotations somewhere to tell
the analyzer what counts as a "GUI method", for example? You're saying
that Clang's current analyzer would require a ton of annotation to
mark all the GUI methods, which IMHO is true; but how would your
solution avoid that burden?

Section 5.2 of the PPoPP paper gives the full explanation. In brief, it runs
off the idea that annotations are propagated from one function to another along
the call graph. Eg)

void h() {}
void g() { h(); }

[[cert::thread_role("GUI")]]
void f() { g(); }

When analyzing the flow of f, g and h are automatically given the GUI role
through propagation. This lets us infer the missing annotations without
additional user effort. The inference happens via iterative flow analysis with
Boolean expressions for the lattice values, and logical OR as the join
operator. This is an ordinary inference analysis. The unusual part is that we
perform it over quite large swathes of code (call them "modules"). The enabling
conditions are that (1) all directly-invoked entry points to the module are
known at analysis time, (2) all indirect calls are adequately covered by thread
role annotations (by inheritance for OO, or by suitable genericity for function
pointers), and (3) all code that is "inside" the module is presented for
analysis (that is, the module is "sealed"). As you might expect, the user writes
annotations fort the visible interface of the module; the tool infers (most of) the
annotations inside the module. As with most such inference
algorithms worst-case BigO is exponential, but this is never seen in practice.
Typical case is linear in program size! We're happy to provide the explanation
off-line -- this message is already way too large.

Past experiments show that dividing up programs in this fashion greatly reduces
the number of user-written annotations. An intelligent division yields best
results, but even random division helps. Previous results show that the
visible interface between modules (when grouped intelligently) is typically <8%
of the functions (and data declarations) that were visible across TUs. Random
groupings had about 25% of the functions as visible interface. That's a
4x-12x reduction in the number of annotations to be written! See the PPoPP paper
for descriptions of the other annotation reduction techniques.

TRA and the current thread safety checking seem to us to be different bike sheds, both of which are interesting. For more, see below.

[SNIP]

The single biggest difference between the existing lock analysis and
thread role analysis is that we’re addressing different (but
partly-overlapping) issues.

Lock analysis starts with some data that must be protected from
multi-threaded access. You then annotate the data to indicate the
specific lock that protects it, annotate methods to indicate how they
interact with the locks. Analysis involves tracking data references
(which is the really hard part!) and then running the lock-set algorithm
to make sure you never have an unprotected access. This is great stuff!
This ensures that a program has correct and consistent use of locks and
protection of data.

Thread role analysis operates on policy models such as “Only the GUI
event thread may invoke [long-list-of-methods].” This is a common
idiom in many widely-used frameworks, for example AppKit’s policy
of always operating from the primary thread. Many of these policies exist
because fine-grained locking was too difficult, too expensive, or even
impossible—perhaps due to deadlock potential—for the problem at hand.
Other frameworks use a “no concurrency” model, outsourcing concurrency from
the application to the framework. Although this serves to provide thread
confinement of data, it also places restrictions on what functions may be
invoked, independent of the data those objects may touch. For example: such
code may never start a thread running, grab a thread from a pool, etc. The
Actor design pattern provides an example of the “no concurrency in the
client” approach.

Obviously, these two analyses are related. Policy-based concurrency
sometimes serves as a proxy for lock-based concurrency. But policy-based
concurrency is a subset of the use-case for TRA. When programmers know
which thread roles may execute a particular function, this provides
some guidance regarding what they can or should invoke from that
function. For example, an astronomy application we analyzed
compartmentalized things like computation, printing, disk and network i/o,
etc. onto individual threads. This separation was partly for GUI
responsiveness, performance and maintenance, all of which has limited
relation to locking. The thread roles were purely a useful abstraction for
enforcing developer policy.

We feel that TRA and lock-based analysis are complementary, but
different, systems that can benefit from each other. There’s nothing
inherently disruptive about TRA to the overall architecture of clang
(it’s mostly making use of existing mechanisms); it can be implemented
in an incremental fashion for easier community involvement.
Ultimately, we believe there are considerable benefits to integrating
the two systems together, but there does not appear to be enough similarity
to attempt to use the existing lock-based infrastructure as a jumping-off
point.

Dean Sutherland
(with much help from Aaron Ballman)

I understand that locks and roles are different in some sort of high-level cognitive sense. However, I believe that the basic annotations and analysis are exactly the same; the difference lies only in how they are used. In particular, you need to:

(a) Mark data as being protected by a lock/role.
(b) Mark functions as being protected by a lock/role.
(c) Identify when a thread acquires and releases a lock/role.
(c) Warn whenever a data/function is accessed/called without holding the proper lock/role.

Roles are globally defined, acquired on thread creation, and seldom or never released, while locks may be stored in objects and are much more fine-grained. However, this indicates to me that roles are simply a subset of locks, not a completely different thing.

I very strongly believe that we should have one set of annotations, and one analysis for doing these checks. I do not want to introduce a second set of annotations that does exactly the same thing, or a second analysis that is almost like the first, but different in some subtle way. That will only be confusing to people who are actually trying to use this stuff for practical programming.

Thread role analysis operates on policy models such as “Only the GUI event thread may invoke [long-list-of-methods].”

Lock analysis establishes the same sort of policy. You must hold ‘lock’ to call [long list of methods].

Section 5.2 of the PPoPP paper gives the full explanation. In brief, it runs
off the idea that annotations are propagated from one function to another along
the call graph. Eg)

Lock annotations are propagated in exactly the same way. The difference is that the current system does not infer missing annotations, so it will issue a warning instead:

void h() LOCKS_REQUIRED(mu_) {}
void g() { h(); } // warning – missing LOCKS_REQUIRED(mu_) annotation

There are two reasons why I have not attempted to do inference yet. First, inference requires whole-program analysis, which is incompatible with compiling by translation unit. And second, with locks, it is impossible to determine whether the programmer is supposed to acquire mu_ within g(), or whether g() should be marked with LOCKS_REQUIRED. With roles, you know that it’s probably missing the annotation.

I propose a path forward to resolve these issues.

(1) Reuse the current annotations as much as possible. If there are things that just can’t be done with the existing annotations (you mentioned complex boolean conditions on roles), then let’s come up with a minimal extension to support what you want to do. I suspect that many such extensions would actually be valuable for locks as well. Most of the existing annotations, GUARDED_BY, LOCKS_REQUIRED, etc. can be directly mapped onto your system, and can be re-used as-is.

(2) Implement role inference as a separate, whole-program analysis pass, using sidecar files etc. I am willing to believe that role inference may be different from lock inference, although I’m not entirely convinced that the two cannot be merged. But it seems to me that inference is what really distinguishes your work from the existing system, so let’s put it in a separate tool.

(3) Reuse the current analysis as much as possible. IMHO, it is always better to logically separate inference from checking; checking assumes a fully annotated program, while inference will infer missing annotations. Your inference pass should thus supply the missing annotations, and then invoke the existing analysis to issue the warnings. If the existing analysis needs to be extended, refactored, or fixed in some way in order to make this work, then let’s do that; our goal is to keep the two analyses in sync.

there does not appear to be enough similarity
to attempt to use the existing lock-based infrastructure as a jumping-off
point.

I completely disagree. If one can be mapped into the other, as you yourself have said, then how are they not similar?

-DeLesley

If I'm not mistaken, Clang's annotations are specific to C++ while
Sutherland seems to target C11 initially. How difficult would it be to
generalize Clang's annotations to support C?

BR
Magnus Reftel

Clang’s annotations are not specific to C++. They should support C already, although I should probably add a separate set of test cases to target C in particular.

-DeLesley

I'd like to apologize to DeLesley and the list for the slowness of this response. I wanted to take the time to provide a high-quality response, so as not to waste people's time.

First an overall point that seems worthy of emphasis: Of course we intend to share as much with existing annotations and machinery as is plausible, diverging only where it makes sense. Meanwhile, we appear to be wrestling over "where does it make sense to diverge?" After reading your feedback, it seems that I've succeeded in getting across the similarities between Thread Role Analysis (TRA, hereafter) and Lock Analysis (LA, hereafter), but I've failed to make the differences clear. That failure has led to a misunderstanding of the issues involved.

This message focuses on two user-visible aspects of the analysis, regarding the annotations and use-cases.
(1) Similarity in form isn't a good enough argument *by itself* to force similar things to be the same. Providing tools that match their expressive metaphor to the user's mental model *really matters* for adoptability. If this weren't true, we'd have only one programming language rather than many.
(2) TRA and LA are optimized for different problems and express different mental models.

Until we reach agreement on these user-visible issues, there's not much point in discussing implementation issues, so I'm deferring discussion of implementation issues to subsequent messages.

I am dealing with a family emergency right now, so I apologize for being somewhat brief.

Every legal program in every programming language can be mapped into assembly code…

To draw a different analogy that is perhaps somewhat more relevant to this list than either asm or screwdrivers, there are many different languages that can be mapped into LLVM IR.
Is LLVM an ideal IR for every language? Obviously not. It’s too focused on C/C++, has terrible GC support, no continuations, etc. And yet, the benefits of having a common compiler intermediate language are enormous (reuse of optimization passes, back-ends, etc.)
As a result, many language implementers have chosen to work within the limitations of the LLVM IR, even if it is not a perfect fit.

You and I seem to have fundamentally different goals here. My goal is to collaborate, and create a common analysis framework for thread safety. I would like to identify those areas where the existing framework is not sufficient to meet your needs, and to extend the existing framework until it does meet your needs.

Your goal seems to be to avoid collaboration. If that is indeed your goal, then perhaps a separate stand-alone tool or a clang plugin would be a more suitable approach; that would give you much greater freedom to implement your personal vision of what the analysis should look like.

-DeLesley

abstraction too high. What you describe as user-visible issues are,
user-visible issues are:

  1) What is the interface for annotating my code for LA or TRA?
  2) How do I enable the checking?
  3) Can I perform inference for the annotations, and if so, how?
  4) What form do the resulting diagnostics take?

I think the answers here are similar, but different, for TRA and LA:

(4) A high quality user experience would require different diagnostic
messages for TRA versus LA. We shouldn't tell users that they need to
acquire a lock if the real problem is that their thread doesn't have
the right role.
(1) The desired difference in diagnostics implies that the analysis
needs to know whether a region is guarded by a mutex or by a role,
which implies that the attributes used for the two cases are somehow
different.
(2) -Wthread-safety seems a reasonable warning flag for both cases.
(3) Any design for cross-function (and possibly
cross-translation-unit) attribute inference should really be the same
for both analyses. However, I strongly believe that any effort, or
even discussion, in this direction should be a separate topic from the
basic TRA annotations.

So I think the thing to discuss is, what should the concrete syntax
for specifying TRA annotations be? The obvious flavors of approach
seem to be:
* just use the existing LA annotations as-is (this seems likely to be
suboptimal from a diagnostic-quality perspective)
* use the existing LA annotations where possible, but add a minimal
set of additional attributes where necessary
* add a new set of annotations parallel to the LA annotations
* design a new set of annotations, for which both LA and TRA are
special cases, such that we can provide high-quality diagnostics for
both and present a consistent interface for them.

It's far from clear to me which of these you are proposing (and these
are just points on a continuum of options), so a much more specific
proposal would really help.

I'm really sorry that we've given this impression, because that's
definitely not our goal. Quite the opposite, actually! We want to
reuse as much infrastructure as possible while exposing a different
metaphor to the user than what already exists. The under the hood
stuff should most definitely be reused as much as possible, and the
user-facing pieces should feel cohesive with what already exists. I'm
pretty sure we're all on the side for this. :slight_smile:

We're going to try to take this offline for further discussion and
evaluation and hopefully come back with a more clear proposal in the
coming months.

~Aaron

I'm really sorry that we've given this impression, because that's
definitely not our goal. Quite the opposite, actually! We want to
reuse as much infrastructure as possible while exposing a different
metaphor to the user than what already exists.

I apologize for jumping to conclusions; I couldn't tell from your post
that there was anything you wanted to reuse. I think you have some
really great ideas, and I would like to incorporate them into clang's
thread safety analysis if we can; we'll talk about it later offline.

  -DeLesley

I think Richard has some really excellent points here, and I agree
with most of what he says.

I wanted to point out that we can distinguish between locks and roles
by means of a single new attribute: locks are currently declared with
a LOCKABLE attribute on the type; roles could be declared with
THREAD_ROLE attribute on the type. That would let us provide
different error messages for threads and roles, and customize the
analysis if there are some assumptions that are appropriate for locks
but not for roles, or vice versa.

In terms of concrete syntax, I vote for Option 4: design a set of
annotations for which both LA and TRA are special cases. Although
there are differences between the models, there's also a lot of
overlap, and I would like to exploit that overlap as much as possible.

  -DeLesley