[RFC] nolock and noalloc attributes

dougsonos · February 6, 2024, 8:12pm

Introduction

Background and motivation

In audio software, there is commonly a realtime thread processing audio input and/or producing audio output. This thread runs periodically, typically at an interval ranging from 0.7 to 100 milliseconds. Especially at shorter I/O intervals, it is crucial that this thread perform its work to completion without blocking; blocking can cause the thread to miss its deadline, causing a perceptible interruption (“glitch”) in the audio output.

Even experienced audio programmers often have an incomplete understanding of which system functions may block. It is also easy to accidentally create blocking code paths.

Realtime-safety bugs often go undetected because the blocking behavior is rare and difficult to reproduce.

Thus it is highly desirable to annotate realtime code to indicate which functions are expected to run without blocking, and have the compiler issue warnings (which can be treated as errors) when a non-blocking function can be analyzed as potentially blocking.

There are other performance and safety-critical applications with code paths that must not allocate memory. For such applications, it is highly desirable to annotate code paths as not allocating, and for the compiler to diagnose policy violations.

Annotating performance constraints using attributes

This document proposes new Clang attributes, nolock and noalloc, which can be attached to functions and function types. The attributes identify code which must not allocate memory or lock, and the compiler uses the attributes to verify these requirements.

The concept and names derive from the Swift Performance Annotations proposal, where the attributes are named @noLocks and @noAllocations. The names nolock and noalloc are intended to parallel the Swift names while being consistent with C++'s noexcept.

Following the prior art of the Swift attributes, this document uses the term “performance constraints” to refer collectively to the nolock and noalloc attributes.

The `nolock` and `noalloc` attributes

Attribute syntax

The nolock and noalloc attributes apply to function types, allowing them to be attached to functions, blocks, function pointers, lambdas, and member functions.

// Functions
void noLockFunction() [[clang::nolock]];
void noAllocFunction() [[clang::noalloc]];

// Function pointers
void (*noLockFunctionPtr)() [[clang::nolock]];

// Typedefs, type aliases.
typedef void (*NoLockFunctionPtrTypedef)() [[clang::nolock]];
using NoLockFunctionPtrTypeAlias_gnu = __attribute__((nolock)) void (*)();
using NoLockFunctionPtrTypeAlias_std = void (*)() [[clang::nolock]];

// C++ methods
struct Struct {
	void noLockMethod() [[clang::nolock]];
};

// C++ lambdas
auto noLockLambda = []() [[clang::nolock]] {};

// Blocks
void (^noLockBlock)() = ^() [[clang::nolock]] {};

The attribute applies only to the function itself. In particular, it does not apply to any nested functions or declarations, such as blocks, lambdas, and local classes.

This document uses the C++/C23 syntax [[clang::nolock]], since it parallels the placement of the noexcept specifier, and the attributes have many other similarities to noexcept. The GNU __attribute__((nolock)) syntax is also supported. Note that it requires a different placement on a C++ type alias.

Like noexcept, nolock and noalloc have an optional argument, a compile-time constant boolean expression. By default, the argument is true, so [[clang::nolock(true)]] is equivalent to [[clang::nolock]], and declares the function type as never locking.
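The optional argument is perhaps easiest to picture through its noexcept counterpart. The following is a minimal sketch in standard C++17 only; `duplicate` is a hypothetical function, and the attribute spelling in the comment is an assumption about how the proposed form would be written, since it needs a compiler that implements the attribute.

```cpp
#include <string>
#include <type_traits>

// Standard C++: the argument of noexcept is a compile-time constant boolean
// expression, here computed from a type trait.
template <typename T>
T duplicate(const T& value) noexcept(std::is_nothrow_copy_constructible_v<T>) {
  return value;
}

// Under this proposal, the same shape would presumably be spelled
//   T duplicate(const T& value) [[clang::nolock(kCondition<T>)]];
// where kCondition is a hypothetical constant expression. It is left as a
// comment because it requires a compiler implementing the attribute.

static_assert(noexcept(duplicate(0)), "copying an int cannot throw");
static_assert(!noexcept(duplicate(std::string{})),
              "copying a std::string may allocate and throw");
```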

Attribute semantics

Together with noexcept, the noalloc and nolock attributes define an ordered series of performance constraints. From weakest to strongest:

noexcept (as per the C++ standard): The function type will never throw an exception.
noalloc: The function type will never allocate memory on the heap, and never throw an exception.
nolock: The function type will never block on a lock, never allocate memory on the heap, and never throw an exception.

nolock includes the noalloc guarantee.

nolock and noalloc include the noexcept guarantee, but the presence of either attribute does not implicitly specify noexcept. (It would be inappropriate for a Clang attribute, ignored by non-Clang compilers, to imply a standard language feature.)

nolock(true) and noalloc(true) apply to function types, and by extension, to function-like declarations. When applied to a declaration with a body, the compiler verifies the function, as described in the section “Analysis and warnings”, below. Functions without an explicit performance constraint are not verified.

nolock(false) and noalloc(false) can be used on a function-like declaration. They are equivalent to the attribute not being present, except that they also disable any potential inference of the attribute during verification. (Inference is described later in this document.) nolock(false) and noalloc(false) are legal, but superfluous when applied to a function type. float (int) [[clang::nolock(false)]] and float (int) are identical types.

For all functions with no explicit performance constraint, the worst is assumed, that the function allocates memory and potentially blocks, unless it can be inferred otherwise, as described in the discussion of verification.

The following table describes the meanings of all permutations of the two attributes and arguments:

	`nolock(true)`	`nolock(false)`
`noalloc(true)`	valid; `noalloc(true)` is superfluous but does not contradict the guarantee not to lock	valid; the function does not allocate memory, but may lock for other reasons
`noalloc(false)`	error, contradictory	valid
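The table can be restated as a small model. This is only a sketch: `AttrArg`, `isContradictory` and `effectiveConstraint` are hypothetical names, and `PerfConstraint` borrows the enum name that the prototype description uses later in this document, without claiming to match its implementation.

```cpp
#include <optional>

// Hypothetical model of the constraint carried by a function type.
enum class PerfConstraint { None, NoAlloc, NoLock };

// An attribute argument: std::nullopt means the attribute is absent,
// otherwise it holds the value of the boolean argument.
using AttrArg = std::optional<bool>;

// The only contradictory cell of the table: nolock(true) with noalloc(false).
constexpr bool isContradictory(AttrArg nolock, AttrArg noalloc) {
  return nolock.value_or(false) && noalloc.has_value() && !*noalloc;
}

// The strongest constraint implied by a valid combination.
constexpr PerfConstraint effectiveConstraint(AttrArg nolock, AttrArg noalloc) {
  if (nolock.value_or(false)) return PerfConstraint::NoLock;  // includes noalloc
  if (noalloc.value_or(false)) return PerfConstraint::NoAlloc;
  return PerfConstraint::None;
}
```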

Type conversions

A performance constraint can be removed or weakened via an implicit conversion. An attempt to add or strengthen a performance constraint is unsafe and results in a warning.

void unannotated();
void nolock() [[clang::nolock]];
void noalloc() [[clang::noalloc]];

void example()
{
  // It's fine to remove a performance constraint.
  void (*fp_plain)();
  fp_plain = unannotated;
  fp_plain = nolock;
  fp_plain = noalloc;

  // Adding/spoofing nolock is unsafe.
  void (*fp_nolock)() [[clang::nolock]];
  fp_nolock = nolock;
  fp_nolock = unannotated;
  // ^ warning: cannot convert a non-'nolock' function to a 'nolock' function
  fp_nolock = noalloc;
  // ^ warning: cannot convert a non-'nolock' function to a 'nolock' function

  // Adding/spoofing noalloc is unsafe.
  void (*fp_noalloc)() [[clang::noalloc]];
  fp_noalloc = noalloc;
  fp_noalloc = nolock; // no warning because nolock includes noalloc
  fp_noalloc = unannotated;
  // ^ warning: cannot convert a non-'noalloc' function to a 'noalloc'
  //     function
}

Note that noexcept behaves very similarly: when present, it is part of a function’s type. One can remove noexcept when assigning to a function pointer, but not add it:

void throwing();
void nonThrowing() noexcept;

void example()
{
  // It's okay to remove noexcept.
  void (*fp_plain)();
  fp_plain = throwing;
  fp_plain = nonThrowing;

  // It is an error to add/spoof noexcept.
  void (*fp_noexcept)() noexcept;
  fp_noexcept = nonThrowing;
  fp_noexcept = throwing;
  // error: assigning to 'void (*)() noexcept' from
  //   incompatible type 'void ()': different exception specifications
}
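Both conversion rules reduce to the same ordering check. The first half of this sketch states the noexcept rule with standard C++17 type traits. The second half is a hypothetical model (`PerfConstraint` and `canConvert` are illustrative names) of the rule for performance constraints, where a failed check would produce a warning rather than an error.

```cpp
#include <type_traits>

using PlainFn = void (*)();
using NoexceptFn = void (*)() noexcept;  // C++17: noexcept is part of the type

// Removing noexcept is an implicit conversion...
static_assert(std::is_convertible_v<NoexceptFn, PlainFn>);
// ...but adding it is not.
static_assert(!std::is_convertible_v<PlainFn, NoexceptFn>);

// Hypothetical model of the proposed rule, ordered from weakest to strongest.
enum class PerfConstraint { None, NoAlloc, NoLock };

// A conversion is safe when the source is at least as constrained as the
// destination; otherwise the proposal calls for a warning.
constexpr bool canConvert(PerfConstraint from, PerfConstraint to) {
  return from >= to;
}
```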

Virtual methods

In C++, when a base class’s virtual method has a performance constraint, overriding methods in subclasses must be declared to have either the same attribute or a more constrained one. For example, a base class’s noalloc(true) or nolock(false) method can be overridden by a derived class’s nolock(true) method, but not the inverse.

struct Base {
  virtual void unsafe();
  virtual void safe() noexcept [[clang::nolock]];
};

struct Derived : public Base {
  void unsafe() [[clang::nolock]] override;
  // It's okay for an overridden method to be more constrained

  void safe() noexcept override;
  // warning: performance constraint of overriding function is more lax 
  //   than base version
};

This too parallels noexcept, which issues an error, “exception specification of overriding function is more lax than base version”.

Redeclarations, overloads, and name mangling

The nolock and noalloc attributes, like noexcept, do not factor into argument-dependent lookup and overloaded functions/methods.

First, consider that noexcept is integral to a function’s type:

void f1(int);
void f1(int) noexcept;
// error: exception specification in declaration does not match previous
//   declaration

Unlike noexcept, a redeclaration of f2 with an added or stronger performance constraint is legal, and propagates the attribute to the previous declaration:

int f2();
int f2() [[clang::nolock]]; // redeclaration with stronger constraint is OK.

This greatly eases adoption, by making it possible to annotate functions in external libraries without modifying library headers.

A redeclaration with a removed or weaker performance constraint produces a warning, in order to parallel the behavior of noexcept:

int f2() { return 42; }
// warning: performance constraint on function is more lax than on previous
//   declaration

In C++14, the following two declarations of f3 are identical (a single function). In C++17 they are separate overloads:

void f3(void (*)());
void f3(void (*)() noexcept);
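To make the C++17 behavior concrete, here is a small sketch that gives each f3 overload a body and a distinguishable return value. It assumes a C++17 (or later) compiler; under C++14 the second definition would be a redefinition of the first. The return values and the two helper functions are illustrative only.

```cpp
void throwing() {}
void nonThrowing() noexcept {}

// Two distinct overloads in C++17, because noexcept is part of the
// function pointer type.
constexpr int f3(void (*)()) { return 0; }
constexpr int f3(void (*)() noexcept) { return 1; }

// A noexcept function prefers the exact match; a plain function can only
// bind to the plain overload.
static_assert(f3(throwing) == 0);
static_assert(f3(nonThrowing) == 1);
```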

The following two declarations of f4 refer to a single function, since the attribute is not part of the canonical function pointer type.^[1] Generally the redeclaration (second one) will be the one found by callers.^[2]

void f4(void (*)());
void f4(void (*)() [[clang::nolock]]);

The attributes have no effect on the mangling of function and method names.

`noexcept`

nolock and noalloc are conceptually similar to a stronger form of C++'s noexcept, but with further diagnostics, as described later in this document. Therefore, in C++, a nolock or noalloc function, method, block or lambda should also be declared noexcept.^[3] If noexcept is missing, a warning is issued. In Clang, this diagnostic is controlled by -Wperf-constraint-implies-noexcept.

Source compatibility

The nolock and noalloc attributes are not 100% source-compatible, since they are part of the type system. Whether they are implemented as shadow types (“type sugar”) or canonical types is an open question, discussed later in the document.

For example, despite the implicit conversions described above:

auto* fp = &nolock_function; // type of fp is: void (*)() [[clang::nolock]]
fp = &locking_function;
// ^ warning: cannot convert a non-'nolock' function to a 'nolock' function

The use of the attributes in conjunction with auto and other sources of implicit types is expected to be the main cause of source incompatibilities.

Objective-C

The attributes are currently unsupported on Objective-C methods.

Analysis and warnings

Constraints

Functions declared noalloc(true) or nolock(true), when defined, are verified according to the following rules. Such functions:

May not allocate or deallocate memory on the heap. The analysis follows the calls to operator new and operator delete generated by the new and delete keywords, and treats them like any other function call. The global operator new and operator delete aren’t declared nolock or noalloc and so they are considered unsafe. (This is correct because most memory allocators are not lock-free. Note that the placement form of operator new is implemented inline in libc++'s <new> header, and is verifiably nolock, since it merely casts the supplied pointer to the result type.)
May not throw or catch exceptions. To throw, the compiler must allocate the exception on the heap. (Also, many subclasses of std::exception allocate a std::string). Exceptions are deallocated when caught.
May not make any indirect function call, via a virtual method, function pointer, or pointer-to-member function, unless the target is explicitly declared with the same nolock or noalloc attribute (or stronger).
May not make direct calls to any other function unless either:
- the callee is also explicitly declared with the same nolock or noalloc attribute (or stronger).
- the callee is defined in the same translation unit as the caller, does not have the false form of the required attribute, and can be verified to have the same attribute or stronger, according to these same rules.
May not invoke or access an Objective-C method or property (via ObjCMessageExpr), since objc_msgSend calls into the Objective-C runtime, which may allocate memory or otherwise block.

Functions declared nolock(true) have an additional constraint:

May not declare static local variables (e.g. Meyers singletons). The compiler generates a lock protecting the initialization of the variable.

Violations of any of these rules result in warnings:

void notInline();

void example() [[clang::nolock]]
{
	auto* x = new int;
	// warning: 'nolock' function 'example' must not allocate or deallocate
	//   memory

	if (x == nullptr) {
	  static Logger* logger = createLogger();
	  // warning: 'nolock' function 'example' must not have static locals

	  throw std::runtime_error{ "null" };
	  // warning: 'nolock' function 'example' must not throw exceptions
	}
	notInline();
	// warning: 'nolock' function 'example' must not call unsafe function
	//   'notInline'
	// note: 'notInline' is unsafe because it is externally defined and not
	//   declared 'nolock'
}

Inferring `nolock` or `noalloc`

In the absence of a nolock or noalloc attribute (whether true or false), a function, when found to be called from a performance-constrained function, can be analyzed to determine whether it has a desired attribute. This analysis happens when:

the function is not a virtual method
it has a visible definition within the current translation unit (i.e. its body can be traversed).

void notInline();
int implicitlySafe() { return 42; }
void implicitlyUnsafe() { notInline(); }

void example() [[clang::nolock]]
{
	int x = implicitlySafe(); // OK
	implicitlyUnsafe();
	// warning: 'nolock' function 'example' must not call unsafe function
	//   'implicitlyUnsafe'
	// note: 'implicitlyUnsafe' is unsafe because it calls unsafe function
	//   'notInline'
	// note: 'notInline' is unsafe because it is externally defined and not
	//   declared 'nolock'
}

With the ability to infer nolock/noalloc, large libraries require far fewer annotations.

For example, without inference, unique_ptr<Foo>::operator->() would need an explicit attribute, despite being trivially verifiable:

void example1(std::unique_ptr<Foo> foo_up) [[clang::nolock]] {
	int x = foo_up->bar;
}

Annotating the STL with nolock/noalloc attributes would be especially problematic for templated functions. The attribute would need to be able to be contingent on a type trait, a way to express “this method is nolock if this other method it calls is nolock”. For example, the copy constructor of std::optional<T> would need to be annotated with a type trait which evaluates whether T’s copy constructor is nolock.

void example2(std::optional<int> optInt, std::optional<std::string> optString)
  [[clang::nolock]] {
	auto x = optInt;    // Call to optional<int>'s copy constructor, safe
	auto y = optString; // Call to optional<string>'s copy constructor, unsafe
}

Such annotations would likely also have to be duplicated for nolock and noalloc.

Further, the standard library makes pervasive use of callable template arguments.

void example3(std::vector<Foo>& vec) [[clang::nolock]] {
	std::sort(vec.begin(), vec.end(), [](const auto& a, const auto& b) {
		return a.member < b.member;
	});
}

Here too, inference would save std::sort from a complex type-dependent attribute describing the comparator.

Beyond the STL, most large C++ codebases have many trivial and inferably nolock/noalloc inline accessor methods. The burden of manually annotating nolock/noalloc functions could be so high as to be prohibitive. In effect, inference provides automatic annotation with attributes.

Again, it is useful to compare nolock/noalloc with noexcept. Clang currently warns if a function declared noexcept contains a throw. It does not currently have the ability to diagnose situations where a callee of a noexcept function throws, but that seems potentially desirable, and such a diagnostic could reuse much of the nolock/noalloc analysis infrastructure.^[4]
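A short illustration of that gap, using hypothetical function names: the wrapper below is declared noexcept and contains no throw expression of its own, so the existing check has nothing to report, yet its callee can throw.

```cpp
#include <stdexcept>

// Not noexcept: may throw.
int parsePositive(int value) {
  if (value < 0) throw std::invalid_argument{"negative"};
  return value;
}

// Declared noexcept, with no throw expression in its own body. If the callee
// throws, the exception cannot escape and std::terminate is called.
int checkedParse(int value) noexcept {
  return parsePositive(value);
}

static_assert(noexcept(checkedParse(1)));
static_assert(!noexcept(parsePositive(1)));
```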

Risks of inference

Inference does however introduce a risk. A user of a library could write nolock code that depends on a certain inline function of the library being inferably nolock. The library could be revised so that the function is de-inlined, or reimplemented in a way that is not inferably nolock. This would break client code.

As a practical matter, however, nolock/noalloc code tends to minimize its dependencies, limiting them to library features which either provide documented guarantees of lock-freedom or allocation-freedom, or which are sufficiently simple that it is reasonable to rely on those implementation details, when exposed as inline methods.

Lambdas and blocks

As mentioned earlier, the performance constraint attributes apply only to a single function and not to any code nested inside it, including blocks, lambdas, and local classes. It is possible for a lock-free function to schedule the execution of a blocking lambda on another thread.^[5] Similarly, a blocking function may create a nolock lambda for use in a realtime context.

Operations which create, destroy, copy, and move lambdas and blocks are analyzed in terms of the underlying function calls. For example, the creation of a lambda with captures generates a function call to an anonymous struct’s constructor, passing the captures as parameters.
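As a rough picture of what the analysis sees, the sketch below pairs a capturing lambda with a hand-written closure class. `ScaleClosure` is purely illustrative and is not what Clang actually generates; it only shows that creating the closure and invoking it are separate calls, each of which can be analyzed on its own.

```cpp
// Hand-written approximation of the closure type for:
//   auto scale = [gain](int sample) { return gain * sample; };
class ScaleClosure {
  int gain_;

public:
  // "Creating the lambda": the capture is passed as a parameter.
  constexpr explicit ScaleClosure(int gain) : gain_{gain} {}
  // "Calling the lambda": a separate function with its own constraints.
  constexpr int operator()(int sample) const { return gain_ * sample; }
};

constexpr int viaLambda(int gain, int sample) {
  auto scale = [gain](int s) { return gain * s; };
  return scale(sample);
}

constexpr int viaClosure(int gain, int sample) {
  ScaleClosure scale{gain};
  return scale(sample);
}
```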

Implicit function calls in the AST

The nolock/noalloc analysis occurs at the Sema phase of analysis in Clang. During Sema, there are some constructs which will eventually become function calls, but do not appear as function calls in the AST. For example, auto* foo = new Foo; becomes a declaration containing a CXXNewExpr which is understood as a function call to the global operator new (in this example), and a CXXConstructExpr, which, for analysis purposes, is a function call to Foo’s constructor. Most gaps in the analysis would be due to incomplete knowledge of AST constructs which become function calls.
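The hidden calls behind a new-expression can be made visible in ordinary C++ by giving a class its own allocation functions. This sketch uses hypothetical names (`Counters`, `observeNewExpressions`) and only counts calls; it also shows that the placement form runs the constructor without calling an allocating operator new.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

struct Counters {
  int allocations = 0;
  int constructions = 0;
};
inline Counters gCounters;

struct Foo {
  Foo() { ++gCounters.constructions; }
  // Class-specific allocation functions, so the implicit calls can be counted.
  static void* operator new(std::size_t size) {
    ++gCounters.allocations;
    return ::operator new(size);
  }
  static void operator delete(void* p) { ::operator delete(p); }
};

// Returns the counters observed after one `new Foo` and one placement new.
Counters observeNewExpressions() {
  gCounters = {};
  Foo* heapFoo = new Foo;  // calls Foo::operator new, then Foo::Foo()
  alignas(Foo) unsigned char storage[sizeof(Foo)];
  Foo* placedFoo = ::new (storage) Foo;  // placement form: constructor only
  assert(static_cast<void*>(placedFoo) == storage);  // pointer is passed through
  placedFoo->~Foo();
  delete heapFoo;
  return gCounters;
}
```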

Clang’s built-in functions

Clang’s built-in functions are considered nolock by default; this stems from __builtin_assume and many math intrinsics being safe, but there are some unsafe intrinsics which require special treatment.^[6]

Disabling diagnostics

The diagnostics specific to nolock and noalloc are controlled by a new warning group, -Wperf-constraints.

A construct like this can be used to exempt code from verification of performance constraints:

#define NOLOCK_UNSAFE(...)                                           \
	_Pragma("clang diagnostic push")                                 \
	_Pragma("clang diagnostic ignored \"-Wunknown-warning-option\"") \
	_Pragma("clang diagnostic ignored \"-Wperf-constraints\"")       \
	__VA_ARGS__                                                      \
	_Pragma("clang diagnostic pop")

Disabling the diagnostic allows for:

constructs which do block, but which in practice are used in ways to avoid unbounded blocking, e.g. a thread pool with semaphores to coordinate multiple realtime threads.
using libraries which are safe but not yet annotated.
incremental adoption in a large codebase.
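A possible use of the macro above, as a sketch. The macro is repeated so the snippet stands alone. `NOLOCK_ATTR`, `gScratch` and `prepareScratch` are hypothetical, and `NOLOCK_ATTR` expands to nothing here so that the example also builds with compilers that lack the attribute.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

#define NOLOCK_UNSAFE(...)                                         \
  _Pragma("clang diagnostic push")                                 \
  _Pragma("clang diagnostic ignored \"-Wunknown-warning-option\"") \
  _Pragma("clang diagnostic ignored \"-Wperf-constraints\"")       \
  __VA_ARGS__                                                      \
  _Pragma("clang diagnostic pop")

// Hypothetical: would be [[clang::nolock]] on a supporting compiler.
#define NOLOCK_ATTR

std::vector<float> gScratch;

std::size_t prepareScratch(std::size_t frames) NOLOCK_ATTR {
  // resize() may allocate. The statement is exempted on the assumption that
  // the buffer only grows during setup, before realtime processing starts.
  NOLOCK_UNSAFE(gScratch.resize(frames);)
  assert(gScratch.size() == frames);
  return gScratch.size();
}
```

Only the wrapped statement is exempted; the rest of the function body would still be verified.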

Adoption

Adopting the nolock attribute in several large codebases has identified many long-standing realtime-safety errors, and regularly detects errors in newly-written code.

There are a few common issues that arise when adopting the nolock and noalloc attributes.

C++ exceptions

Exceptions pose a challenge to the adoption of the performance constraints. Common library functions which throw exceptions include:

Method	Alternative
`std::vector<T>::at()`	`operator[](size_t)`, after verifying that the index is in range.
`std::optional<T>::value()`	`operator*`, after checking `has_value()` or `operator bool()`.
`std::expected<T, E>::value()`	Same as for `std::optional<T>::value()`.
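The alternatives in the table might look like this in practice. This is a minimal sketch with hypothetical helper names (`elementOr`, `valueOr`); each helper performs the check first and then uses the non-throwing accessor.

```cpp
#include <cstddef>
#include <optional>
#include <vector>

// vector::at() throws std::out_of_range; check the index, then use operator[].
inline int elementOr(const std::vector<int>& values, std::size_t index,
                     int fallback) noexcept {
  return index < values.size() ? values[index] : fallback;
}

// optional::value() throws std::bad_optional_access; test, then dereference.
constexpr int valueOr(const std::optional<int>& maybe, int fallback) noexcept {
  return maybe ? *maybe : fallback;
}
```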

Interactions with type-erasure techniques

std::function<R(Args...)> illustrates a common C++ type-erasure technique. Using template argument deduction, it decomposes a function type into its return and parameter types. Additional components of the function type, including noexcept, nolock, noalloc, and any other attributes, are discarded.

Standard library support for these components of a function type is not immediately forthcoming.
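The loss is already observable today with noexcept, using only standard C++17. In this sketch `quietLambda` is a hypothetical name; the checks compare a direct call with a call through std::function.

```cpp
#include <functional>
#include <utility>

inline constexpr auto quietLambda = []() noexcept { return 1; };

// The closure's own call operator keeps its exception specification...
static_assert(noexcept(quietLambda()));

// ...but std::function<int()> only records the return and parameter types,
// so a call through it is no longer known to be noexcept.
static_assert(!noexcept(std::declval<std::function<int()>&>()()));
```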

Code can work around this limitation in either of two ways:

Avoid abstractions like std::function and instead work directly with the original lambda type.
Create a specialized alternative, e.g. nolock_function<R(Args...)> where all function pointers used in the implementation and its interface are nolock.

As an example of the first approach, when using a lambda as a Callable template parameter, the attribute is preserved:

std::sort(vec.begin(), vec.end(),
  [](const Elem& a, const Elem& b) [[clang::nolock]] { return a.mem < b.mem; });

Here, the type of the Compare template parameter is an anonymous class generated from the lambda, with an operator() method holding the nolock attribute.

A complication arises when a Callable template parameter, instead of being a lambda or class implementing operator(), is a function pointer:

static bool compare_elems(const Elem& a, const Elem& b) [[clang::nolock]] {
	return a.mem < b.mem;
}

std::sort(vec.begin(), vec.end(), compare_elems);

Here, the type of compare_elems is decomposed to bool(const Elem&, const Elem&), without nolock, when forming the template parameter. This can be solved using the second approach, creating a specialized alternative which explicitly requires the attribute. In this case, it’s possible to use a small wrapper to transform the function pointer into a functor:

template <typename>
class nolock_fp;

template <typename R, typename... Args>
class nolock_fp<R(Args...)> {
public:
	using impl_t = R (*)(Args...) [[clang::nolock]];

private:
	impl_t mImpl{ nullptr };
public:
	nolock_fp() = default;
	nolock_fp(impl_t f) : mImpl{ f } {}

	R operator()(Args... args) const
	{
		return mImpl(std::forward<Args>(args)...);
	}
};

// deduction guide (copied from std::function)
template< class R, class... ArgTypes >
nolock_fp( R(*)(ArgTypes...) ) -> nolock_fp<R(ArgTypes...)>;

// --

// Wrap the function pointer in a functor which preserves `nolock`.
std::sort(vec.begin(), vec.end(), nolock_fp{ compare_elems });

Now, the nolock attribute of compare_elems is verified when it is converted to a nolock function pointer, as the argument to nolock_fp’s constructor. The template parameter is the functor class nolock_fp.

Static local variables

Static local variables are often used for lazily-constructed globals (Meyers singletons). Beyond the compiler’s use of a lock to ensure thread-safe initialization, it is dangerously easy to inadvertently trigger initialization, involving heap allocation, from a nolock or noalloc context.

Generally, such singletons need to be replaced by globals, and care must be taken to ensure their initialization before they are used from nolock or noalloc contexts.
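One way such a replacement could look, as a sketch under stated assumptions: `EventLog`, `gLog`, `setUpGlobals` and `onAudioEvent` are hypothetical names, and the explicit setup call stands in for whatever initialization hook a real application has.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

class EventLog {
  std::vector<int> events_;
  std::size_t count_ = 0;

public:
  explicit EventLog(std::size_t capacity) : events_(capacity) {}  // allocates
  // Records into preallocated storage; drops the event when full.
  bool record(int event) noexcept {
    if (count_ == events_.size()) return false;
    events_[count_++] = event;
    return true;
  }
  std::size_t size() const noexcept { return count_; }
};

// Before: a Meyers singleton. Whichever caller arrives first pays for the
// guarded initialization and the allocation.
EventLog& lazyLog() {
  static EventLog log{64};
  return log;
}

// After: a namespace-scope global that starts empty without any run-time
// initialization, and is constructed explicitly during setup.
std::optional<EventLog> gLog;
void setUpGlobals() { gLog.emplace(64); }  // call from a non-realtime thread

// Hypothetical realtime entry point: no guard, no allocation.
bool onAudioEvent(int event) noexcept {
  assert(gLog.has_value());  // setup must already have run
  return gLog->record(event);
}
```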

Annotating libraries

It can be surprising that the analysis does not depend on knowledge of any primitives; it simply assumes the worst, that all function calls are unsafe unless explicitly marked as safe or able to be inferred as safe. With nolock, this appears to suffice for all but the most primitive of spinlocks.

At least for an operating system’s C functions, it is possible to define an override header which redeclares safe common functions (e.g. pthread_self()) with the addition of nolock. This helps in adopting the feature incrementally.

It also helps that for many of the functions in <math.h>, Clang generates calls to built-in functions, which are assumed to be safe.

Once the feature is integrated into the compiler, attributes can be integrated into SDK headers.

Much of the C++ standard library consists of inline templated functions which work well with inference. Some primitives may need explicit nolock/noalloc attributes.

Clang implementation details

Support for the nolock and noalloc attributes and their diagnostics has been prototyped, largely as described in this document.

Attribute representation

The attributes have been prototyped with two different implementations:

As part of a function’s canonical type, in FunctionProtoType.
As type sugar, using AttributedType.

There is an open question of which implementation to use. The AttributedType implementation exposes at least two pre-existing bugs in Clang, where type attributes are lost when working with auto and inferred lambda return types. The FunctionProtoType implementation is possibly preferable given the C++ committee’s experience with noexcept (initially it was, in effect, sugar; now it is part of a canonical type). This implementation does, however, seem more prone to undesired consequences.

nolock(true) and noalloc(true) are parsed as type attributes. In implementation 2, they are represented as type sugar, using AttributedType. Therefore canonical types have no representation of the attribute, but Type has a method returning a PerfConstraint, an enum class with values None, NoAlloc, and NoLock. If either nolock(false) or noalloc(false) is present on the type, getPerfConstraint() returns PerfConstraint::None.

In implementation 1, the type attributes are stored in one of the bitfields within FunctionProtoType, becoming part of the canonical type.

FunctionDecl and BlockDecl have getPerfConstraint() methods which simply delegate to the Type.

Type-checking

In implementation 2, there are type-checks in any implicit conversion in Sema, implemented roughly in parallel to the checks for nullability. In implementation 1, these same checks happen in different places (in fact, two separate places for C vs. C++ because of differing language rules for pointer conversions).

There are further checks to deal with redeclarations, consistency of overriding virtual method declarations, and noexcept, as described earlier in this document.

Verification

The analysis to verify nolock/noalloc functions happens at the end of Clang’s Sema pass, in AnalysisBasedWarnings. (Currently it does an AST traversal to find all attributed function bodies, but it would be better to build a vector of these functions as they are parsed, and then iterate through that vector.)

For each nolock/noalloc function with a body, the analyzer traverses the body. A construct such as throw is immediately diagnosed as a warning. A call to a callee with the required attribute is safe. A direct call to a callee lacking the attribute results in a recursive analysis of that function, if inference is possible. Without the required attribute or successful inference, the call is determined unsafe and a diagnostic emitted.

The analysis pass’s state is represented in a map from Decl* (FunctionDecl or BlockDecl) to a small struct holding a previous analysis result (either success or the nature and source location of the first unsafe construct found, used in generating notes to explain diagnostics).
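The traversal and its cache can be illustrated with a toy model over a hand-built call graph. Nothing here is Clang's API: `Attr`, `Function`, `Analyzer` and `exampleProgram` are hypothetical, functions are keyed by name instead of Decl*, the cache stores only a boolean rather than the first unsafe construct, and treating recursion optimistically is an assumption of this sketch.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

enum class Attr { Absent, True, False };

struct Function {
  Attr nolock = Attr::Absent;
  bool hasBody = false;              // definition visible in this TU
  bool isVirtual = false;
  bool hasUnsafeConstruct = false;   // e.g. throw, new, static local
  std::vector<std::string> callees;  // direct calls made by the body
};

class Analyzer {
  const std::map<std::string, Function>& functions_;
  std::map<std::string, bool> cache_;  // previous analysis results

public:
  explicit Analyzer(const std::map<std::string, Function>& fns)
      : functions_(fns) {}

  // Is it safe for a nolock function to call `name` directly?
  bool isCallSafe(const std::string& name) {
    const Function& fn = functions_.at(name);
    if (fn.nolock == Attr::True) return true;  // explicit attribute
    if (fn.nolock == Attr::False || fn.isVirtual || !fn.hasBody)
      return false;                            // inference is not possible
    return bodyIsSafe(name);                   // try to infer
  }

  // Verifies (or infers) that the body of `name` obeys the nolock rules.
  bool bodyIsSafe(const std::string& name) {
    auto cached = cache_.find(name);
    if (cached != cache_.end()) return cached->second;
    const Function& fn = functions_.at(name);
    assert(fn.hasBody && "only a visible body can be traversed");
    cache_[name] = true;  // assumption: treat recursive calls optimistically
    bool safe = !fn.hasUnsafeConstruct;
    for (const std::string& callee : fn.callees)
      safe = safe && isCallSafe(callee);
    cache_[name] = safe;
    return safe;
  }
};

// The call graph of the inference example earlier in this document.
std::map<std::string, Function> exampleProgram() {
  std::map<std::string, Function> fns;
  fns["notInline"] = {};  // externally defined, unannotated
  fns["implicitlySafe"] = {Attr::Absent, true, false, false, {}};
  fns["implicitlyUnsafe"] = {Attr::Absent, true, false, false, {"notInline"}};
  fns["example"] = {Attr::True, true, false, false,
                    {"implicitlySafe", "implicitlyUnsafe"}};
  return fns;
}
```

In this model, a false result from `bodyIsSafe("example")` corresponds to the warning shown in the inference example.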

Comparison with `enforce_tcb`

Clang has __attribute__((enforce_tcb(tcbName))), which provides basic verification that function calls made within a trusted computing base only call other functions within the TCB. The attribute is attached to function declarations and is not part of a function’s type, so it is unable to diagnose indirect calls.

enforce_tcb has the concept of a “leaf” function, which is safe for other functions in the TCB to call, but is permitted to make unsafe calls outside of the TCB.

Annotating lock implementations

Clang has a set of attributes for Thread Safety Analysis. They identify methods as acquiring, releasing and requiring resources, which are typically mutexes. It may be desirable and possible for such methods to be implicitly marked as nolock(false).

[1] There is an open question of whether nolock and noalloc are represented as parts of a canonical function type, or shadow types, discussed later in this document.
[2] See “Redeclarations and Overloads” in the Clang CFE Internals Manual.
[3] If nolock/noalloc were promoted to full language features like noexcept, it would make sense for both to imply noexcept. But it would be incorrect for an attribute to imply a language feature.
[4] Note, however, that in the absence of a diagnostic, it is very likely that in existing code, there are noexcept functions which make potentially throwing calls.
[5] This is a common messaging primitive.
[6] TODO: Make a comprehensive list.

erichkeane · February 7, 2024, 3:50pm

So I have quite a few concerns with this, the first being why the TSA attributes aren’t good enough/why we cannot use/extend them instead for this purpose?

But most importantly, making this a part of the function type is quite expensive, and frankly, adding any more bytes to the FunctionType itself is a non-starter for me. It’ll end up limiting our instantiation depth/already bad memory pressure.

The nolock and noalloc attributes, like noexcept, do not factor into argument-dependent lookup and overloaded functions/methods.

What do you mean by ADL here? Noexcept participates in deduction, but I don’t know what it means by ADL. Also, as you mention, C++17 makes noexcept part of the overload set: do you intend this differentiation as well?

The analysis to verify nolock/noalloc functions happens at the end of Clang’s Sema pass, in AnalysisBasedWarnings.

This is a ‘red flag’ to me, particularly since it exists in the type system. Analysis based warnings are intrinsically imperfect, so unless this is something that can be implemented via type-conversion/type rules, this is a real problem and shows design issues that I don’t think I’d want us to take on.

AaronBallman · February 7, 2024, 3:55pm

Thank you for the detailed RFC! I’m still reasoning about the contents, but the first thing that jumped out at me is that this sounds very closely aligned with functionality we already have, which is capability-based analyses (thread safety analysis). The basic idea for that functionality is that you can define specific capabilities, and then certain functions can either acquire or release those capabilities and other functions can have requirements that certain capabilities be held.

It sounds to me like you’ve effectively got a “realtime” role where you want some code paths to be restricted to only be allowed to call other realtime-capable functions. Have you considered using the existing capability analysis functionality (possibly with improvements specific to your needs)?

dougsonos · February 7, 2024, 6:39pm

Thank you for the detailed RFC! I’m still reasoning about the contents, but the first thing that jumped out at me is that this sounds very closely aligned with functionality we already have, which is capability-based analyses (thread safety analysis). The basic idea for that functionality is that you can define specific capabilities, and then certain functions can either acquire or release those capabilities and other functions can have requirements that certain capabilities be held.

It sounds to me like you’ve effectively got a “realtime” role where you want some code paths to be restricted to only be allowed to call other realtime-capable functions. Have you considered using the existing capability analysis functionality (possibly with improvements specific to your needs)?

Thanks for reading and the quick feedback! Your understanding of the motivation is exactly correct - we want to statically verify that on realtime threads, we only run code that is realtime-safe.

I did look at Thread Safety Analysis. As far as I can tell, it asserts the acquisition of capabilities, where nolock and noalloc want to assert a near-complete lack of capabilities. There is a section about (experimental) negative requirements, but it isn’t clear to me how to express a requirement concerning all possible capabilities (including memory allocation) which must not be acquired.

Also, the use of function pointers and virtual methods is pervasive in our world, thus the desirability of making the “anti-capability” part of a function type (at least as sugar if not the canonical type). The thread-safety attributes appear to be attached to declarations.

If you have ideas about how to reconcile these differences, I’d be happy to explore them.

I do see a parallel to the enforce_tcb attribute (see “Attributes in Clang” in the Clang 19.0.0git documentation), although it too applies to declarations and cannot analyze indirect calls. If TCB could be represented in the type system (as sugar), that could be a viable approach, though it would require special-casing to make nolock a stronger form of noalloc, and there could be issues with existing code using TCBs (indirect calls would get diagnosed where they aren’t currently).

dougsonos · February 7, 2024, 6:43pm

Thanks for reading and the quick feedback.

So I have quite a few concerns with this, the first being why the TSA attributes aren’t good enough/why we cannot use/extend them instead for this purpose?

Please see my reply to Aaron Ballman, who asked the same thing.

But most importantly, making this a part of the function type is quite expensive, and frankly, adding any more bytes to the FunctionType itself is a non-starter for me. It’ll end up limiting our instantiation depth/already bad memory pressure.

One prototype implementation uses AttributedType to represent the attributes as sugar.

Another prototype implementation uses two bits of FunctionTypeExtraBitfields, which is not enough to increase its footprint.
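To illustrate why two spare bits need not grow a type node, here is a minimal sketch with a made-up bitfield struct. The struct and field names are hypothetical and do not reproduce Clang's actual FunctionTypeExtraBitfields layout. Bitfield packing is implementation-defined, so the size equality is an assumption that holds on mainstream ABIs rather than a guarantee.

```cpp
#include <cstddef>

// Hypothetical stand-in for a bitfield record that still has spare bits.
struct ExtraBitsBefore {
  unsigned NumEntries : 10;
};

// Same record with two extra one-bit flags (names are illustrative only).
struct ExtraBitsAfter {
  unsigned NumEntries : 10;
  unsigned HasNoLock : 1;
  unsigned HasNoAlloc : 1;
};

// Both layouts fit in one `unsigned` storage unit on common ABIs.
static_assert(sizeof(ExtraBitsBefore) == sizeof(ExtraBitsAfter),
              "two extra bits fit into the existing storage unit");

// Number of bytes added by the two flags.
constexpr std::size_t footprint_growth() {
  return sizeof(ExtraBitsAfter) - sizeof(ExtraBitsBefore);
}
```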

The nolock and noalloc attributes, like noexcept, do not factor into argument-dependent lookup and overloaded functions/methods.

What do you mean by ADL here? Noexcept participates in deduction, but I don’t know what it means by ADL.

Rereading this now, honestly, I’m not sure of what I meant either. I’ll strike the mention of ADL from the next draft unless someone has a better idea.

Also, as you mention, C++17 makes noexcept part of the overload set: do you intend this differentiation as well?

The draft says no, “since the attribute is not part of the canonical function pointer type”, but with a footnote about how, currently, it’s an open question whether to represent the attributes as part of the canonical type (which would make them part of the overload set) or as type sugar (which would not).
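For readers less familiar with the noexcept precedent, here is a small sketch of what “part of the canonical type” means in C++17. It uses only standard C++ and says nothing about how nolock/noalloc would end up being represented; the names bump, Plain, and NoThrow are made up for the example.

```cpp
#include <type_traits>

int counter = 0;
void bump() noexcept { ++counter; }

using Plain = void (*)();

// Dropping noexcept when forming a function pointer is an allowed conversion.
Plain as_plain = bump;

#if __cplusplus >= 201703L
// Since C++17, noexcept is part of the function type itself.
using NoThrow = void (*)() noexcept;

static_assert(!std::is_same<Plain, NoThrow>::value,
              "the two pointer types are distinct");
static_assert(std::is_convertible<NoThrow, Plain>::value,
              "removing the guarantee is an implicit conversion");
static_assert(!std::is_convertible<Plain, NoThrow>::value,
              "adding the guarantee is not");
#endif
```

A canonical-type representation of the performance constraints would presumably behave like the guarded block; a sugar-only representation would leave the two pointer types identical as far as overloading is concerned.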

The analysis to verify nolock/noalloc functions happens at the end of Clang’s Sema pass, in AnalysisBasedWarnings.

This is a ‘red flag’ to me, particularly since it exists in the type system. Analysis based warnings are intrinsically imperfect, so unless this is something that can be implemented via type-conversion/type rules, this is a real problem and shows design issues that I don’t think I’d want us to take on.

Much of the analysis does happen when checking type conversions.

The call-chain analysis is currently in AnalysisBasedWarnings because that was the early advice of someone with experience in that particular area. Other feedback has suggested that other approaches are possible.

rjmccall · February 7, 2024, 7:04pm

I think you might be over-interpreting what it means to be an analysis-based warning. As I understand it, there isn’t any sort of complex, control/data-flow-aware analysis here; it’s a straightforward, conservatively-correct occurs check that recurses into specific kinds of calls. It’s imperfect in essentially the same sense that the typing rule for the ternary operator is imperfect — it has a simple rule that it consistently enforces even if a more sophisticated analysis could preserve more information.

erichkeane · February 7, 2024, 7:07pm

My concern with the analysis-based warning is that it immediately calls into question how well the diagnostic works cross-TU. An attribute like this that introduces an ‘imperfect’ diagnostic is, IMO, not particularly valuable, as it can’t make the guarantees it wishes to.

I’m curious what you mean about the ‘ternary’ operator; the type of the ternary operator is well defined in the standard, so we can diagnose mismatches immediately.

dougsonos · February 7, 2024, 7:17pm

The verification of a function holding a performance constraint attribute is simple and reliable:

an indirect callee’s type must have the attribute
a direct callee must either have the attribute or, as a convenience when it is defined in the same TU, be inferably safe.

The ability to make inferences about code in the same TU is a convenience to obviate manual annotation. (In the case of templated code, in full generality, a manual annotation would have to be an expression based on all of the functions/function types it calls.)
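The two rules above can be sketched roughly as follows. This is an illustration under assumptions, not the prototype's test suite: RT_NOLOCK is a hypothetical macro that expands to [[clang::nolock]] only when the compiler reports the attribute (otherwise to nothing, so ordinary compilers build the code without checking it), the function names are invented, and the placement of the attribute after the parameter list is my reading of “attached to function types”.

```cpp
// Hypothetical helper macro: uses the attribute only where it exists.
#if defined(__has_cpp_attribute)
#  if __has_cpp_attribute(clang::nolock)
#    define RT_NOLOCK [[clang::nolock]]
#  endif
#endif
#ifndef RT_NOLOCK
#  define RT_NOLOCK
#endif

// Direct callee, same TU, no annotation: the body is visible, so it can be
// inferred safe (no locks, allocations, or calls to unverified functions).
static float clamp01(float x) { return x < 0.0f ? 0.0f : (x > 1.0f ? 1.0f : x); }

// Direct callee that could live in another TU: the declaration carries the
// attribute, since the caller could not see the body there.
float soft_clip(float x) RT_NOLOCK;

// Indirect callee: the attribute is part of the function type.
using SampleFn = float(float) RT_NOLOCK;

// Realtime entry point: every call it makes falls under one of the rules.
float process(float x, SampleFn* fn) RT_NOLOCK {
  return fn(clamp01(soft_clip(x)));
}

float soft_clip(float x) RT_NOLOCK { return x / (1.0f + (x < 0.0f ? -x : x)); }
float halve(float x) RT_NOLOCK { return x * 0.5f; }
```

Passing a pointer to an unannotated function as fn is the kind of case that would be caught at the type-conversion level rather than by the call-chain check.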

rjmccall · February 7, 2024, 8:59pm

Yes, and the language rule specified by the standard is an over-conservative approximation of the best rule. For example, the standard could say that, if the condition is a constant expression, the type of the operator is always the type of the appropriate operand, and the other operand has no effect. I’m not saying that would necessarily be a better rule; I’m just pointing out that imperfect information is part of all static analysis, including type systems, and that we shouldn’t consider that by itself to be a blocker.

That is, the question should not be whether the analysis is confronted with imperfect information, because everything is. I think the right questions are these:

Is the analysis conservatively correct, or is it just a best effort to emit warnings that it guesses are relevant?
Is the analysis well-specified and stable, such that a reasonable programmer should be able to correctly predict its behavior, rely on it, and understand why a certain code change has triggered a diagnostic?
Do the limitations on the analysis cause significant pain for developers trying to use it? Would further attempts to improve the analysis require changing the answer to the previous two questions?
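The ternary-operator point above can be checked directly. A minimal example in standard C++ (the function name pick is made up): the condition is a constant and the second operand is an int, yet the rule still gives the whole expression the common type of both operands.

```cpp
#include <type_traits>

// The condition is the constant `true`, so only the int operand is ever
// evaluated, but the typing rule still considers both operands.
static_assert(std::is_same<decltype(true ? 1 : 2.0), double>::value,
              "conditional expression has the common type, double");

// The selected value 1 is converted to double.
constexpr double pick() { return true ? 1 : 2.0; }
```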

erichkeane · February 7, 2024, 9:44pm

Ah, sure, the ‘ternary’ operator is perhaps a bit ‘not the best possible’, but it does so because language design principles require that these sorts of things be encoded in the type system/be checkable at the time of declaration/instantiation.

Having any language-type definition that requires more analysis than that is not conducive to a particularly effective compiler. Other languages where that is tried end up having ‘imperfect’ diagnostics (or, a ‘fatter’ module-type-system than TUs), and pathologically difficult diagnostics.

In this case, this language extension seems to be one where the ‘imperfect’ diagnostic makes the feature really tough to recommend, which then makes me wonder about the value of having it at all.

So THAT is why it is a red-flag for me.

I believe your list of 3 questions correctly approximates my concerns.

rjmccall · February 7, 2024, 10:11pm

Okay. So to spell it out, you’re concerned that the conservativeness of the analysis about calls it can’t see through will make it unworkable for programmers to adopt, presumably because adding the attribute everywhere will be too cumbersome. That seems like a reasonable concern. Doug, I assume you have a prototype and some prospective adopters; can you speak to this? Are there examples of the kind of realtime code that people typically write that we can look at and see how burdensome the annotations are?

dougsonos · February 8, 2024, 12:47am

Over the course of a few weeks I fully annotated a large project:
several thousand source files
over a million lines of code
required ~2200 uses of the nolock attribute

At first, this project drove refinement of how the compiler reported diagnostics, but then the prototype compiler guided the remaining adoption effort quite smoothly. As you might expect, the main points of friction were the bona fide realtime safety issues (incorrectly using mutexes, logging, exceptions etc.). Some were trivially fixable so I addressed them. Others required more effort; I filed bug reports about those and disabled warnings around the problematic constructs.

I experimented with disabling inference and that was discouraging; it felt like that would increase the size of the adoption effort by 2-10x (even aside from template difficulties with the STL).

This project uses an open-source library, AudioUnitSDK. The class AUBase is foundational and is typical of a pattern that we use pervasively – the constructor, destructor, and various setup methods (GetProperty, SetProperty, Initialize, Cleanup) are expected to allocate memory and potentially block. Then there are some methods which are documented to be safe to call in realtime contexts: DoRender, DoProcess, DoProcessMultiple, GetParameter, SetParameter are the entry points.

In adopting nolock, I first added the attribute to the realtime entry points and let the compiler guide me to find:

virtual methods which need annotation (e.g. Render called from DoRender)
non-virtual non-inline methods needing annotation
bona fide safety errors (in this project, a lot of use of exceptions)
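To give a flavor of the pattern without reproducing AudioUnitSDK, here is a reduced sketch. EffectBase and GainEffect are hypothetical classes that only borrow the shape of AUBase (setup methods that may allocate, a realtime entry point that calls a virtual method). RT_NOLOCK is a hypothetical macro that expands to [[clang::nolock]] only if the compiler reports support for it.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper macro: uses the attribute only where it exists.
#if defined(__has_cpp_attribute)
#  if __has_cpp_attribute(clang::nolock)
#    define RT_NOLOCK [[clang::nolock]]
#  endif
#endif
#ifndef RT_NOLOCK
#  define RT_NOLOCK
#endif

class EffectBase {
 public:
  virtual ~EffectBase() = default;

  // Setup path: allowed to allocate and block, so it is not annotated.
  void Initialize(std::size_t maxFrames) { scratch_.assign(maxFrames, 0.0f); }

  // Realtime entry point: annotated first.
  void DoRender(float* buffer, std::size_t frames) RT_NOLOCK {
    if (frames <= scratch_.size()) Render(buffer, frames);
  }

 protected:
  // Virtual method called from DoRender: the call is indirect, so the
  // method itself has to carry the attribute.
  virtual void Render(float* buffer, std::size_t frames) RT_NOLOCK = 0;

  std::vector<float> scratch_;
};

class GainEffect final : public EffectBase {
 protected:
  void Render(float* buffer, std::size_t frames) RT_NOLOCK override {
    for (std::size_t i = 0; i < frames; ++i) buffer[i] *= 0.5f;
  }
};
```

In this shape, a subclass only needs annotations on the overrides reachable from the entry points; everything on the setup path stays as it was.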

I hope to be able to push my branch soon, but perhaps just browsing the source will provide a flavor. This project’s statistics (on my branch) are:
36 files
9,982 lines of code
125 uses of the nolock attribute

So roughly 1% of the lines of code were touched, which doesn’t seem onerous in code whose primary function is to process audio in realtime. A typical subclass of AUBase, depending on its complexity and use of utilities in separate translation units, needs to annotate maybe 2-20 methods.

dougsonos · February 8, 2024, 7:20pm

I made a branch of AudioUnitSDK which illustrates adoption of the clang::nolock attribute. The attribute is attached to the realtime entry points and everything they call that isn’t inferably safe. The diagnostics are disabled across some unsafe constructs (mostly throwing and catching exceptions), with TODO comments. It builds with zero warnings.

This took about an hour of work. It identifies ways to improve the realtime-safety of the project. The new warnings discourage the introduction of new unsafe constructs.

rjmccall · February 12, 2024, 10:24pm

Thanks, Doug. Can you share how the maintainers feel about this, if you know? Like, are they looking forward to getting this additional static checking, and they think the annotations are totally acceptable?

Do we think the feature’s at a reasonable limit for inference, or is this something where we could reasonably do more over time to reduce the annotation burden?

dougsonos · February 12, 2024, 11:13pm

For full disclosure here, I am one of the maintainers of AudioUnitSDK, and while it’s an open-source project, it plays an important role in my team’s non-open-source work.

Yes, we are very keen to get the additional static checking. As the RFC mentions, adoption work with the prototype has exposed many long-standing issues – and it is very often revealing issues in newly-written code, ranging from rookie mistakes to inadvertent slips by the most experienced of us.

Yes, for us the annotation burden is quite modest and acceptable, especially in new code. The diagnostics tell you what to do. Fixits would make it even simpler. In one situation, a Python script using regexes was a useful way to automate the process a bit.

The only ways I could think of making inference more powerful would require much larger changes than contemplated by this feature, e.g. how to make attributes survive decomposition of a function type into std::function; how to diagnose a program as a whole rather than as a collection of isolated TUs.

Thanks, John.

rjmccall · February 12, 2024, 11:26pm

That’s great context, thanks.

@erichkeane, @AaronBallman, has this helped address your concerns?

pinskia · February 13, 2024, 2:23am

May I suggest to delete this part:

The GNU __attribute__((nolock)) syntax is also supported.

Or add a clang_ prefix to it.
Unless you are suggesting adding it to GCC too. This way there are no conflicts if GCC adds a nolock that means something (even slightly) different. This happened recently with the assume attribute: GCC’s GNU-style assume is almost the same as C23/C++23’s assume attribute and takes an expression, unlike Clang’s assume attribute, which takes a string.

erichkeane · February 13, 2024, 2:25pm

Helps a bit, but doesn’t entirely assuage unfortunately. It still sounds like it is going to be an imperfect diagnostic, which we are doing our best to shy away from in the CFE. I find myself wondering if this would be better served being in ‘tidy’ or something.

The more I think of it, the more heartache I have about the argument as well: It requires more storage on the type (an Expr pointer for each), and I don’t see much value added to it.

Additionally, I still haven’t seen why this isn’t just re-implementing a lot of the TSA work, I’m afraid we’re inventing something ‘new’ rather than helping the TSA accomplish a task that it effectively was designed to do.

AaronBallman · February 13, 2024, 2:36pm

CC @aaronpuchert for more opinions here, but my thinking was that anything that can be expressed as a negative capability should be something that can be expressed differently as a positive capability. e.g., locking functions can call realtime functions but not vice-versa, so you have a “realtime” capability that is acquired on the entrypoint to the call stack that needs to happen in realtime, and released on exit from that call stack. Functions marked as requiring the “realtime” capability can then be called on that call stack, but any function without the realtime requirement cannot be called.
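A rough sketch of the positive-capability idea, using the existing thread safety attributes. The names RealtimeRole, g_realtime, apply_gain, and render_one are hypothetical, and the TSA macro expands to nothing on non-Clang compilers. Note what this does and does not give you as far as I understand the current analysis: with -Wthread-safety, calling apply_gain without holding the role is diagnosed, but nothing here stops render_one from calling an unannotated function, which is the part that would need new work.

```cpp
#if defined(__clang__)
#  define TSA(x) __attribute__((x))
#else
#  define TSA(x)  // no-op elsewhere
#endif

// A "role" capability that represents being on the realtime call stack.
struct TSA(capability("role")) RealtimeRole {
  void enter() TSA(acquire_capability());
  void leave() TSA(release_capability());
};

// The role is purely a compile-time marker; nothing happens at run time.
inline void RealtimeRole::enter() {}
inline void RealtimeRole::leave() {}

RealtimeRole g_realtime;

// May only be called while the realtime role is held.
float apply_gain(float x) TSA(requires_capability(g_realtime));
float apply_gain(float x) { return x * 0.5f; }

// Entry point: acquires the role on entry and releases it on exit.
float render_one(float x) {
  g_realtime.enter();
  float y = apply_gain(x);
  g_realtime.leave();
  return y;
}
```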

Ah, this is true, everything here is attached to the declaration and not to the type. I’ve wondered in the past whether the capability analysis functionality should be a property of a type instead, but I think that would be a pretty significant change and might be a show-stopper for your needs.

AaronBallman · February 13, 2024, 2:43pm

Somewhat, yes!

When looking at capability analysis, we ran into this same sort of thing. Inference can save you from adding a lot of annotations in source. The downside is that inference can also be a bit mysterious because of its nature (AST dumping can help you see where things have been inferred but that’s a pretty big usability cliff; then again, folks working in realtime spaces may not find it as big of a cliff as others).

FWIW, I had two folks show up to my office hours yesterday to discuss the idea of a sanitizer-based approach to solving this exact problem. They were unaware of this RFC when I mentioned it and so I’d like to give them some time to consider it and respond before we sign off on this RFC. There may be room for both an analysis-based approach and a sanitizer-based approach, but it may also be that we only need one approach – it really depends on the tradeoffs. A sanitizer will have more false negatives while an analysis pass may have more false positives, and it’s not clear to me which is better or whether we need both.
