RFC: Enforcing Bounds Safety in C (-fbounds-safety)

Thanks @nickdesaulniers for your valuable feedback!

Yes, all the annotations we discussed in the context of function parameters are also available in structures. Typically, even incremental adoption tends to involve more annotations for function parameters because functions vastly outnumber structures in most programs. That said, it is generally required to annotate both.
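As a rough sketch of what annotating a structure field could look like: the fallback `#define` exists only so the snippet also builds without -fbounds-safety, and `struct int_buffer` and `int_buffer_sum` are hypothetical names, not something from the proposal.

```c
#include <stddef.h>

/* Fallback so this sketch also builds without -fbounds-safety.
 * Assumption: a bounds-safety toolchain supplies the real macro. */
#ifndef __counted_by
#define __counted_by(n)
#endif

/* Hypothetical structure: data points to len ints. */
struct int_buffer {
    size_t len;
    int *__counted_by(len) data;
};

/* Sums the buffer. With -fbounds-safety enabled, each data[i] access
 * would be checked against len. */
long int_buffer_sum(const struct int_buffer *buf) {
    long total = 0;
    for (size_t i = 0; i < buf->len; ++i)
        total += buf->data[i];
    return total;
}
```

The count field is declared before the pointer here only to keep the sketch independent of how late parsing ends up being handled.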

That only refers to __counted_by. One of the purposes of __sized_by is to serve that role for void *. Additionally, void *__single casts to any other __single pointer.
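To make the void * case concrete, here is a minimal sketch that assumes __sized_by takes a byte count. The fallback macro and the helper `copy_bytes` are hypothetical and only illustrate where the annotation would go.

```c
#include <stddef.h>
#include <string.h>

/* Fallback so this sketch also builds without -fbounds-safety.
 * Assumption: a bounds-safety toolchain supplies the real macro. */
#ifndef __sized_by
#define __sized_by(n)
#endif

/* Hypothetical helper: dst has dst_size bytes, src has src_size bytes.
 * Copies as many bytes as fit and returns the number copied. */
size_t copy_bytes(void *__sized_by(dst_size) dst, size_t dst_size,
                  const void *__sized_by(src_size) src, size_t src_size) {
    size_t n = src_size < dst_size ? src_size : dst_size;
    memcpy(dst, src, n);
    return n;
}
```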

There is a bounds check (with a trap on the failure path) at all locations assigning a new value to a __counted_by, __sized_by or __ended_by pair. Given an example int foo[10], the start (Start) of an __ended_by pair can be any value inclusively between &foo[0] and &foo[10] (one past the end), and the end can be any value inclusively between Start and &foo[10] (also one past the end). Of course, if both are the same (such as {&foo[10], &foo[10]}), no elements can be dereferenced.
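The range rule for an __ended_by pair can be modeled in plain C. This is only an illustration of the invariant described above, not the check the compiler actually emits, and `ended_by_pair_is_valid` is a hypothetical name.

```c
#include <stdbool.h>
#include <stddef.h>

/* Models the invariant base <= start <= end <= base + count.
 * All pointers must point into (or one past the end of) the same array,
 * otherwise the comparisons are not meaningful in C. */
bool ended_by_pair_is_valid(const int *base, size_t count,
                            const int *start, const int *end) {
    const int *limit = base + count; /* one past the end */
    return base <= start && start <= end && end <= limit;
}
```

With int foo[10], both {&foo[0], &foo[10]} and {&foo[10], &foo[10]} satisfy the model, while a pair whose end precedes its start does not.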

-fbounds-safety essentially inserts run-time checks and traps to maintain the invariant, while we strive to include as many compile-time diagnostics as possible to identify potential run-time traps early on. So whether a run-time trap will be reported at compile time depends on the codebase and the implementation. If the compiler knows that an operation will always lead to a trap at run time, it reports an error. An example is assigning a __single pointer to a __counted_by pointer while setting its count variable to a constant greater than 1. The compiler can report an error in this case because it knows this will result in a run-time trap. Moreover, as we become aware of certain patterns likely to cause a run-time trap, we incrementally add compiler warnings for them.
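A sketch of the __single-to-__counted_by case follows. The fallback macros, `struct int_view`, and `set_view_to_single` are hypothetical, and the rejected variant is left in a comment so the snippet still compiles.

```c
/* Fallbacks so this sketch also builds without -fbounds-safety.
 * Assumption: a bounds-safety toolchain supplies the real macros. */
#ifndef __counted_by
#define __counted_by(n)
#endif
#ifndef __single
#define __single
#endif

/* Hypothetical structure with a counted pointer. */
struct int_view {
    int count;
    int *__counted_by(count) ptr;
};

void set_view_to_single(struct int_view *v, int *__single one) {
    /* A __single pointer addresses exactly one element, so a count of 1
     * is consistent with it. */
    v->count = 1;
    v->ptr = one;

    /* By contrast, the pattern described above would always trap at run
     * time, so the compiler can report it as an error at compile time:
     *
     *     v->count = 2;
     *     v->ptr = one;
     */
}
```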

Generally, yes, but it really depends on whether the optimizer can identify these checks as redundant, since -fbounds-safety relies on the LLVM optimizer to remove redundant checks. For instance, in the example below, the compiler may or may not remove the redundant null check, depending, for example, on whether it’s beneficial to keep the result of the null check in a register while performing other operations (assuming these don’t update the local variable p). The same applies to null checks inserted by -fbounds-safety.

void foo(int *p) {
    if (!p) {
        // do something
    } 
    // ...
    // some other operations
    // ...
    if (!p) {
        // do something
    }  
}

-fno-delete-null-pointer-checks doesn’t interfere with the optimizer removing redundant checks. This flag is all about treating null pointer dereferences as non-UB. As such, it prevents the compiler from removing null checks solely because they occur after the pointer dereference. However, the compiler can still eliminate redundant checks.
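For reference, the pattern that -fno-delete-null-pointer-checks is about looks roughly like this (`read_then_check` is a hypothetical name):

```c
/* The dereference comes before the null check. Without
 * -fno-delete-null-pointer-checks, the compiler may assume p is non-null
 * after *p and drop the check; with the flag, the check is not removed
 * for that reason alone. */
int read_then_check(int *p) {
    int value = *p; /* dereference first */
    if (!p)         /* null check after the dereference */
        return -1;
    return value;
}
```

This is separate from the duplicated-check case earlier in this reply, where the second check is redundant regardless of any dereference and can still be eliminated.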

Another possible point of interaction could be the behavior related to the nonnull attribute because -fno-delete-null-pointer-checks strips nonnull from function parameters. However, this doesn’t interfere with -fbounds-safety because whether -fno-delete-null-pointer-checks is enabled or not, -fbounds-safety reports an error if a pointer has both __counted_by_or_null and the nonnull attribute. This is consistent with the compiler implementation that maintains nonnull-related warnings even when -fno-delete-null-pointer-checks is enabled.

This only applies to “outermost” pointers of local variables. All nested pointers and array element types are considered ABI-visible, so they are __single by default. Programmers need to explicitly add __bidi_indexable or __indexable in order to make them wide pointers.
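A small sketch of the difference between an outermost local pointer and an array element type. The fallback macro and `sum_rows` are hypothetical; the comments describe the behavior expected under -fbounds-safety.

```c
/* Fallback so this sketch also builds without -fbounds-safety.
 * Assumption: a bounds-safety toolchain supplies the real macro. */
#ifndef __bidi_indexable
#define __bidi_indexable
#endif

int sum_rows(void) {
    int a[2] = {1, 2};
    int b[2] = {3, 4};

    /* Outermost pointer of a local variable: implicitly a wide pointer,
     * so it carries the bounds of a. */
    int *first = a;

    /* Array element types are ABI-visible and would be __single by
     * default. The explicit annotation makes each element a wide pointer,
     * so indexing rows[i][1] stays within known bounds. */
    int *__bidi_indexable rows[2] = {a, b};

    int total = first[1];
    for (int i = 0; i < 2; ++i)
        total += rows[i][0] + rows[i][1];
    return total;
}
```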

Could you clarify what you mean by “the non-ABI changes” and using __counted_by/__sized_by without -fbounds-safety? Are you referring to merely integrating some bounds checks when possible (i.e., emitting checks on a pointer dereference if it has __counted_by)? If that’s the case, while that would be out of scope of our work, I can imagine such a task could be accomplished by extending sanitizers, relevant builtins such as __builtin_dynamic_object_size, and/or compiler warnings to utilize the attributes once we provide the parsing logic.

But if you mean more comprehensive bounds safety support, it becomes quite complicated because -fbounds-safety is about the language model and the type system, which ensure that all pointers always have the correct bounds information and are secured by default. For this to function properly, the “ABI changing behaviors” (the aspect that transforms local variables into implicitly wide pointers) are critical for propagating the necessary bounds information without the need for manual bounds annotations.

Yes. In case the pointer is involved in pointer arithmetic, it needs a separate lower bound to perform the necessary bounds check, as shown in the example below:

int foo(int *__counted_by(size) buf, int size, int offs) {
    int *local = buf; // implicitly bidi_indexable
    local += offs;
    return *local; // checks both the lower and upper bounds
}
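To show why a lower bound is needed once arithmetic is involved, here is a conceptual model of a wide pointer in plain C. The layout and the names `wide_int_ptr`, `wide_from_counted`, `wide_add`, and `wide_load` are hypothetical and not the compiler's actual representation. The position is kept as an offset so the model itself never forms an out-of-range pointer.

```c
#include <stdbool.h>
#include <stddef.h>

/* Conceptual wide pointer: a lower bound, an element count that gives the
 * upper bound, and a current position that may drift out of range. */
struct wide_int_ptr {
    int *base;       /* lower bound */
    ptrdiff_t count; /* upper bound is base + count */
    ptrdiff_t pos;   /* current position relative to base */
};

struct wide_int_ptr wide_from_counted(int *buf, int size) {
    struct wide_int_ptr p = { buf, size, 0 };
    return p;
}

/* Pointer arithmetic only moves the position; the bounds are unchanged. */
struct wide_int_ptr wide_add(struct wide_int_ptr p, ptrdiff_t offs) {
    p.pos += offs;
    return p;
}

/* Checked load: returns false where a bounds-safe build would trap. */
bool wide_load(struct wide_int_ptr p, int *out) {
    if (p.pos < 0 || p.pos >= p.count)
        return false; /* below the lower bound or at/after the upper bound */
    *out = p.base[p.pos];
    return true;
}
```

A negative offs is only caught because the model remembers where the buffer starts; an upper bound alone would not detect it.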

Thanks for bringing this up! We’d love to learn more about the ways to improve the portability.

Thanks @jrtc27! Sounds like using __has_feature wouldn’t cause the portability issue with GCC anymore?

We are planning to release the preview in Apple’s fork of llvm-project in the near future.

Oh interesting. Is there a reason why your memcpy and string.h friends have the overloadable attribute?

To clarify the phrase “not participating in function overloading”, I meant that functions cannot be overloaded based solely on differences in __counted_by and similar attributes, just as they can’t be overloaded based on any other type attributes. So you still only need to provide a single definition for these functions.

Yes, this should work with -ftrap-function=. We have been using it to log the traps and continue the execution. Do you have any other use cases in mind?
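A minimal sketch of a logging handler, assuming the handler is a function taking no arguments and is selected with something like -ftrap-function=log_bounds_trap. The name `log_bounds_trap` and the counter are hypothetical.

```c
#include <stdio.h>

/* Number of traps observed so far (hypothetical bookkeeping). */
static int trap_count;

/* Hypothetical handler: logs the trap and returns, so that execution
 * continues after the failed check instead of aborting. */
void log_bounds_trap(void) {
    ++trap_count;
    fprintf(stderr, "bounds trap #%d\n", trap_count);
}
```

Whether it is acceptable to continue after a failed check is a policy decision for the project; the sketch only shows the logging side.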

Right. We only briefly mention that __counted_by can be used to annotate incomplete arrays and that such arrays decay into a pointer with the size. We will add a detailed programming model for flexible array members.

__counted_by is conceptually a type qualifier, so we’ve decided to place it inside array brackets, similar to the restrict qualifier and the static keyword, which can also appear inside array brackets. Clang already has a pathway to parse qualifiers (and attributes) inside array brackets, so allowing __counted_by there didn’t require much additional work.
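For illustration, this is roughly how the bracket placement would read on an array parameter. The fallback macro is an assumption so the snippet builds without -fbounds-safety, and `sum_array` is a hypothetical function.

```c
/* Fallback so this sketch also builds without -fbounds-safety.
 * Assumption: a bounds-safety toolchain supplies the real macro. */
#ifndef __counted_by
#define __counted_by(n)
#endif

/* The annotation sits inside the brackets, in the same position where
 * restrict or static may appear in an array parameter declarator. */
int sum_array(int n, int values[__counted_by(n)]) {
    int total = 0;
    for (int i = 0; i < n; ++i)
        total += values[i];
    return total;
}
```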

We don’t have a concrete ETA yet, but the plan is to start working on the initial patch after we hear WG14’s position on proposals that may affect the late-parsing related behavior (@AaronBallman mentioned it’s expected next week). The initial patch is going to include the parsing logic for __counted_by under another experimental flag, without -fbounds-safety, for attribute-only use cases. This is so that other relevant features in Clang can begin experimenting with the attributes. I envision that @Kees’s work on using a count attribute with flexible array members for __builtin_dynamic_object_size() could be one of the first clients. (I thought I saw @Kees’s RFC on Discourse, but I can’t find it anymore.)
