RFC: Enforcing Bounds Safety in C (-fbounds-safety)

To be clear, I was referring to a divergence in the example below where in C++, guarded_by(m) will use the member m , whereas in C, the same code will use the global m .

Mutex m;

struct __attribute__((locked)) guared_int {
    // the parser picks up the member `m` instead of the global variable
    int i __attribute__((guarded_by(m));
    Mutex m;
};

For what it’s worth, offsetof(struct foo, member) is an example where a naked identifier is used to refer to a struct member in C.

Yes, I mean this is also the case for C++. The confusion lies in the structs.

Oh, I did not know in GCC, ‘var’ already refers to a global or automatic variable in struct for C. Could you provide a code example?

Right, for this, -fbounds-safety allows restrictive operations only inside __counted_by , i.e., arithmetic operations without any side effect. Though a constant function call will be permitted. The same applies to parameters.

Thanks for sharing your insight. Do you have a sense of how difficult would it be to implement in GCC, fixing the types of n and j in the expression .n* .j*2+10 at the end of struct parsing? Or emitting an error later if n or j doesn’t exist?

Am Montag, dem 24.07.2023 um 23:30 +0000 schrieb Yeoul Na via LLVM Discussion Forums:

rapidsna
July 24

uecker:

It would not create diverging syntax, it would not mereley be not the most idiomatic
way for C++.

To be clear, I was referring to a divergence in the example below where in C++, guarded_by(m) will use the member m , whereas in C, the same code will use the global m .

Mutex m;

struct __attribute__((locked)) guared_int {
    // the parser picks up the member `m` instead of the global variable
    int i __attribute__((guarded_by(m));
    Mutex m;
};

Yes, it is very unfortunatey that somebody added this without considering
C language rules.

I do not suggest to changes it if it has existing users, but I would also not recommend to
create more problems. But this example also shows that it might be difficult to fix such
problems later… or if it is still possibly to fix for guarded_by (by adding the new syntax
and deprecating the old) we should consider this.

In any case, I think an incompatibility between “guarded_by” and “counted_by”
is more acceptable than an incompatibilty between C and C++ or an incompatibility
between VM-types in C and “counted_by” in C.

uecker:

In contrast, in C the members of a struct are in a different name space,
so it would be consistent for ‘var’ to refer to a global or automatic variable.

For what it’s worth, offsetof(struct foo, member) is an example where a naked identifier is used to refer to a struct member in C.

Yes, but this is an ancient macro and there exist no legitimate reason for the second
argument to refer to something else, so no inherent ambiguity.

uecker:

Note that it already does in other context, such as

int n;
void foo(int buf[n]);

Yes, I mean this is also the case for C++. The confusion lies in the structs.

uecker:

and on GCC the same is true for structs. So for C, this would introduce fundamentally new scoping rules that are inconsistent with the rest of the language.

Oh, I did not know in GCC, ‘var’ already refers to a global or automatic variable in struct for C. Could you provide a code example?

GCC supports the following:

{
int n;
struct foo { char (*buf)[n]; int n; } x;
}

https://godbolt.org/z/senq1KcYE
(the warning for the second example will go away with full C23 support)

Martin

Am Montag, dem 24.07.2023 um 23:55 +0000 schrieb Yeoul Na via LLVM
Discussion Forums:

uecker:
Yes, these are exactly the problems with late parsing I want to
avoid.
Currently, for C, there is no requirement to first construct an AST
which is then walked again to do type checking and constant
folding.
So often compilers (and also many other tools!) do not create an
AST. For those compilers/tools, it would require a substantial
rewrite of the FE if we started to require this. (it also
affects compile / analysis time to build and walk a tree structure
and incremental processing / analysis of source code becomes much
more difficult)
The problems could be mitigated here by either fixing the type
during
parsing somehow (as suggested above) or by limiting the expressions
to rely simply forms which do not require infrastructure for a full
AST.

Thanks for sharing your insight. Do you have a sense of how difficult
would it be to implement in GCC, fixing the types of n and j in the
expression .n* .j*2+10 at the end of struct parsing? Or emitting an
error later if n or j doesn’t exist?

If you restrict it to simply expression such as .n * X + Y it would
be doable with some work, I guess. GCC has a AST-like tree structure
which could be used to store such expressions. But I am not sure
I would have the time to do it. But it is far easier to implement
this when the types are known / fixed during parsing. In fact,
I have an experimental patch for this already for a while (but
assuming ‘int’ which is also not a full solution).

But also GCC never builds a full AST, and parsing, type checking, and
folding are interleaved. So for arbitrary expressions (which we need
to support for VM types) it would also require major changes. And
then there is not only GCC. We also have many smaller C compilers
(both open-source and proprietary) and other tools, which sometimes
already have difficulties to keep up with the new features we have
in a new C version. This is a different world than C++ where only
a few major compiler frameworks exists which have enough resources
to implement even complex changes.

BTW: There is another example in C where such a language design
mistake already causes problems in GCC: The “default” branch in
a _Generic expression does not have to be the last, so you can
have:

_Generic(…, default: …, int: …);

This does not cause severe problems as late parsing would do
because you can still do parsing, type checking, and folding,
but a less severe version of this problems also appears here:
One does not know which of the branches is the active one
during parsing, and this would be helpful when producing
diagnostics, i.e. to have certain warnings only in the
active branch. This is not possible to implement in GCC,
so we currently do have worse diagnostics than possible.

So language design mistakes do have a cost in terms of
implementation effort, result in unimplemented features
or missing diagnostics, or might even cause us to lose
smaller compilers / tools.

Martin

Well, it’s not just about guarded_by . In C++, identifiers inside a class/struct generally refer to members. So, I don’t think it’s subject to change for C++. Here is another example: Compiler Explorer.

int n = 1;
struct f {
    int m {n}; // `n` is `f::n`, not the global `n`. 
    int n;
};

But again, this is not limited to these examples; it’s a common convention in C++ to omit this-> for referring to a member.

This means if C++ supports VLA, n inside the struct would be expected to be f::n in this example:

int n;
struct f {
    int (*m)[n]; // `n` is `f::n`
    int n;
};

An even more confusing case would be when the member n comes first in the struct, as shown in the following example. If C++ supports it, n should definitely be f::n.

int n;
struct f {
    int n;
    int (*m)[n]; // `n` is `f::n`
};

Hmm, what if the member n comes before buf as shown in the following example?

{
int n;
struct foo { int n; char (*buf)[n]; } x;
}

It sounds like n would still be the variable outside the struct? If so, this seems to generate a confusion for C++ if it were to support this. I believe the behavior is actually confusing for many C programmers as well.

To avoid the confusion and divergence, it would make sense to consider using a new syntax to refer to a non-member or emitting diagnostics for name conflicts.

Am Dienstag, dem 25.07.2023 um 20:38 +0000 schrieb Yeoul Na via LLVM
Discussion Forums:

rapidsna
July 25

uecker:
Yes, it is very unfortunatey that somebody added this without
considering
C language rules.
I do not suggest to changes it if it has existing users, but I
would also not recommend to
create more problems. But this example also shows that it might be
difficult to fix such
problems later… or if it is still possibly to fix for guarded_by
(by adding the new syntax
and deprecating the old) we should consider this.
In any case, I think an incompatibility between “guarded_by” and
“counted_by”
is more acceptable than an incompatibilty between C and C++ or an
incompatibility
between VM-types in C and “counted_by” in C.

Well, it’s not just about guarded_by . In C++, identifiers inside a
class/struct generally refer to members.

I see, but then this is caused by a fundamental incompatibility
between C and C++ and not really related to the new feature at all.

Sorry, I did not realize this.

Then the only way out seems to be to use the a new syntax
if we need something for shared headers.

So, I don’t think it’s subject to change for C++. Here is another
example: Compiler Explorer.
int n = 1;
struct f {
int m {n}; // n is f::n, not the global n.
int n;
};

uecker:
GCC supports the following:
{
int n;
struct foo { char (*buf)[n]; int n; } x;
}
Compiler Explorer
(the warning for the second example will go away with full C23
support)

Hmm, what if the member n comes before buf as shown in the following
example?
{
int n;
struct foo { int n; char (*buf)[n]; } x;
}

This does not matter. In C the members of a struct exist in an
entirely different name space than regular identifiers. So
a C parser would either look up in the member namespace in
a construct like x.n or x->n or it would look it up in the
regular name space and then it refers to the variable n.

So adding ‘.’ for disambiguation would work for C
because the lookup can then happen in the member name
space. This is also what happens for designators

int n;
struct foo { int n; } = { .n = 1 };

where .n refers to the member and in

struct foo { int n; } = { n };

it would refer to the variable.

It sounds like n would still be the variable outside the struct? If
so, this seems to generate a confusion for C++ if it were to support
this. I believe the behavior is actually confusing for many C
programmers as well.
To avoid the confusion and divergence, it would make sense to
consider using a new syntax to refer to a non-member or emitting
diagnostics for name conflicts.

I think we should use a new syntax.

A warning would only work in one direction, i.e. when creating
a naming collision, but not if you accidentally refer to the
wrong one due to a typo.

Martin

For the implementation of __counted_by, I just want to clarify syntax for it with flexible array members: we need to avoid putting the attribute inside the [], as this makes it inconsistent with applying it to pointer members, function arguments, etc. GCC’s __counted_by attribute is close to landing, and takes the form:

struct foo {
    int a, b, c;
    int elements;
    struct bar data[] __counted_by(elements);
};

Besides consistency, having the attribute outside the [] means no changes are needed in other tooling (e.g. Coccinelle, static analyzers, etc) that already handle struct member attributes before the semi-colon, etc.

Also, I just want to reiterate that __builtin_dynamic_object_size() still needs to be wired up, otherwise the existing bounds-checking systems present in many libc implementations (and the Linux kernel), i.e. FORTIFY_SOURCE, will not gain any coverage from __counted_by, which rather defeats the purpose. :slight_smile: For example, the majority of the recent buffer overflows in the Linux kernel have been reached via memcpy() which can only be protected by FORTIFY_SOURCE, and not -fsanitize=array-bounds (or similar).

1 Like

In the -fbounds-safety model, __counted_by is a type attribute, not a declaration attribute. It cannot be a declaration attribute because taking the address of a __counted_by thing has to produce a pointer to a __counted_by thing. Considering the broader picture of -fbounds-safety, it would be a mistake to put __counted_by outside the array brackets.

Consistency depends on your viewpoint: if you think of __counted_by as a declaration attribute, then it makes sense to specify it like a declaration attribute. However, __counted_by must be part of the array type to fully uphold its contract. In that light, consistency requires __counted_by to be inside array brackets as this is where other array modifiers, like static, nullability attributes, or an actual count, would go. Putting it elsewhere would also create unnecessary ambiguities with multi-dimensional arrays.

Defining __counted_by(N) to nothing and reverting to plain C behavior when the attribute is not recognized is the better compatibility route for tools that do not understand it.

I also want to acknowledge FORTIFY_SOURCE–we’re not forgetting and we want to talk about it, I’m just not the right person to do it.

I’d want to use __counted_by on non-array struct members, though:

struct foo {
    ...
    unsigned short tx_size;
    unsigned short rx_size;
    char *tx __counted_by(tx_size);
    char *rx __counted_by(rx_size);
};

Nothing about it should be unique to arrays, but I do see what you mean about ambiguity for multi-dimensional arrays. Perhaps both positions can be legal?

As it turns out, our implementation does accept __counted_by after the brackets like in your previous comment, so the point might be moot.

In the general case, we consider it a non-goal to designate a single location on a line where __counted_by should go. Like with const , each level of indirection can have its own, so there have to be syntactically distinct ways to apply it. This has come up before in the thread. We primarily use __counted_by on pointers exactly like in your example, and while our adopters have usually gone with char *__counted_by(tx_size) tx , we also accept __counted_by after the identifier. The caveat is that it only works to annotate the outermost pointer. For instance, it wouldn’t work here:

void foo(int *__counted_by(*size) *q, size_t *size) {
    static int buf[10];
    *q = buf;
    *size = 10;
}

void call_foo(void) {
    size_t size;
    int *__counted_by(size) p;
    foo(&p, &size);
}

You can’t have __counted_by after q in this case because the count doesn’t apply to the outermost pointer. This is where __counted_by being a type attribute becomes significant.

Hurray! This will let me avoid quite a bit of pain for application of __counted_by to the Linux kernel. :slight_smile: Thanks for double-checking!

Got it – thanks for the example. I see how it’s like “const” in similar pointer constructions.

3 Likes

We agree that __builtin_object_size /__builtin_dynamic_object_size should be wired up with count attributes, but it should be the case with or without -fbounds-safety. We are planning to expose the parsing logic of the attribute so it can be used to implement the object_size builtins and related warnings and sanitizers. There is no precedent in our code bases to combine __counted_by with object sizes, so we’re happy to approach this conversation with a clean slate. The desired behavior may be less obvious than it sounds because the models are a little different.

The object_size builtins try their best to infer the size of objects (potentially walking through variable assignments and function inlining) and also take an argument (type ) to select which sub object to use and whether to compute the min/max when multiple different objects could be passed to the builtin.

With the __counted_by attribute the size would come directly from the type on the pointer/array argument to the builtin. This size is not necessarily the real size of the object (it could be less). Also given that the size comes directly from the pointer/array argument there isn’t really a notion of sub-object or a notion of multiple objects to choose from.

To illustrate some of these differences consider the code below:

// Without `__counted_by`, the compiler can infer that `p`
// points to `16` ints. So this holds:
//
// __builtin_object_size(p, 0) == sizeof(array)
//
//
int array[16];
int *p = array;
printf("%zu", __builtin_object_size(p, 0));

// With __counted_by(8) if __builtin_object_size
// support was implemented, `p` would be treated as
// pointing to `8` ints. So this would hold:
//
// __builtin_object_size(p, 0) == 8 
//
int array[16];
int *__counted_by(8) p = array;
printf("%zu", __builtin_object_size(p, 0));

[Sorry for the late reply.]

We did implement -fstrict-flex-arrays=3. It accepts only incomplete arrays as flexible arrays. So that’s done at least. :slight_smile:

As for the __builtin_dynamic_object_size interaction, I have a patch in review that will do this for a struct with a flexible array member. See https://reviews.llvm.org/D148381.

This is a sticky wicket for sure. A major issues comes into play when trying to specify a field in a separate nested struct:

struct S {
  struct T {
    int count;
  } t;
  struct U {
    struct V {
      double * __counted_by(count) flexible_array[];
    } v;
  } u;
} s;

Is the count in __counted_by going to find the correct count in struct T? What happens if we pass u or v into a function, and we no longer have access to count? A stupid suggestion (by me) would be to use the UNIX syntax for changing directories: __counted_by(.., .., t, count). Bask in its grossness!!

I’m (surprising to me, because I thought I’d hate it) in favor of GCC’s forward declaration semantics. It’s use could be severely restricted so that it will avoid some of the issues @AaronBallman mentioned, but I assume that GCC does that already.

If not that, then we have a few Hard Truths™ to realize with C. We are working with a very old language, which was never optimal for security features. It was developed back in a time when computer usage was time consuming and expensive, so a single-pass parser was really the only feasible way to implement a compiler. We are now way beyond those days, even though we still want compilations to go quickly.

That said, if we really want to push C towards a more (let’s be honest) Rust-like language, we have only a few options available to us:

  1. Contort the feature and language so that it still has single-pass parsing and will do everything we want (e.g. making the __counted_by et al keywords part of the type so that it’s hard to ignore), or
  2. Realize “we’re no longer in Kansas” and start violating the restrictions they faced in 1970 (the year Samara was born!).

Clang only embraces #2 when it’s an obvious win for us, and we by no means do this lightly, nor do we want the world to change just for us. But just like GNU’s inline asm is closely tied to its register allocator, and thus significantly harder / impossible for other compilers to implement all of its features, sometimes the benefits of a feature outweigh the hardships it imposes on others.

In a slightly less contentious statement, K&R C had a “feature” where not specifying the type of a variable automatically assigned it the type int. So this was legal:

int foo()
{
  a, b, c, d;
  /* some code */
  return a + b - c * d;
}

Perhaps we could do that in this specific case. If we haven’t encountered n before use, we assume it’s an int and then carry on our merry way with the single-pass parse. This does require that the n must have the type int once its encountered, of course.

To avoid situations like that, -fbounds-safety doesn’t allow referring to count in the other struct. In -fbounds-safety, the __counted_by argument can only refer to another member of the same struct. Referring to a member of another struct in __counted_by, including nested structs (or the base struct) is not allowed.

In general, we are restricting the use of other struct members as much as possible because it makes it very difficult (if not impossible) to ensure that the __counted_by pointer and the count values are synchronized, meaning the pointer always points to the memory of the size indicated by the count value.

In the following example, bp, which is a pointer to Base , is used to update n . This could potentially break the correctness of the bounds information if the dynamic type of bp is Derived (or aliased with a pointer to Derived ) but Sema won’t be able to identify this.

struct Base {
  int n;
};
struct Derived {
  struct Base header;
  // Not allowed: Counted by a member of another struct member
  int * __counted_by(header.n) ptr;
};

void foo(struct Base *bp, struct Derived *dp) {
    // Potentially unsafe and compiler doesn't know if `bp` and `dp` alias 
    bp->n++;
}

That said, we’re open to extend the model to allow referring to a member of struct from anonymous nested struct to support use cases in the Linux kernel like in the following example:

struct parent {
  int n;
  struct {
    int dummy;
    int fam[__counted_by(n)];
  };
};

For more information, here are some conversations on the restrictions that -fbounds-safety have on __counted_by arguments. RFC: Enforcing Bounds Safety in C (-fbounds-safety) - #66 by rapidsna

I could understand 2 it weren’t so dead simply to implement to implement a solution that works with forward parsing. The other argument is not just technical: Forward parsing rules also help humans to read the code more easily. This is also security relevant.

Finally, it is good language design to keep things consistent which makes it also easier teach, e.g. you do not have to explain scoping rules and then add each time “but in 2023 clang people added a special rule and therefor in this context it works differently and you have to pay special attention to it that you get it right”.

The “haven’t encountered” is problematic because it breaks as soon as some unrelated name (which could come from some distant code) comes into scope.
But for typing, I thought about this too. One could use special syntax (such as ‘.n’) and then say the .n simple has type ‘int’. This works for all ‘n’ that then actually have a type that promotes to ‘int’. But I guess you also want at least size_t and then this does not work. One could require an explicit cast in this case (size_t).n …

void f1(int *ptr, size_t __counts(ptr) N);

I don’t know whether this is the right way forward, but it’s genius.