Introduction
The Bounds Safety feature is an exciting new feature, which many programmers are eager to integrate into their code—in particular, Linux kernel developers. To this end, the new attributes it introduces need to be supported by the GCC compiler, among others.
The attributes’ syntax allows the programmer to specify either a single identifier or a simple expression. Each identifier in the attribute must be a member of the containing struct:
1 size_t len, size;
2 struct X {
3 size_t len;
4 int *buf __counted_by(len); // refers to 'len' on line 3.
5 };
6
7 struct Y {
8 size_t len;
9 size_t size;
10 void *buf __sized_by(len * size); // 'len' & 'size' on lines 8 & 9.
11 };
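To make the meaning of the two annotations concrete, the sketch below writes out by hand the bounds they describe. It is only an illustration: the attributes are stubbed out as empty macros (an assumption, so that any C compiler accepts the file), and the helpers x_get and y_bytes are hypothetical names, not part of the feature.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: stub the attributes out as no-ops so this sketch builds with
   any C compiler; a bounds-safety implementation would consume them. */
#ifndef __counted_by
#define __counted_by(member)
#endif
#ifndef __sized_by
#define __sized_by(expr)
#endif

struct X {
    size_t len;
    int *buf __counted_by(len);        /* buf holds 'len' elements */
};

struct Y {
    size_t len;
    size_t size;
    void *buf __sized_by(len * size);  /* buf spans 'len * size' bytes */
};

/* Hypothetical helper: the element-count check spelled out by hand.
   Returns 1 and stores the element on success, 0 if 'i' is out of bounds. */
static int x_get(const struct X *x, size_t i, int *out)
{
    assert(x != NULL && out != NULL);
    if (i >= x->len)
        return 0;
    *out = x->buf[i];
    return 1;
}

/* Hypothetical helper: the byte bound that __sized_by(len * size) names. */
static size_t y_bytes(const struct Y *y)
{
    assert(y != NULL);
    return y->len * y->size;
}
```

Both helpers read only members of the struct they are given, which mirrors the rule that every identifier in these attributes must be a member of the containing struct.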
The main issue is that GCC’s C compiler can’t perform general delayed parsing of expressions, because C only requires a single-pass parser. Delayed parsing is what these attributes need, since an expression may name a member that is declared later in the struct.
Another issue is that (by design) non-struct identifiers aren’t accessible within the attributes. This precludes the use of enums, constant variables, and non-const variables whose values never change.
Explanation
The issue at hand is that C has no scoping rule that applies to structures. Adding a non-codified scoping rule for a single set of features would be an overreach, and many compilers, GCC included, simply can’t add such resolution rules. Those compilers therefore can’t perform the general delayed parsing of expressions required to support this feature.
To resolve this issue, we need a way to signify to the compiler where an identifier is declared. One way is to add a forward declaration of struct fields to indicate to the compiler that said identifiers are fields within the struct. Any identifiers not found within the forward declaration list are resolved using C’s normal name lookup rules.
enum { PADDING = 42 };
struct X {
int *buf __counted_by(size_t len; len + PADDING);
size_t len;
};
Note: The type of the forward declared identifier must be the same as the type in the struct.
Even with this syntax change, it can still be ambiguous which form of the attribute is intended. For example, the expression (len) (parentheses included) would indicate to the parser that the programmer wants the expression version of the attribute rather than the single-identifier version. Other cases may require look-ahead tokens before the parser can determine which version to use (e.g. len + 0). To remove this ambiguity, we append _expr to the attribute name. For example:
enum { PADDING = 42 };
size_t size;
struct A {
int *buf __counted_by_expr(size_t len; len * size + PADDING);
size_t len;
};
const size_t len;
struct B {
int *buf __counted_by_expr(len * size + PADDING);
};
Proposal
The “single identifier” syntax is supported by both Clang and GCC (with some minor changes to lookup rules), so this proposal focuses on expression handling.
We propose appending the suffix “_expr” to attributes that take expressions and adding forward declarations of the struct identifiers used within the expression. (We’ll use the counted_by attribute as an example.)
counted_by_expr ::= 'counted_by_expr' '(' <decl_list> <expr> ')'
decl_list ::= /* empty */ | <decl> ';' <decl_list>
decl ::= <type> <identifier>
All forward declared identifiers must exist within the containing struct and must have the same type. Any identifiers that weren’t forward declared are resolved using normal C lookup rules.
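The lookup rule can be sketched at the string level. The code below is not a real parser and is not how any compiler implements the attribute. It only assumes the grammar above, where each declaration is a type followed by an identifier and ends with a semicolon. The names expr_part, is_forward_declared, and resolve are hypothetical.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

enum scope { SCOPE_MEMBER, SCOPE_ORDINARY };

/* Returns the <expr> part of the attribute argument: the text after the
   last ';', or the whole argument when the declaration list is empty. */
static const char *expr_part(const char *arg)
{
    const char *semi;
    assert(arg != NULL);
    semi = strrchr(arg, ';');
    return semi ? semi + 1 : arg;
}

static int is_ident_char(char c)
{
    return isalnum((unsigned char)c) || c == '_';
}

/* Returns 1 if 'name' is the identifier of some <decl>, i.e. the last
   identifier before one of the ';' separators in the declaration list. */
static int is_forward_declared(const char *arg, const char *name)
{
    size_t n = strlen(name);
    const char *end = expr_part(arg);   /* the decl list is [arg, end) */
    const char *p = arg;

    while (p < end) {
        const char *semi = memchr(p, ';', (size_t)(end - p));
        const char *e, *s;
        if (semi == NULL)
            break;
        e = semi;                        /* skip blanks before the ';' */
        while (e > p && isspace((unsigned char)e[-1]))
            e--;
        s = e;                           /* walk back over the identifier */
        while (s > p && is_ident_char(s[-1]))
            s--;
        if ((size_t)(e - s) == n && memcmp(s, name, n) == 0)
            return 1;
        p = semi + 1;
    }
    return 0;
}

/* Forward declared names refer to struct members; every other name falls
   back to ordinary C lookup. */
static enum scope resolve(const char *arg, const char *name)
{
    return is_forward_declared(arg, name) ? SCOPE_MEMBER : SCOPE_ORDINARY;
}
```

With the argument text size_t len; len + PADDING, resolve classifies len as a member and PADDING as an ordinary name. A real implementation would additionally check that each forward declared identifier exists in the struct with the same type, which this sketch does not attempt.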
An alternative syntax that avoids adding forward declarations is to use the “designator syntax” for struct members:
__counted_by_expr ( .len + PADDING );
This syntax is used for designated initializers in both C and C++, so it already has some support within the languages. It’s also compact and has a fairly clean syntax.
One downside is that the leading dot (.) is easy to miss, both when reading and when writing.
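For comparison, this is how the designator syntax already appears in C today, in a designated initializer. The snippet only shows the existing language feature that the alternative borrows from; the helper name make_x is hypothetical, and nothing here uses the proposed attribute.

```c
#include <assert.h>
#include <stddef.h>

struct X {
    size_t len;
    int *buf;
};

/* Hypothetical helper: builds an X with designated initializers, the
   existing C syntax in which '.len' already names a struct member. */
static struct X make_x(int *storage, size_t n)
{
    struct X x = { .len = n, .buf = storage };
    assert(storage != NULL || n == 0);
    return x;
}
```

In an initializer the dot is followed by a member name and an equals sign, so it is hard to overlook. Inside an arithmetic expression such as .len + PADDING there is less surrounding context, which is where the readability concern comes from.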
C++ Compatibility
The forward declaration syntax isn’t necessary for C++, because of C++'s scoping rules. In C++ it’s easier to be explicit by using this-> and ::. When parsing a struct written for C, the C++ parser will of course still need to verify that the correct forward declarations are made.
Example Usage
Here are some examples of how the proposed changes behave.
These two examples are equivalent:
counted_by (len)
counted_by_expr (size_t len; len)
These two examples are not equivalent:
counted_by (len) // refers to 'len' in the struct
counted_by_expr (len) // refers to 'len' outside the struct
For this example, len and scale in the attribute reference the non-struct variables.
constexpr int len = 20;
constexpr int scale = 4;
struct s {
int scale;
int len;
int *buf __counted_by_expr (len * scale);
};
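The difference between the two lookups can be written out by hand. The sketch below does not use the attribute at all; it only spells out which objects each spelling would read under the proposed rules. The function names are hypothetical, and an enum stands in for the constexpr variables so that the file also builds with compilers that predate C23.

```c
#include <assert.h>
#include <stddef.h>

/* File-scope constants with the same values as in the example above.
   Assumption: an enum replaces constexpr for pre-C23 compilers. */
enum { len = 20, scale = 4 };

struct s {
    int scale;
    int len;
    int *buf;
};

/* What __counted_by_expr(len * scale) would denote: neither name is
   forward declared, so both are found by ordinary lookup. */
static int count_ordinary(const struct s *p)
{
    (void)p;             /* the struct's own fields are not consulted */
    return len * scale;
}

/* What __counted_by_expr(int len; int scale; len * scale) would denote:
   both names are forward declared, so both refer to members. */
static int count_members(const struct s *p)
{
    assert(p != NULL);
    return p->len * p->scale;
}
```

For a struct whose fields are scale = 2 and len = 3, count_ordinary yields 80 and count_members yields 6, so the two spellings describe different bounds for the same buffer.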
The forward declaration is omitted when no struct fields are used. The following would result in an error, because the compiler would parse TY as a type and expect a forward declaration, but it wouldn’t find one.
typedef unsigned short TY;
struct A {
int TY;
int *buf __counted_by_expr(TY);
};
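The reason TY is seen as a type here is that struct members live in their own name space in C, so declaring a member called TY does not hide the typedef. The following sketch demonstrates that existing behavior in plain C, without the attribute; ty_size and a_member are hypothetical helper names.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned short TY;

struct A {
    int TY;     /* a member named TY; it does not hide the typedef */
    int *buf;
};

/* Outside a member access, TY still names the typedef. */
static size_t ty_size(void)
{
    return sizeof(TY);
}

/* After '.' or '->', TY names the member instead. */
static int a_member(const struct A *a)
{
    assert(a != NULL);
    return a->TY;
}
```

Since ordinary lookup inside the attribute finds the typedef, a parser following the grammar above would take TY as the start of a declaration rather than as an expression.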