[RFC] New attribute `annotate_type`

I would like to propose a new attribute annotate_type that would be analogous to the existing annotate attribute but for use on types. As for annotate, the typical use case would be to add annotations to be used by static analysis tools.

I have a draft patch implementing this attribute (https://reviews.llvm.org/D111548), but before it’s reviewed I wanted to solicit some feedback more broadly on whether people feel this makes sense.

Thanks,

Martin

I would like to propose a new attribute `annotate_type` that would be analogous to the existing `annotate` attribute but for use on types. As for `annotate`, the typical use case would be to add annotations to be used by static analysis tools.

I have a draft patch implementing this attribute (⚙ D111548 [Clang] Add the `annotate_type` attribute), but before it's reviewed I wanted to solicit some feedback more broadly on whether people feel this makes sense.

Thanks for the proposal! One question I have is whether we need an
additional attribute for this instead of reworking [[clang::annotate]]
so that it applies to either a type or a declaration. Is a separate
attribute necessary because there may be unfortunate problems with
using __attribute__((annotate)) as the spelling?

What kind of type semantic impacts does this attribute have on the
type identity for things like overloading, name mangling, type
conversion, etc?

Also, one problem here is -- [[clang::annotate]] isn't just used for
the static analyzer, it's also something that can be used by backend
tools (for example, when using attribute plugins). Do we need to
consider what information should be lowered to LLVM IR when this new
attribute appears on a type?

Thanks!

~Aaron

I would like to propose a new attribute annotate_type that would be analogous to the existing annotate attribute but for use on types. As for annotate, the typical use case would be to add annotations to be used by static analysis tools.

I have a draft patch implementing this attribute (https://reviews.llvm.org/D111548), but before it’s reviewed I wanted to solicit some feedback more broadly on whether people feel this makes sense.

Thanks for the proposal! One question I have is whether we need an
additional attribute for this instead of reworking [[clang::annotate]]
so that it applies to either a type or a declaration. Is a separate
attribute necessary because there may be unfortunate problems with
using attribute((annotate)) as the spelling?

Yes, that’s exactly the problem. Today, the GNU spelling can be applied both before and after the type name:

attribute((annotate(“foo”))) int i1;
int attribute((annotate(“foo”))) i2;

We would need to reinterpret the second variant to refer (or “appertain”) to the decl-specifier-seq, not to the declaration (which is, IIUC, what the C++ standard would prescribe if this was a C++ attribute). But as it’s possible that this variant is already being used “in the wild” with the intent that it should refer to the declaration, I think this isn’t something we can change.

Hence, the practical solution seems to be to introduce a separate attribute for types, for which it is then unambiguous what it should apply to.

As a side note, IIUC, Clang currently more generally lumps GNU attributes that occur in these two positions together, right? I’m looking at Parser::ParseSimpleDeclaration(), specifically this line:

DS.takeAttributesFrom(Attrs);

For C++ attributes, I believe this would not be correct, but that’s not (currently) a concern because AFAICT Clang currently doesn’t allow C++ attributes to occur after a decl-specifier-seq at all (because, I presume, Clang doesn’t yet support any C++ attributes that can occur in this position). This also means that if we want to use the proposed annotate_type attribute in this position, we have to use the GNU spelling. This is an unfortunate niggle, but teaching Clang to correctly process C++ attributes in this position seems like a non-trivial undertaking because it would require us to treat GNU spellings and C++ spellings differently in Parser::ParseSimpleDeclaration(), which seems like it would be a bit of a headache.

What kind of type semantic impacts does this attribute have on the
type identity for things like overloading, name mangling, type
conversion, etc?

Good point. The intent is that the attribute should have no impact. Anything else would be pretty problematic from a portability point of view because other compilers will simply ignore the attribute – so if it had an effect on the type system in Clang, that could cause other compilers to generate different code.

Also, one problem here is – [[clang::annotate]] isn’t just used for
the static analyzer, it’s also something that can be used by backend
tools (for example, when using attribute plugins).

It’s good that you bring this up because I’m actually also thinking about extending ParsedAttrInfo to handle type attributes (with a new handleTypeAttribute() function as an analog to handleDeclAttribute()). The proposed annotate_type attribute would then also be a natural candidate for an attribute that the new handleTypeAttribute() could add to types.

Is this the concern that you had here or were you thinking about something different?

Do we need to
consider what information should be lowered to LLVM IR when this new
attribute appears on a type?

Thanks!

My intent was that this should have no effect on code generation.

Lowering annotations on declarations to LLVM IR has some obvious use cases, but for annotations on types it’s less obvious how it would be useful to lower these to IR (and also less obvious to me how one would represent these). (Did you have anything specific in mind here?)

This could actually make another case for why we have two different attributes, because that would make the distinction clearer that we’re doing code generation for one but not the other.

Thanks,

Martin

>
> I would like to propose a new attribute `annotate_type` that would be analogous to the existing `annotate` attribute but for use on types. As for `annotate`, the typical use case would be to add annotations to be used by static analysis tools.
>
> I have a draft patch implementing this attribute (⚙ D111548 [Clang] Add the `annotate_type` attribute), but before it's reviewed I wanted to solicit some feedback more broadly on whether people feel this makes sense.

Thanks for the proposal! One question I have is whether we need an
additional attribute for this instead of reworking [[clang::annotate]]
so that it applies to either a type or a declaration. Is a separate
attribute necessary because there may be unfortunate problems with
using __attribute__((annotate)) as the spelling?

Yes, that's exactly the problem. Today, the GNU spelling can be applied both before and after the type name:

  __attribute__((annotate("foo"))) int i1;
  int __attribute__((annotate("foo"))) i2;

We would need to reinterpret the second variant to refer (or "appertain") to the decl-specifier-seq, not to the declaration (which is, IIUC, what the C++ standard would prescribe if this was a C++ attribute). But as it's possible that this variant is already being used "in the wild" with the intent that it should refer to the declaration, I think this isn't something we can change.

Agreed, thanks for the confirmation that this was the reason why we
need a second attribute.

Hence, the practical solution seems to be to introduce a separate attribute for types, for which it is then unambiguous what it should apply to.

Another possible solution would be to not support
__attribute__((annotate)) as a type attribute (e.g., you can use
[[clang::annotate]] as a declaration or a type attribute, but we would
restrict __attribute__((annotate)) to only ever be a declaration
attribute.).

As a side note, IIUC, Clang currently more generally lumps GNU attributes that occur in these two positions together, right? I'm looking at Parser::ParseSimpleDeclaration(), specifically this line:

DS.takeAttributesFrom(Attrs);

For C++ attributes, I believe this would not be correct, but that's not (currently) a concern because AFAICT Clang currently doesn't allow C++ attributes to occur after a decl-specifier-seq at all (because, I presume, Clang doesn't yet support any C++ attributes that can occur in this position).

We currently don't support any attributes that appertain to the
declaration specifiers (in any spelling mode). GNU attributes will
"slide" to the type or the declaration based on the name of the
attribute when doing semantic processing for them (see the
moveAttrFromListToList() calls in SemaType.cpp for some examples of
this).

This also means that if we want to use the proposed `annotate_type` attribute in this position, we have to use the GNU spelling.

I don't think that's correct. We certainly support type attributes
using a C++ spelling already.

This is an unfortunate niggle, but teaching Clang to correctly process C++ attributes in this position seems like a non-trivial undertaking because it would require us to treat GNU spellings and C++ spellings differently in Parser::ParseSimpleDeclaration(), which seems like it would be a bit of a headache.

I don't think we'll have to go down that route, hopefully.

What kind of type semantic impacts does this attribute have on the
type identity for things like overloading, name mangling, type
conversion, etc?

Good point. The intent is that the attribute should have no impact. Anything else would be pretty problematic from a portability point of view because other compilers will simply ignore the attribute -- so if it had an effect on the type system in Clang, that could cause other compilers to generate different code.

Vendor-specific attributes are inherently not portable or safe to
ignore. Generally speaking, if a type attribute has no type semantics,
it probably shouldn't be a type attribute. However, because this
attribute doesn't really have any semantics beyond "keep this
information associated with the type", it's a bit more complex.
Consider:

void func(int i);
void func(int [[clang::annotate("hoo boy")]] i);

I can see an argument that this is not a valid overload set because
both functions accept an integer. However:

void func(int i) [[clang::annotate("ABI=v1")]];
void func(int i) [[clang::annotate("ABI=v2")]];

I can see an argument that this is a valid overload set because
something else (the linker, whatever) is using that annotated
information to provide additional semantics beyond what the compiler
cares about.

Also, one problem here is -- [[clang::annotate]] isn't just used for
the static analyzer, it's also something that can be used by backend
tools (for example, when using attribute plugins).

It's good that you bring this up because I'm actually also thinking about extending ParsedAttrInfo to handle type attributes (with a new handleTypeAttribute() function as an analog to handleDeclAttribute()). The proposed annotate_type attribute would then also be a natural candidate for an attribute that the new handleTypeAttribute() could add to types.

Oh, neat! Unifying type/stmt/decl attributes more is definitely a great goal.

Is this the concern that you had here or were you thinking about something different?

Not really, it was more along the lines of lowering to LLVM IR (below)
so that the backend can also take advantage of this.

Do we need to
consider what information should be lowered to LLVM IR when this new
attribute appears on a type?

Thanks!

My intent was that this should have no effect on code generation.

Lowering annotations on declarations to LLVM IR has some obvious use cases, but for annotations on types it's less obvious how it would be useful to lower these to IR (and also less obvious to me how one would represent these). (Did you have anything specific in mind here?)

LLVM currently has the ability to describe some types (e.g.,
`%struct.X = type { i32 }`), so I was envisioning attaching this
attribute to the type (either as an LLVM attribute or metadata, I have
no idea what the distinction really is between them). LLVM currently
supports plugins as does Clang, and Clang's support for attributes in
plugins still strongly encourages users to use existing attributes to
surface functionality, so I can imagine users wanting to use the
annotate type attribute in the same way they already use the annotate
declaration attribute for this sort of collusion. However, I don't
think the codegen needs to happen up front -- mostly trying to
understand the shape of the design still.

~Aaron

I would like to propose a new attribute annotate_type that would be analogous to the existing annotate attribute but for use on types. As for annotate, the typical use case would be to add annotations to be used by static analysis tools.

I have a draft patch implementing this attribute (https://reviews.llvm.org/D111548), but before it’s reviewed I wanted to solicit some feedback more broadly on whether people feel this makes sense.

Thanks for the proposal! One question I have is whether we need an
additional attribute for this instead of reworking [[clang::annotate]]
so that it applies to either a type or a declaration. Is a separate
attribute necessary because there may be unfortunate problems with
using attribute((annotate)) as the spelling?

Yes, that’s exactly the problem. Today, the GNU spelling can be applied both before and after the type name:

attribute((annotate(“foo”))) int i1;
int attribute((annotate(“foo”))) i2;

We would need to reinterpret the second variant to refer (or “appertain”) to the decl-specifier-seq, not to the declaration (which is, IIUC, what the C++ standard would prescribe if this was a C++ attribute). But as it’s possible that this variant is already being used “in the wild” with the intent that it should refer to the declaration, I think this isn’t something we can change.

Agreed, thanks for the confirmation that this was the reason why we
need a second attribute.

Hence, the practical solution seems to be to introduce a separate attribute for types, for which it is then unambiguous what it should apply to.

Another possible solution would be to not support
attribute((annotate)) as a type attribute (e.g., you can use
[[clang::annotate]] as a declaration or a type attribute, but we would
restrict attribute((annotate)) to only ever be a declaration
attribute.).

I guess that would be an option too, though I think this would require some changes to Clang to allow C++ attributes to be used in this position:

int [[clang::annotate]] foo;

See also discussion below.

As a side note, IIUC, Clang currently more generally lumps GNU attributes that occur in these two positions together, right? I’m looking at Parser::ParseSimpleDeclaration(), specifically this line:

DS.takeAttributesFrom(Attrs);

For C++ attributes, I believe this would not be correct, but that’s not (currently) a concern because AFAICT Clang currently doesn’t allow C++ attributes to occur after a decl-specifier-seq at all (because, I presume, Clang doesn’t yet support any C++ attributes that can occur in this position).

We currently don’t support any attributes that appertain to the
declaration specifiers (in any spelling mode).

But one of the enumerators in TypeAttrLocation is this:

/// The attribute is in the decl-specifier-seq.
TAL_DeclSpec,

And it looks as if it is used?

GNU attributes will
“slide” to the type or the declaration based on the name of the
attribute when doing semantic processing for them (see the
moveAttrFromListToList() calls in SemaType.cpp for some examples of
this).

Thanks, that’s an interesting pointer.

I wondering now though: What do attributes attach to when the “sliding” logic doesn’t apply to them (because it doesn’t know them)?

Maybe I should clarify my use case (which will probably benefit the whole discusssion). What I’m looking to be able to do is to attach attributes to different levels of a potentially multi-level pointer. For example:

int attribute((annotate_type(“A”))) * attribute((annotate_type(“B”))) * attribute((annotate_type(“C”))) pp;

The “A” annotation is what I mean by being able to attach an attribute to the decl specifiers (maybe my terminology isn’t accurate here?).

One use case for this is being able to specify which of the levels a nullability annotation refers to and not just have it always implicitly refer to the outermost pointer, as I believe the existing nullability annotations do. (See the Clang fork here for an example that does something very similar: https://github.com/sampsyo/quala)

Staying with the nullability example, it’s important to be able to attach an annotation to the base type itself – imagine the int in the example above was instead a unique_ptr.

This does seem to be possible to some extent with Clang today because if I feed the example above to the code in my draft change, I get AttributedTypeLocs for the types int, int *, and int **, with annotations “A”, “B”, and “C” respectively.

This also means that if we want to use the proposed annotate_type attribute in this position, we have to use the GNU spelling.

I don’t think that’s correct. We certainly support type attributes
using a C++ spelling already.

I think this only works though if the attribute is applied to a declarator, but not to the decl specifiers? Here’s an excerpt from the test in my draft change (https://reviews.llvm.org/D111548):

int attribute((annotate_type(“bar”))) w;
int [[clang::annotate_type(“bar”)]] w2; // expected-error {{‘annotate_type’ attribute cannot be applied to types}}

int *attribute((annotate_type(“bar”))) x;
int *[[clang::annotate_type(“bar”)]] x2;

The error seems to be coming from the following chunk of code in Parser::ParseDeclarationSpecifiers(), which is specific to C++ attributes:

// Reject C++11 attributes that appertain to decl specifiers as
// we don’t support any C++11 attributes that appertain to decl
// specifiers. This also conforms to what g++ 4.8 is doing.
ProhibitCXX11Attributes(attrs, diag::err_attribute_not_type_attr);

This is an unfortunate niggle, but teaching Clang to correctly process C++ attributes in this position seems like a non-trivial undertaking because it would require us to treat GNU spellings and C++ spellings differently in Parser::ParseSimpleDeclaration(), which seems like it would be a bit of a headache.

I don’t think we’ll have to go down that route, hopefully.

What kind of type semantic impacts does this attribute have on the
type identity for things like overloading, name mangling, type
conversion, etc?

Good point. The intent is that the attribute should have no impact. Anything else would be pretty problematic from a portability point of view because other compilers will simply ignore the attribute – so if it had an effect on the type system in Clang, that could cause other compilers to generate different code.

Vendor-specific attributes are inherently not portable or safe to
ignore.

But for C++ attributes at least, the standard says that compilers should do exactly this (i.e. ignore unknown attributes)?

Generally speaking, if a type attribute has no type semantics,
it probably shouldn’t be a type attribute. However, because this
attribute doesn’t really have any semantics beyond “keep this
information associated with the type”, it’s a bit more complex.
Consider:

void func(int i);
void func(int [[clang::annotate(“hoo boy”)]] i);

I can see an argument that this is not a valid overload set because
both functions accept an integer.

I think nullability attributes are a good analogy here (and essentially “prior art”). Here’s an example:

void func(int* i) {}
void func(int* _Nonnull i) {}

(https://godbolt.org/z/sP6za4WEE)

Clang doesn’t consider these to be overloads and instead complains about a redefinition.

I can certainly see how for different use cases you would want an attribute to create a different type – but I would say those would then require a second, separate attribute.

However:

void func(int i) [[clang::annotate(“ABI=v1”)]];
void func(int i) [[clang::annotate(“ABI=v2”)]];

I can see an argument that this is a valid overload set because
something else (the linker, whatever) is using that annotated
information to provide additional semantics beyond what the compiler
cares about.

As a data point that my be useful, I looked at what Clang does when I move those attributes to a position where they are declaration attributes (so that I can run the current Clang on them), i.e.:

[[clang::annotate(“ABI=v1”)]] void func(int i) {}
[[clang::annotate(“ABI=v2”)]] void func(int i) {}

(https://godbolt.org/z/fo5vrso6d)

Again, Clang doesn’t consider these to be overloads and instead complains about a redefinition.

So it seems that the existing semantics of the annotate attribute are similar to what I’m proposing for the annotate_type attribute.

Also, one problem here is – [[clang::annotate]] isn’t just used for
the static analyzer, it’s also something that can be used by backend
tools (for example, when using attribute plugins).

It’s good that you bring this up because I’m actually also thinking about extending ParsedAttrInfo to handle type attributes (with a new handleTypeAttribute() function as an analog to handleDeclAttribute()). The proposed annotate_type attribute would then also be a natural candidate for an attribute that the new handleTypeAttribute() could add to types.

Oh, neat! Unifying type/stmt/decl attributes more is definitely a great goal.

Is this the concern that you had here or were you thinking about something different?

Not really, it was more along the lines of lowering to LLVM IR (below)
so that the backend can also take advantage of this.

Do we need to
consider what information should be lowered to LLVM IR when this new
attribute appears on a type?

Thanks!

My intent was that this should have no effect on code generation.

Lowering annotations on declarations to LLVM IR has some obvious use cases, but for annotations on types it’s less obvious how it would be useful to lower these to IR (and also less obvious to me how one would represent these). (Did you have anything specific in mind here?)

LLVM currently has the ability to describe some types (e.g.,
%struct.X = type { i32 }), so I was envisioning attaching this
attribute to the type (either as an LLVM attribute or metadata, I have
no idea what the distinction really is between them). LLVM currently
supports plugins as does Clang, and Clang’s support for attributes in
plugins still strongly encourages users to use existing attributes to
surface functionality, so I can imagine users wanting to use the
annotate type attribute in the same way they already use the annotate
declaration attribute for this sort of collusion. However, I don’t
think the codegen needs to happen up front – mostly trying to
understand the shape of the design still.

Makes sense.

Sorry that this has become such a long email, but I hope I’ve been able to give you a clearer picture of what I’m trying to achieve.

Cheers,

Martin

>
>
>
>>
>> >
>> > I would like to propose a new attribute `annotate_type` that would be analogous to the existing `annotate` attribute but for use on types. As for `annotate`, the typical use case would be to add annotations to be used by static analysis tools.
>> >
>> > I have a draft patch implementing this attribute (⚙ D111548 [Clang] Add the `annotate_type` attribute), but before it's reviewed I wanted to solicit some feedback more broadly on whether people feel this makes sense.
>>
>> Thanks for the proposal! One question I have is whether we need an
>> additional attribute for this instead of reworking [[clang::annotate]]
>> so that it applies to either a type or a declaration. Is a separate
>> attribute necessary because there may be unfortunate problems with
>> using __attribute__((annotate)) as the spelling?
>
>
> Yes, that's exactly the problem. Today, the GNU spelling can be applied both before and after the type name:
>
> __attribute__((annotate("foo"))) int i1;
> int __attribute__((annotate("foo"))) i2;
>
> We would need to reinterpret the second variant to refer (or "appertain") to the decl-specifier-seq, not to the declaration (which is, IIUC, what the C++ standard would prescribe if this was a C++ attribute). But as it's possible that this variant is already being used "in the wild" with the intent that it should refer to the declaration, I think this isn't something we can change.

Agreed, thanks for the confirmation that this was the reason why we
need a second attribute.

> Hence, the practical solution seems to be to introduce a separate attribute for types, for which it is then unambiguous what it should apply to.

Another possible solution would be to not support
__attribute__((annotate)) as a type attribute (e.g., you can use
[[clang::annotate]] as a declaration or a type attribute, but we would
restrict __attribute__((annotate)) to only ever be a declaration
attribute.).

I guess that would be an option too, though I think this would require some changes to Clang to allow C++ attributes to be used in this position:

int [[clang::annotate]] foo;

Clang already supports type attributes in that position today. e.g.,

The position Clang doesn't currently support is on decl specifiers.
e.g., Compiler Explorer

See also discussion below.

> As a side note, IIUC, Clang currently more generally lumps GNU attributes that occur in these two positions together, right? I'm looking at Parser::ParseSimpleDeclaration(), specifically this line:
>
> DS.takeAttributesFrom(Attrs);
>
> For C++ attributes, I believe this would not be correct, but that's not (currently) a concern because AFAICT Clang currently doesn't allow C++ attributes to occur after a decl-specifier-seq at all (because, I presume, Clang doesn't yet support any C++ attributes that can occur in this position).

We currently don't support any attributes that appertain to the
declaration specifiers (in any spelling mode).

But one of the enumerators in TypeAttrLocation is this:

  /// The attribute is in the decl-specifier-seq.
  TAL_DeclSpec,

And it looks as if it is used?

Ah, yeah, this is confusing stuff. decl-specifier-seq can have
attributes ([dcl.spec]) and
TAL_DeclSpec tracks that. However, we do not currently support
attributes on *all kinds of decl specifiers*. e.g., [[supported #1]]
static [[not supported #2]] int [[supported #3]] i [[supported #4]];

We support the ones that are in the declaration position (#1), the
type position (#2), and the identifier position (#4) but do not
support ones on storage class specifiers, function specifiers, or
other kinds of decl specifiers other than a defining-type-specifier.
e.g., Compiler Explorer

GNU attributes will
"slide" to the type or the declaration based on the name of the
attribute when doing semantic processing for them (see the
moveAttrFromListToList() calls in SemaType.cpp for some examples of
this).

Thanks, that's an interesting pointer.

I wondering now though: What do attributes attach to when the "sliding" logic doesn't apply to them (because it doesn't know them)?

Heh, now you're seeing why GNU attributes were replaced by what was
standardized in C and C++. The answer is: we don't know what to do and
this can sometimes even impact our ability to parse properly (for
example, if it's a type attribute in C, it may be the difference
between ANSI "implicit int" being allowed or not). However, in Clang,
we do not retain unknown attributes in the AST, so we parse as best we
can and then drop what we parsed onto the floor if the attribute is
unknown.

Maybe I should clarify my use case (which will probably benefit the whole discusssion). What I'm looking to be able to do is to attach attributes to different levels of a potentially multi-level pointer. For example:

int __attribute__((annotate_type("A"))) * __attribute__((annotate_type("B"))) * __attribute__((annotate_type("C"))) pp;

The "A" annotation is what I mean by being able to attach an attribute to the decl specifiers (maybe my terminology isn't accurate here?).

One use case for this is being able to specify which of the levels a nullability annotation refers to and not just have it always implicitly refer to the outermost pointer, as I believe the existing nullability annotations do. (See the Clang fork here for an example that does something very similar: GitHub - sampsyo/quala: custom type systems for Clang)

Staying with the nullability example, it's important to be able to attach an annotation to the base type itself -- imagine the `int` in the example above was instead a `unique_ptr`.

This does seem to be possible to some extent with Clang today because if I feed the example above to the code in my draft change, I get `AttributedTypeLoc`s for the types `int`, `int *`, and `int **`, with annotations "A", "B", and "C" respectively.

It is possible today, but I note that we might have a bug with the
[] implementation: Compiler Explorer -- I'd have to
dig around more to see what's going on there.

> This also means that if we want to use the proposed `annotate_type` attribute in this position, we have to use the GNU spelling.

I don't think that's correct. We certainly support type attributes
using a C++ spelling already.

I think this only works though if the attribute is applied to a declarator, but not to the decl specifiers? Here's an excerpt from the test in my draft change (⚙ D111548 [Clang] Add the `annotate_type` attribute):

  int __attribute__((annotate_type("bar"))) w;
  int [[clang::annotate_type("bar")]] w2; // expected-error {{'annotate_type' attribute cannot be applied to types}}

  int *__attribute__((annotate_type("bar"))) x;
  int *[[clang::annotate_type("bar")]] x2;

The error seems to be coming from the following chunk of code in Parser::ParseDeclarationSpecifiers(), which is specific to C++ attributes:

        // Reject C++11 attributes that appertain to decl specifiers as
        // we don't support any C++11 attributes that appertain to decl
        // specifiers. This also conforms to what g++ 4.8 is doing.
        ProhibitCXX11Attributes(attrs, diag::err_attribute_not_type_attr);

Oh hey, this is the possible bug I just noted above. :smiley: I think we
should not be rejecting if the decl specifier is a defining type
specifier because that has utility, but I'd have to go stare at the
standard a bit more to be sure. Declarator parsing is not the easiest
thing to reason about.

> This is an unfortunate niggle, but teaching Clang to correctly process C++ attributes in this position seems like a non-trivial undertaking because it would require us to treat GNU spellings and C++ spellings differently in Parser::ParseSimpleDeclaration(), which seems like it would be a bit of a headache.

I don't think we'll have to go down that route, hopefully.

>> What kind of type semantic impacts does this attribute have on the
>> type identity for things like overloading, name mangling, type
>> conversion, etc?
>
> Good point. The intent is that the attribute should have no impact. Anything else would be pretty problematic from a portability point of view because other compilers will simply ignore the attribute -- so if it had an effect on the type system in Clang, that could cause other compilers to generate different code.

Vendor-specific attributes are inherently not portable or safe to
ignore.

But for C++ attributes at least, the standard says that compilers should do exactly this (i.e. ignore unknown attributes)?

Same for C. The standard says that vendor-specific attributes are
fully implementation-defined (so they're permitted to have whatever
semantics the implementation wants, including necessary side effects).
The standard also says that if the implementation doesn't know about
an attribute, the implementation should ignore the attribute. The
combination of these rules is what makes vendor-specific attributes
inherently nonportable or safe to ignore.

As an example: Compiler Explorer

Generally speaking, if a type attribute has no type semantics,
it probably shouldn't be a type attribute. However, because this
attribute doesn't really have any semantics beyond "keep this
information associated with the type", it's a bit more complex.
Consider:

void func(int i);
void func(int [[clang::annotate("hoo boy")]] i);

I can see an argument that this is not a valid overload set because
both functions accept an integer.

I think nullability attributes are a good analogy here (and essentially "prior art"). Here's an example:

void func(int* i) {}
void func(int* _Nonnull i) {}

(Compiler Explorer)

Clang doesn't consider these to be overloads and instead complains about a redefinition.

I think modeling on the nullability attributes may not be a bad idea,
but note that those are surfaced to the user as keywords and not
attributes at all. I believe that was done specifically because the
nullability attributes had type semantics, but I could be remembering
incorrectly.

I can certainly see how for different use cases you _would_ want an attribute to create a different type -- but I would say those would then require a second, separate attribute.

However:

void func(int i) [[clang::annotate("ABI=v1")]];
void func(int i) [[clang::annotate("ABI=v2")]];

I can see an argument that this is a valid overload set because
something else (the linker, whatever) is using that annotated
information to provide additional semantics beyond what the compiler
cares about.

As a data point that my be useful, I looked at what Clang does when I move those attributes to a position where they are declaration attributes (so that I can run the current Clang on them), i.e.:

[[clang::annotate("ABI=v1")]] void func(int i) {}
[[clang::annotate("ABI=v2")]] void func(int i) {}

(Compiler Explorer)

Again, Clang doesn't consider these to be overloads and instead complains about a redefinition.

FWIW, that's because declaration attributes do not impact the type of
the function. We do have a declaration attribute that says "this
declaration is allowed to be overloaded" in C, but that's subtly a
different thing.

So it seems that the existing semantics of the `annotate` attribute are similar to what I'm proposing for the `annotate_type` attribute.

>> Also, one problem here is -- [[clang::annotate]] isn't just used for
>> the static analyzer, it's also something that can be used by backend
>> tools (for example, when using attribute plugins).
>
>
> It's good that you bring this up because I'm actually also thinking about extending ParsedAttrInfo to handle type attributes (with a new handleTypeAttribute() function as an analog to handleDeclAttribute()). The proposed annotate_type attribute would then also be a natural candidate for an attribute that the new handleTypeAttribute() could add to types.

Oh, neat! Unifying type/stmt/decl attributes more is definitely a great goal.

> Is this the concern that you had here or were you thinking about something different?

Not really, it was more along the lines of lowering to LLVM IR (below)
so that the backend can also take advantage of this.

>> Do we need to
>> consider what information should be lowered to LLVM IR when this new
>> attribute appears on a type?
>>
>> Thanks!
>
> My intent was that this should have no effect on code generation.
>
> Lowering annotations on declarations to LLVM IR has some obvious use cases, but for annotations on types it's less obvious how it would be useful to lower these to IR (and also less obvious to me how one would represent these). (Did you have anything specific in mind here?)

LLVM currently has the ability to describe some types (e.g.,
`%struct.X = type { i32 }`), so I was envisioning attaching this
attribute to the type (either as an LLVM attribute or metadata, I have
no idea what the distinction really is between them). LLVM currently
supports plugins as does Clang, and Clang's support for attributes in
plugins still strongly encourages users to use existing attributes to
surface functionality, so I can imagine users wanting to use the
annotate type attribute in the same way they already use the annotate
declaration attribute for this sort of collusion. However, I don't
think the codegen needs to happen up front -- mostly trying to
understand the shape of the design still.

Makes sense.

Sorry that this has become such a long email, but I hope I've been able to give you a clearer picture of what I'm trying to achieve.

No worries, I appreciate the detailed discussion of the design!

I think we should double-check that I'm correct about the bug with
Clang accepting:

static int [] i;

but rejecting:

int [] i;

If we're agreed that this is a bug, then it seems like once we fix
that bug, you could support [[clang::annotate]] in type positions
(with no type semantics) as well as declarations, but only if you were
willing to define __attribute__((annotate)) as applying to
declarations when there's an ambiguity between type vs decl. Would
that be a palatable approach for you?

~Aaron

Just a brief response to say: Thank you for the detailed discussion! All of this sounds very helpful. Didn’t have time to dig more deeply on Thursday or today, but I hope to do so on Monday and will reply in detail once I’ve done so.

Sorry, this is going to be a long reply again with lots of individual comments – but to front-load the conclusion (which should also give some detail for all the detailed comments):

If we can make [[clang::annotate]] work in type positions, then I would definitely find that palatable and probably even preferable! I think it may be non-trivial to implement though.

I would like to propose a new attribute annotate_type that would be analogous to the existing annotate attribute but for use on types. As for annotate, the typical use case would be to add annotations to be used by static analysis tools.

I have a draft patch implementing this attribute (https://reviews.llvm.org/D111548), but before it’s reviewed I wanted to solicit some feedback more broadly on whether people feel this makes sense.

Thanks for the proposal! One question I have is whether we need an
additional attribute for this instead of reworking [[clang::annotate]]
so that it applies to either a type or a declaration. Is a separate
attribute necessary because there may be unfortunate problems with
using attribute((annotate)) as the spelling?

Yes, that’s exactly the problem. Today, the GNU spelling can be applied both before and after the type name:

attribute((annotate(“foo”))) int i1;
int attribute((annotate(“foo”))) i2;

We would need to reinterpret the second variant to refer (or “appertain”) to the decl-specifier-seq, not to the declaration (which is, IIUC, what the C++ standard would prescribe if this was a C++ attribute). But as it’s possible that this variant is already being used “in the wild” with the intent that it should refer to the declaration, I think this isn’t something we can change.

Agreed, thanks for the confirmation that this was the reason why we
need a second attribute.

Hence, the practical solution seems to be to introduce a separate attribute for types, for which it is then unambiguous what it should apply to.

Another possible solution would be to not support
attribute((annotate)) as a type attribute (e.g., you can use
[[clang::annotate]] as a declaration or a type attribute, but we would
restrict attribute((annotate)) to only ever be a declaration
attribute.).

I guess that would be an option too, though I think this would require some changes to Clang to allow C++ attributes to be used in this position:

int [[clang::annotate]] foo;

Clang already supports type attributes in that position today. e.g.,
https://godbolt.org/z/rTxoWWfdb

Quoting the godbolt example to save having to follow the link:

int * [[clang::address_space(126)]] ip = nullptr;

This is different from my example above though because the attribute comes after the *, which makes it part of the declarator or, more specifically, the ptr-operator:

http://eel.is/c++draft/dcl.dcl#nt:ptr-operator

In contrast, with the example int [[clang::annotate]] foo that I was referring to, the attribute is part of the decl-specifier-seq (specifically, it’s an attribute on the defining-type-specifier, as you also discuss below).

The position Clang doesn’t currently support is on decl specifiers.
e.g., https://godbolt.org/z/rG1aP7Tz5

Agreed.

See also discussion below.

As a side note, IIUC, Clang currently more generally lumps GNU attributes that occur in these two positions together, right? I’m looking at Parser::ParseSimpleDeclaration(), specifically this line:

DS.takeAttributesFrom(Attrs);

For C++ attributes, I believe this would not be correct, but that’s not (currently) a concern because AFAICT Clang currently doesn’t allow C++ attributes to occur after a decl-specifier-seq at all (because, I presume, Clang doesn’t yet support any C++ attributes that can occur in this position).

We currently don’t support any attributes that appertain to the
declaration specifiers (in any spelling mode).

But one of the enumerators in TypeAttrLocation is this:

/// The attribute is in the decl-specifier-seq.
TAL_DeclSpec,

And it looks as if it is used?

Ah, yeah, this is confusing stuff. decl-specifier-seq can have
attributes (http://eel.is/c++draft/dcl.spec#nt:decl-specifier-seq) and
TAL_DeclSpec tracks that. However, we do not currently support
attributes on all kinds of decl specifiers. e.g., [[supported #1]]
static [[not supported #2]] int [[supported #3]] i [[supported #4]];

Just to make sure we’re on the same page: Not all of these attributes are part of the decl-specfier-seq – right?

#1 is the attribute-specifier-seq in a simple-declaration: http://eel.is/c++draft/dcl.dcl#nt:simple-declaration

#4 is the attribute-specifier-seq in a noptr-declarator: http://eel.is/c++draft/dcl.dcl#nt:noptr-declarator

We support the ones that are in the declaration position (#1), the
type position (#2),

I think you mean #3?

and the identifier position (#4) but do not
support ones on storage class specifiers, function specifiers, or
other kinds of decl specifiers other than a defining-type-specifier.
e.g., https://godbolt.org/z/8sq5Wo8sT

I think [] is treated differently from known attribute (and unknown attributes may be treated differently yet again?). Anyway, I think we get into this more deeply below anyway.

GNU attributes will
“slide” to the type or the declaration based on the name of the
attribute when doing semantic processing for them (see the
moveAttrFromListToList() calls in SemaType.cpp for some examples of
this).

Thanks, that’s an interesting pointer.

I wondering now though: What do attributes attach to when the “sliding” logic doesn’t apply to them (because it doesn’t know them)?

Heh, now you’re seeing why GNU attributes were replaced by what was
standardized in C and C++. The answer is: we don’t know what to do and
this can sometimes even impact our ability to parse properly (for
example, if it’s a type attribute in C, it may be the difference
between ANSI “implicit int” being allowed or not). However, in Clang,
we do not retain unknown attributes in the AST, so we parse as best we
can and then drop what we parsed onto the floor if the attribute is
unknown.

Hm. Got it.

This does seem to make a pretty strong argument for what you’re suggesting, namely that we shouldn’t support the GNU syntax of the annotate attribute for annotations on types and instead should support only the C++ syntax (where, thankfully, the standard makes it pretty clear what the attribute appertains to).

Maybe I should clarify my use case (which will probably benefit the whole discusssion). What I’m looking to be able to do is to attach attributes to different levels of a potentially multi-level pointer. For example:

int attribute((annotate_type(“A”))) * attribute((annotate_type(“B”))) * attribute((annotate_type(“C”))) pp;

The “A” annotation is what I mean by being able to attach an attribute to the decl specifiers (maybe my terminology isn’t accurate here?).

One use case for this is being able to specify which of the levels a nullability annotation refers to and not just have it always implicitly refer to the outermost pointer, as I believe the existing nullability annotations do. (See the Clang fork here for an example that does something very similar: https://github.com/sampsyo/quala)

Staying with the nullability example, it’s important to be able to attach an annotation to the base type itself – imagine the int in the example above was instead a unique_ptr.

This does seem to be possible to some extent with Clang today because if I feed the example above to the code in my draft change, I get AttributedTypeLocs for the types int, int *, and int **, with annotations “A”, “B”, and “C” respectively.

It is possible today, but I note that we might have a bug with the
[] implementation: https://godbolt.org/z/vr81KzG33 – I’d have to
dig around more to see what’s going on there.

Yes, I think this example makes the problem pretty clear, namely that C++ attributes can’t be applied to the defining-type-specifier today.

There’s an additional wrinkle here in that, IIUC, the address_space attribute only makes sense to apply to a ptr-declarator, not a defining-type-specifier. I think the code that is causing the attribute to be rejected here isn’t specific to address_space though – I think it’s just the general code we discuss below.

This also means that if we want to use the proposed annotate_type attribute in this position, we have to use the GNU spelling.

I don’t think that’s correct. We certainly support type attributes
using a C++ spelling already.

I think this only works though if the attribute is applied to a declarator, but not to the decl specifiers? Here’s an excerpt from the test in my draft change (https://reviews.llvm.org/D111548):

int attribute((annotate_type(“bar”))) w;
int [[clang::annotate_type(“bar”)]] w2; // expected-error {{‘annotate_type’ attribute cannot be applied to types}}

int *attribute((annotate_type(“bar”))) x;
int *[[clang::annotate_type(“bar”)]] x2;

The error seems to be coming from the following chunk of code in Parser::ParseDeclarationSpecifiers(), which is specific to C++ attributes:

// Reject C++11 attributes that appertain to decl specifiers as
// we don’t support any C++11 attributes that appertain to decl
// specifiers. This also conforms to what g++ 4.8 is doing.
ProhibitCXX11Attributes(attrs, diag::err_attribute_not_type_attr);

Oh hey, this is the possible bug I just noted above. :smiley: I think we
should not be rejecting if the decl specifier is a defining type
specifier because that has utility,

Agree – I’d like to add support for this in Clang.

I suspect the reason this isn’t supported today is because none of the existing type attributes need to be applied to the declaring-type-specifier. As noted above, address_space only really makes sense on ptr-declarators. And without having checked, I suspect the other attributes also refer to properties of pointers (which is where I think most of the practical reasons for attributing types comes from).

Having said that, it would be great if the nullability attributes could, for example, be applied to a unique_ptr, but it seems that this isn’t the case (https://godbolt.org/z/qEn68xMz9).

but I’d have to go stare at the
standard a bit more to be sure. Declarator parsing is not the easiest
thing to reason about.

Let me know what you conclude.

From what I’ve seen in the standard (see links that you and I posted above), I think it’s pretty clear where this is supposed to slot into the grammar. But I assume what you’re referring to is more the problem of how to make this parse unambiguously in Clang?

This is an unfortunate niggle, but teaching Clang to correctly process C++ attributes in this position seems like a non-trivial undertaking because it would require us to treat GNU spellings and C++ spellings differently in Parser::ParseSimpleDeclaration(), which seems like it would be a bit of a headache.

I don’t think we’ll have to go down that route, hopefully.

What kind of type semantic impacts does this attribute have on the
type identity for things like overloading, name mangling, type
conversion, etc?

Good point. The intent is that the attribute should have no impact. Anything else would be pretty problematic from a portability point of view because other compilers will simply ignore the attribute – so if it had an effect on the type system in Clang, that could cause other compilers to generate different code.

Vendor-specific attributes are inherently not portable or safe to
ignore.

But for C++ attributes at least, the standard says that compilers should do exactly this (i.e. ignore unknown attributes)?

Same for C. The standard says that vendor-specific attributes are
fully implementation-defined (so they’re permitted to have whatever
semantics the implementation wants, including necessary side effects).
The standard also says that if the implementation doesn’t know about
an attribute, the implementation should ignore the attribute. The
combination of these rules is what makes vendor-specific attributes
inherently nonportable or safe to ignore.

As an example: https://godbolt.org/z/vsv3Wo6En

Thanks, makes sense!

Generally speaking, if a type attribute has no type semantics,
it probably shouldn’t be a type attribute. However, because this
attribute doesn’t really have any semantics beyond “keep this
information associated with the type”, it’s a bit more complex.
Consider:

void func(int i);
void func(int [[clang::annotate(“hoo boy”)]] i);

I can see an argument that this is not a valid overload set because
both functions accept an integer.

I think nullability attributes are a good analogy here (and essentially “prior art”). Here’s an example:

void func(int* i) {}
void func(int* _Nonnull i) {}

(https://godbolt.org/z/sP6za4WEE)

Clang doesn’t consider these to be overloads and instead complains about a redefinition.

I think modeling on the nullability attributes may not be a bad idea,
but note that those are surfaced to the user as keywords and not
attributes at all. I believe that was done specifically because the
nullability attributes had type semantics, but I could be remembering
incorrectly.

Do you think you can dig out this discussion again?

I don’t think it’s really important though – for the [[clang::annotate]] case (assuming that’s what we’ll go with in preference to annotate_type), we certainly want to specify that the annotation does not create a new type.

I can certainly see how for different use cases you would want an attribute to create a different type – but I would say those would then require a second, separate attribute.

However:

void func(int i) [[clang::annotate(“ABI=v1”)]];
void func(int i) [[clang::annotate(“ABI=v2”)]];

I can see an argument that this is a valid overload set because
something else (the linker, whatever) is using that annotated
information to provide additional semantics beyond what the compiler
cares about.

As a data point that my be useful, I looked at what Clang does when I move those attributes to a position where they are declaration attributes (so that I can run the current Clang on them), i.e.:

[[clang::annotate(“ABI=v1”)]] void func(int i) {}
[[clang::annotate(“ABI=v2”)]] void func(int i) {}

(https://godbolt.org/z/fo5vrso6d)

Again, Clang doesn’t consider these to be overloads and instead complains about a redefinition.

FWIW, that’s because declaration attributes do not impact the type of
the function. We do have a declaration attribute that says “this
declaration is allowed to be overloaded” in C, but that’s subtly a
different thing.

So it seems that the existing semantics of the annotate attribute are similar to what I’m proposing for the annotate_type attribute.

Also, one problem here is – [[clang::annotate]] isn’t just used for
the static analyzer, it’s also something that can be used by backend
tools (for example, when using attribute plugins).

It’s good that you bring this up because I’m actually also thinking about extending ParsedAttrInfo to handle type attributes (with a new handleTypeAttribute() function as an analog to handleDeclAttribute()). The proposed annotate_type attribute would then also be a natural candidate for an attribute that the new handleTypeAttribute() could add to types.

Oh, neat! Unifying type/stmt/decl attributes more is definitely a great goal.

Is this the concern that you had here or were you thinking about something different?

Not really, it was more along the lines of lowering to LLVM IR (below)
so that the backend can also take advantage of this.

Do we need to
consider what information should be lowered to LLVM IR when this new
attribute appears on a type?

Thanks!

My intent was that this should have no effect on code generation.

Lowering annotations on declarations to LLVM IR has some obvious use cases, but for annotations on types it’s less obvious how it would be useful to lower these to IR (and also less obvious to me how one would represent these). (Did you have anything specific in mind here?)

LLVM currently has the ability to describe some types (e.g.,
%struct.X = type { i32 }), so I was envisioning attaching this
attribute to the type (either as an LLVM attribute or metadata, I have
no idea what the distinction really is between them). LLVM currently
supports plugins as does Clang, and Clang’s support for attributes in
plugins still strongly encourages users to use existing attributes to
surface functionality, so I can imagine users wanting to use the
annotate type attribute in the same way they already use the annotate
declaration attribute for this sort of collusion. However, I don’t
think the codegen needs to happen up front – mostly trying to
understand the shape of the design still.

Makes sense.

Sorry that this has become such a long email, but I hope I’ve been able to give you a clearer picture of what I’m trying to achieve.

No worries, I appreciate the detailed discussion of the design!

I think we should double-check that I’m correct about the bug with
Clang accepting:

static int [] i;

but rejecting:

int [] i;

I don’t think that’s the distinction:

https://godbolt.org/z/MWcojK8n4

(Noting that it accepts both of these.)

Rather, this seems to be a case of [] being treated specially (probably earlier in the parser) and therefore being accepted in this position, while named, known attributes are not accepted here.

If we’re agreed that this is a bug, then it seems like once we fix
that bug, you could support [[clang::annotate]] in type positions
(with no type semantics) as well as declarations, but only if you were
willing to define attribute((annotate)) as applying to
declarations when there’s an ambiguity between type vs decl. Would
that be a palatable approach for you?

Yes, I think that would definitely work for me, and there would be an attraction in not having to introduce an additional annotate_type attribute that otherwise works mostly the same as annotate.

I think making this work may not be as simple as lifting the restriction in Parser::ParseDeclarationSpecifiers() that we discussed above. I think there’s another issue that I touched on previously (https://lists.llvm.org/pipermail/cfe-dev/2021-October/069094.html), namely that Clang currently lumps attributes for the declaration and the decl-specifier-seq together.

This happens in Parser::ParseSimpleDeclaration(), specifically this line
<https://github.com/llvm/llvm-project/blob/main/clang/lib/Parse/ParseDecl.cpp#L1818>
:

DS.takeAttributesFrom(Attrs);

To quote the conclusion I reached at the time:

For C++ attributes, I believe this would not be correct, but that’s not
(currently) a concern because AFAICT Clang currently doesn’t allow C++
attributes to occur after a decl-specifier-seq at all (because, I presume,
Clang doesn’t yet support any C++ attributes that can occur in this
position). This also means that if we want to use the proposed
annotate_type attribute in this position, we have to use the GNU
spelling. This is an unfortunate niggle, but teaching Clang to correctly
process C++ attributes in this position seems like a non-trivial
undertaking because it would require us to treat GNU spellings and C++
spellings differently in Parser::ParseSimpleDeclaration(), which seems like
it would be a bit of a headache.

And I believe this is still true?

Do you have any other ideas on what we could do about this?

Thanks,

Martin

PS I just realized that the final point in my last email wasn’t clear.

What we would want to achieve is for Clang to treat all of the following the same way, namely as an annotation on the declaration:

attribute((annotate)) int i;
int attribute((annotate)) i;
[[clang::annotate]] int i;

While this case should be treated differently, namely as an annotation on the type:

int [[clang::annotate]] i;

This means we’d need to treat the C++ spelling of an attribute differently in this position – and we’d need a way to preserve this distinction between attributes on the declaration and the defining-type-specifier for later use. That seems fiddly… but I’d be willing to take a look and see what we would need to do to make it work.

Sorry, this is going to be a long reply again with lots of individual comments -- but to front-load the conclusion (which should also give some detail for all the detailed comments):

If we can make [[clang::annotate]] work in type positions, then I would definitely find that palatable and probably even preferable! I think it may be non-trivial to implement though.

Excellent, I think we're on the same page with our conclusions.

>
>
>
>>
>> >
>> >
>> >
>> >>
>> >> >
>> >> > I would like to propose a new attribute `annotate_type` that would be analogous to the existing `annotate` attribute but for use on types. As for `annotate`, the typical use case would be to add annotations to be used by static analysis tools.
>> >> >
>> >> > I have a draft patch implementing this attribute (⚙ D111548 [Clang] Add the `annotate_type` attribute), but before it's reviewed I wanted to solicit some feedback more broadly on whether people feel this makes sense.
>> >>
>> >> Thanks for the proposal! One question I have is whether we need an
>> >> additional attribute for this instead of reworking [[clang::annotate]]
>> >> so that it applies to either a type or a declaration. Is a separate
>> >> attribute necessary because there may be unfortunate problems with
>> >> using __attribute__((annotate)) as the spelling?
>> >
>> >
>> > Yes, that's exactly the problem. Today, the GNU spelling can be applied both before and after the type name:
>> >
>> > __attribute__((annotate("foo"))) int i1;
>> > int __attribute__((annotate("foo"))) i2;
>> >
>> > We would need to reinterpret the second variant to refer (or "appertain") to the decl-specifier-seq, not to the declaration (which is, IIUC, what the C++ standard would prescribe if this was a C++ attribute). But as it's possible that this variant is already being used "in the wild" with the intent that it should refer to the declaration, I think this isn't something we can change.
>>
>> Agreed, thanks for the confirmation that this was the reason why we
>> need a second attribute.
>>
>> > Hence, the practical solution seems to be to introduce a separate attribute for types, for which it is then unambiguous what it should apply to.
>>
>> Another possible solution would be to not support
>> __attribute__((annotate)) as a type attribute (e.g., you can use
>> [[clang::annotate]] as a declaration or a type attribute, but we would
>> restrict __attribute__((annotate)) to only ever be a declaration
>> attribute.).
>
> I guess that would be an option too, though I think this would require some changes to Clang to allow C++ attributes to be used in this position:
>
> int [[clang::annotate]] foo;

Clang already supports type attributes in that position today. e.g.,
Compiler Explorer

Quoting the godbolt example to save having to follow the link:

int * [[clang::address_space(126)]] ip = nullptr;

This is different from my example above though because the attribute comes after the `*`, which makes it part of the declarator or, more specifically, the ptr-operator:

[dcl.dcl]

Agreed.

In contrast, with the example `int [[clang::annotate]] foo` that I was referring to, the attribute is part of the decl-specifier-seq (specifically, it's an attribute on the defining-type-specifier, as you also discuss below).

Agreed; it's a bug that Clang currently rejects this. That is a valid
location for a type attribute to do, and we have type attributes. I
think nobody has run into this because our type attributes haven't
applied (as often) to this particular type location. They've been
pointer attributes like address_space and _Nullable or function type
attributes like calling conventions, etc.

The position Clang doesn't currently support is on decl specifiers.
e.g., Compiler Explorer

Agreed.

> See also discussion below.
>
>> > As a side note, IIUC, Clang currently more generally lumps GNU attributes that occur in these two positions together, right? I'm looking at Parser::ParseSimpleDeclaration(), specifically this line:
>> >
>> > DS.takeAttributesFrom(Attrs);
>> >
>> > For C++ attributes, I believe this would not be correct, but that's not (currently) a concern because AFAICT Clang currently doesn't allow C++ attributes to occur after a decl-specifier-seq at all (because, I presume, Clang doesn't yet support any C++ attributes that can occur in this position).
>>
>> We currently don't support any attributes that appertain to the
>> declaration specifiers (in any spelling mode).
>
>
> But one of the enumerators in TypeAttrLocation is this:
>
> /// The attribute is in the decl-specifier-seq.
> TAL_DeclSpec,
>
> And it looks as if it is used?

Ah, yeah, this is confusing stuff. decl-specifier-seq can have
attributes ([dcl.spec]) and
TAL_DeclSpec tracks that. However, we do not currently support
attributes on *all kinds of decl specifiers*. e.g., [[supported #1]]
static [[not supported #2]] int [[supported #3]] i [[supported #4]];

Just to make sure we're on the same page: Not all of these attributes are part of the decl-specfier-seq -- right?

Correct.

#1 is the attribute-specifier-seq in a simple-declaration: [dcl.dcl]

Correct.

#4 is the attribute-specifier-seq in a noptr-declarator: [dcl.dcl]

Correct.

We support the ones that are in the declaration position (#1), the
type position (#2),

I think you mean #3?

Yup, sorry for that confusion! Also I meant to say we *intend* to
support #3 but we don't currently because of bugs. We support #3 more
generally in that we support an attribute after a type.

and the identifier position (#4) but do not
support ones on storage class specifiers, function specifiers, or
other kinds of decl specifiers other than a defining-type-specifier.
e.g., Compiler Explorer

I think [] is treated differently from known attribute (and unknown attributes may be treated differently yet again?). Anyway, I think we get into this more deeply below anyway.

[] is handled in two stages. There's the parsing stage (where can
[] appear *at all*) and the semantic stage (does the attribute
within the [] appertain to the entity it was specified on, is it
supported on this target, etc). Parsing will do special work for known
attributes to parse the expected arguments, etc; for unknown
attributes, parsing eats balanced tokens until the end of the
attribute argument list. The semantic stage is what diagnoses unknown
or ignored attributes.

>> GNU attributes will
>> "slide" to the type or the declaration based on the name of the
>> attribute when doing semantic processing for them (see the
>> moveAttrFromListToList() calls in SemaType.cpp for some examples of
>> this).
>
>
> Thanks, that's an interesting pointer.
>
> I wondering now though: What do attributes attach to when the "sliding" logic doesn't apply to them (because it doesn't know them)?

Heh, now you're seeing why GNU attributes were replaced by what was
standardized in C and C++. The answer is: we don't know what to do and
this can sometimes even impact our ability to parse properly (for
example, if it's a type attribute in C, it may be the difference
between ANSI "implicit int" being allowed or not). However, in Clang,
we do not retain unknown attributes in the AST, so we parse as best we
can and then drop what we parsed onto the floor if the attribute is
unknown.

Hm. Got it.

This does seem to make a pretty strong argument for what you're suggesting, namely that we shouldn't support the GNU syntax of the `annotate` attribute for annotations on types and instead should support only the C++ syntax (where, thankfully, the standard makes it pretty clear what the attribute appertains to).

> Maybe I should clarify my use case (which will probably benefit the whole discusssion). What I'm looking to be able to do is to attach attributes to different levels of a potentially multi-level pointer. For example:
>
> int __attribute__((annotate_type("A"))) * __attribute__((annotate_type("B"))) * __attribute__((annotate_type("C"))) pp;
>
> The "A" annotation is what I mean by being able to attach an attribute to the decl specifiers (maybe my terminology isn't accurate here?).
>
> One use case for this is being able to specify which of the levels a nullability annotation refers to and not just have it always implicitly refer to the outermost pointer, as I believe the existing nullability annotations do. (See the Clang fork here for an example that does something very similar: GitHub - sampsyo/quala: custom type systems for Clang)
>
> Staying with the nullability example, it's important to be able to attach an annotation to the base type itself -- imagine the `int` in the example above was instead a `unique_ptr`.
>
> This does seem to be possible to some extent with Clang today because if I feed the example above to the code in my draft change, I get `AttributedTypeLoc`s for the types `int`, `int *`, and `int **`, with annotations "A", "B", and "C" respectively.

It is possible today, but I note that we might have a bug with the
[] implementation: Compiler Explorer -- I'd have to
dig around more to see what's going on there.

Yes, I think this example makes the problem pretty clear, namely that C++ attributes can't be applied to the defining-type-specifier today.

Yup! They are allowed by the standard, but Clang has an implementation bug here.

There's an additional wrinkle here in that, IIUC, the `address_space` attribute only makes sense to apply to a ptr-declarator, not a defining-type-specifier. I think the code that is causing the attribute to be rejected here isn't specific to `address_space` though -- I think it's just the general code we discuss below.

Agreed.

>> > This also means that if we want to use the proposed `annotate_type` attribute in this position, we have to use the GNU spelling.
>>
>> I don't think that's correct. We certainly support type attributes
>> using a C++ spelling already.
>
> I think this only works though if the attribute is applied to a declarator, but not to the decl specifiers? Here's an excerpt from the test in my draft change (⚙ D111548 [Clang] Add the `annotate_type` attribute):
>
> int __attribute__((annotate_type("bar"))) w;
> int [[clang::annotate_type("bar")]] w2; // expected-error {{'annotate_type' attribute cannot be applied to types}}
>
> int *__attribute__((annotate_type("bar"))) x;
> int *[[clang::annotate_type("bar")]] x2;
>
> The error seems to be coming from the following chunk of code in Parser::ParseDeclarationSpecifiers(), which is specific to C++ attributes:
>
> // Reject C++11 attributes that appertain to decl specifiers as
> // we don't support any C++11 attributes that appertain to decl
> // specifiers. This also conforms to what g++ 4.8 is doing.
> ProhibitCXX11Attributes(attrs, diag::err_attribute_not_type_attr);

Oh hey, this is the possible bug I just noted above. :smiley: I think we
should not be rejecting if the decl specifier is a defining type
specifier because that has utility,

Agree -- I'd like to add support for this in Clang.

Fantastic!

I suspect the reason this isn't supported today is because none of the existing type attributes need to be applied to the declaring-type-specifier. As noted above, `address_space` only really makes sense on ptr-declarators. And without having checked, I suspect the other attributes also refer to properties of pointers (which is where I think most of the practical reasons for attributing types comes from).

Yup, agreed.

Having said that, it would be great if the nullability attributes could, for example, be applied to a unique_ptr, but it seems that this isn't the case (Compiler Explorer).

but I'd have to go stare at the
standard a bit more to be sure. Declarator parsing is not the easiest
thing to reason about.

Let me know what you conclude.

From what I've seen in the standard (see links that you and I posted above), I think it's pretty clear where this is supposed to slot into the grammar. But I assume what you're referring to is more the problem of how to make this parse unambiguously in Clang?

Correct. I think we're both convinced that the standard allows
specifying a type attribute in this position. I've not looked into how
hard it is to correct that in our implementation -- declarator parsing
is super dense code.

>> > This is an unfortunate niggle, but teaching Clang to correctly process C++ attributes in this position seems like a non-trivial undertaking because it would require us to treat GNU spellings and C++ spellings differently in Parser::ParseSimpleDeclaration(), which seems like it would be a bit of a headache.
>>
>> I don't think we'll have to go down that route, hopefully.
>>
>> >> What kind of type semantic impacts does this attribute have on the
>> >> type identity for things like overloading, name mangling, type
>> >> conversion, etc?
>> >
>> > Good point. The intent is that the attribute should have no impact. Anything else would be pretty problematic from a portability point of view because other compilers will simply ignore the attribute -- so if it had an effect on the type system in Clang, that could cause other compilers to generate different code.
>>
>> Vendor-specific attributes are inherently not portable or safe to
>> ignore.
>
>
> But for C++ attributes at least, the standard says that compilers should do exactly this (i.e. ignore unknown attributes)?

Same for C. The standard says that vendor-specific attributes are
fully implementation-defined (so they're permitted to have whatever
semantics the implementation wants, including necessary side effects).
The standard also says that if the implementation doesn't know about
an attribute, the implementation should ignore the attribute. The
combination of these rules is what makes vendor-specific attributes
inherently nonportable or safe to ignore.

As an example: Compiler Explorer

Thanks, makes sense!

>> Generally speaking, if a type attribute has no type semantics,
>> it probably shouldn't be a type attribute. However, because this
>> attribute doesn't really have any semantics beyond "keep this
>> information associated with the type", it's a bit more complex.
>> Consider:
>>
>> void func(int i);
>> void func(int [[clang::annotate("hoo boy")]] i);
>>
>> I can see an argument that this is not a valid overload set because
>> both functions accept an integer.
>
>
> I think nullability attributes are a good analogy here (and essentially "prior art"). Here's an example:
>
> void func(int* i) {}
> void func(int* _Nonnull i) {}
>
> (Compiler Explorer)
>
> Clang doesn't consider these to be overloads and instead complains about a redefinition.

I think modeling on the nullability attributes may not be a bad idea,
but note that those are surfaced to the user as keywords and not
attributes at all. I believe that was done specifically because the
nullability attributes had type semantics, but I could be remembering
incorrectly.

Do you think you can dig out this discussion again?

Sure!

https://lists.llvm.org/pipermail/cfe-dev/2015-March/041779.html

I don't think it's really important though -- for the [[clang::annotate]] case (assuming that's what we'll go with in preference to `annotate_type`), we certainly want to specify that the annotation does not create a new type.

It's a bit weird to have a type attribute that has no type semantics,
but I think that design makes sense here for what you are trying to
accomplish. I think the best solution here is for Clang to have a
pluggable type system (https://bracha.org/pluggableTypesPosition.pdf),
but that's a huge, experimental request that's orthogonal to your RFC.

> I can certainly see how for different use cases you _would_ want an attribute to create a different type -- but I would say those would then require a second, separate attribute.
>
>>
>> However:
>>
>> void func(int i) [[clang::annotate("ABI=v1")]];
>> void func(int i) [[clang::annotate("ABI=v2")]];
>>
>> I can see an argument that this is a valid overload set because
>> something else (the linker, whatever) is using that annotated
>> information to provide additional semantics beyond what the compiler
>> cares about.
>
>
> As a data point that my be useful, I looked at what Clang does when I move those attributes to a position where they are declaration attributes (so that I can run the current Clang on them), i.e.:
>
> [[clang::annotate("ABI=v1")]] void func(int i) {}
> [[clang::annotate("ABI=v2")]] void func(int i) {}
>
> (Compiler Explorer)
>
> Again, Clang doesn't consider these to be overloads and instead complains about a redefinition.

FWIW, that's because declaration attributes do not impact the type of
the function. We do have a declaration attribute that says "this
declaration is allowed to be overloaded" in C, but that's subtly a
different thing.

> So it seems that the existing semantics of the `annotate` attribute are similar to what I'm proposing for the `annotate_type` attribute.
>
>> >> Also, one problem here is -- [[clang::annotate]] isn't just used for
>> >> the static analyzer, it's also something that can be used by backend
>> >> tools (for example, when using attribute plugins).
>> >
>> >
>> > It's good that you bring this up because I'm actually also thinking about extending ParsedAttrInfo to handle type attributes (with a new handleTypeAttribute() function as an analog to handleDeclAttribute()). The proposed annotate_type attribute would then also be a natural candidate for an attribute that the new handleTypeAttribute() could add to types.
>>
>> Oh, neat! Unifying type/stmt/decl attributes more is definitely a great goal.
>>
>> > Is this the concern that you had here or were you thinking about something different?
>>
>> Not really, it was more along the lines of lowering to LLVM IR (below)
>> so that the backend can also take advantage of this.
>>
>> >> Do we need to
>> >> consider what information should be lowered to LLVM IR when this new
>> >> attribute appears on a type?
>> >>
>> >> Thanks!
>> >
>> > My intent was that this should have no effect on code generation.
>> >
>> > Lowering annotations on declarations to LLVM IR has some obvious use cases, but for annotations on types it's less obvious how it would be useful to lower these to IR (and also less obvious to me how one would represent these). (Did you have anything specific in mind here?)
>>
>> LLVM currently has the ability to describe some types (e.g.,
>> `%struct.X = type { i32 }`), so I was envisioning attaching this
>> attribute to the type (either as an LLVM attribute or metadata, I have
>> no idea what the distinction really is between them). LLVM currently
>> supports plugins as does Clang, and Clang's support for attributes in
>> plugins still strongly encourages users to use existing attributes to
>> surface functionality, so I can imagine users wanting to use the
>> annotate type attribute in the same way they already use the annotate
>> declaration attribute for this sort of collusion. However, I don't
>> think the codegen needs to happen up front -- mostly trying to
>> understand the shape of the design still.
>
>
> Makes sense.
>
> Sorry that this has become such a long email, but I hope I've been able to give you a clearer picture of what I'm trying to achieve.

No worries, I appreciate the detailed discussion of the design!

I think we should double-check that I'm correct about the bug with
Clang accepting:

static int [] i;

but rejecting:

int [] i;

I don't think that's the distinction:

Compiler Explorer

(Noting that it accepts both of these.)

Rather, this seems to be a case of [] being treated specially (probably earlier in the parser) and therefore being accepted in this position, while named, known attributes are not accepted here.

<nods> that sounds plausible too.

If we're agreed that this is a bug, then it seems like once we fix
that bug, you could support [[clang::annotate]] in type positions
(with no type semantics) as well as declarations, but only if you were
willing to define __attribute__((annotate)) as applying to
declarations when there's an ambiguity between type vs decl. Would
that be a palatable approach for you?

Yes, I think that would definitely work for me, and there would be an attraction in not having to introduce an additional `annotate_type` attribute that otherwise works mostly the same as `annotate`.

I think making this work may not be as simple as lifting the restriction in Parser::ParseDeclarationSpecifiers() that we discussed above. I think there's another issue that I touched on previously ([cfe-dev] [RFC] New attribute `annotate_type`), namely that Clang currently lumps attributes for the declaration and the decl-specifier-seq together.

This happens in Parser::ParseSimpleDeclaration(), specifically this line
<https://github.com/llvm/llvm-project/blob/main/clang/lib/Parse/ParseDecl.cpp#L1818&gt;
:

DS.takeAttributesFrom(Attrs);

To quote the conclusion I reached at the time:

For C++ attributes, I believe this would not be correct, but that's not
(currently) a concern because AFAICT Clang currently doesn't allow C++
attributes to occur after a decl-specifier-seq at all (because, I presume,
Clang doesn't yet support any C++ attributes that can occur in this
position). This also means that if we want to use the proposed
`annotate_type` attribute in this position, we have to use the GNU
spelling. This is an unfortunate niggle, but teaching Clang to correctly
process C++ attributes in this position seems like a non-trivial
undertaking because it would require us to treat GNU spellings and C++
spellings differently in Parser::ParseSimpleDeclaration(), which seems like
it would be a bit of a headache.

And I believe this is still true?

Do you have any other ideas on what we could do about this?

I believe all of that is still true, and my only ideas boil down to
digging in and trying to solve the headaches, unfortunately. Hower, I
think that work will pay dividends in the long run beyond just
enabling [[clang::annotate]] on types; I think we'll see additional
type attributes in the future that will benefit from these fixes as
well.

~Aaron

Thanks for the discussion – I think we have a plan! I’ll get started looking at what we would need to do to support C++ attributes on a defining-type-specifier.

There’s one remaining issue that worries me a little, namely that

int __attribute((annotate(“foo”))) i;

and

int [[clang::annotate(“foo”)]] i;

will have different semantics. The GNU spelling is a declaration attribute (because it’s already possible to put it in that position today, and it gets shifted to the declaration), while the C++ spelling is a type attribute.

Is this a problem? Do we already have a precedent for this kind of thing (i.e. the semantics of an attribute changing depending on its spelling)?

Thanks for the discussion -- I think we have a plan! I'll get started looking at what we would need to do to support C++ attributes on a defining-type-specifier.

SGTM, thank you!

There's one remaining issue that worries me a little, namely that

int __attribute((annotate("foo"))) i;

and

int [[clang::annotate("foo")]] i;

will have different semantics. The GNU spelling is a declaration attribute (because it's already possible to put it in that position today, and it gets shifted to the declaration), while the C++ spelling is a type attribute.

Is this a problem? Do we already have a precedent for this kind of thing (i.e. the semantics of an attribute changing depending on its spelling)?

This is tough to answer. I think it's an ergonomic issue with
attributes (esp when the attribute is hidden behind a macro), but I
think it's one we can't do anything about -- GNU style attributes are
a bit dangerous, so you need to be careful when you use them to ensure
you know what they're applying to. So yes, I expect someone to get
tripped up by this at some point. However, short of using separate
attributes with distinct names but doing the same thing, I don't know
how to solve that. So I don't think it's a problem, but more of an
unfortunate circumstance. I suppose one thing we could consider doing
is issuing a diagnostic if parsing a GNU-style decl-or-type attribute
in a syntactic location where the attribute could reasonably apply to
either construct. We currently only have the one attribute this would
impact, but the issue seems like a general one to any known
decl-or-type attribute depending on its subjects. But I'm also not
certain it's worth the effort, so don't feel obligated to add a
diagnostic.

~Aaron

Thanks for the discussion – I think we have a plan! I’ll get started looking at what we would need to do to support C++ attributes on a defining-type-specifier.

SGTM, thank you!

There’s one remaining issue that worries me a little, namely that

int __attribute((annotate(“foo”))) i;

and

int [[clang::annotate(“foo”)]] i;

will have different semantics. The GNU spelling is a declaration attribute (because it’s already possible to put it in that position today, and it gets shifted to the declaration), while the C++ spelling is a type attribute.

Is this a problem? Do we already have a precedent for this kind of thing (i.e. the semantics of an attribute changing depending on its spelling)?

This is tough to answer. I think it’s an ergonomic issue with
attributes (esp when the attribute is hidden behind a macro), but I
think it’s one we can’t do anything about – GNU style attributes are
a bit dangerous, so you need to be careful when you use them to ensure
you know what they’re applying to. So yes, I expect someone to get
tripped up by this at some point. However, short of using separate
attributes with distinct names but doing the same thing, I don’t know
how to solve that.

Yes, this looks like the only alternative. Just wanted to explicitly revisit the question whether we think this is a strictly worse alternative?

So I don’t think it’s a problem, but more of an
unfortunate circumstance. I suppose one thing we could consider doing
is issuing a diagnostic if parsing a GNU-style decl-or-type attribute
in a syntactic location where the attribute could reasonably apply to
either construct. We currently only have the one attribute this would
impact, but the issue seems like a general one to any known
decl-or-type attribute depending on its subjects. But I’m also not
certain it’s worth the effort, so don’t feel obligated to add a
diagnostic.

A diagnostic is certainly an option. I’ll mull this over some more and will check how easy or hard it is to add.

Thanks,

Martin