"__device_builtin__" attribute ignored by clang AST matcher

Hi,

I’d like to use the clang AST matcher to match the “device_builtin” attribute (ie, “attribute((device_builtin))”) in pre-processed cuda code, but it seems that the matcher just ignores the attribute. (In clang/Sema/AttrParsedAttrKinds.inc, “AttributeCommonInfo::IgnoredAttribute” is returned upon device_builtin.)

Can anyone please provide any information on how to customize the matcher to match the ignored attribute?

Thanks,
Oliver

Hi,

I'd like to use the clang AST matcher to match the "__device_builtin__" attribute (ie, "__attribute__((device_builtin))") in pre-processed cuda code, but it seems that the matcher just ignores the attribute. (In clang/Sema/AttrParsedAttrKinds.inc, "AttributeCommonInfo::IgnoredAttribute" is returned upon __device_builtin__.)

Can anyone please provide any information on how to customize the matcher to match the ignored attribute?

I suspect you cannot match against the ignored attribute because
ignored attributes are not typically retained by the AST:

Or are you finding that the AST has the attribute but it's not
matching (perhaps because of handling __device_builtin__ vs
device_builtin differently in the AST matchers)?

~Aaron

Hi,

I'd like to use the clang AST matcher to match the "__device_builtin__" attribute (ie, "__attribute__((device_builtin))") in pre-processed cuda code, but it seems that the matcher just ignores the attribute. (In clang/Sema/AttrParsedAttrKinds.inc, "AttributeCommonInfo::IgnoredAttribute" is returned upon __device_builtin__.)

Can anyone please provide any information on how to customize the matcher to match the ignored attribute?

I suspect you cannot match against the ignored attribute because
ignored attributes are not typically retained by the AST:
Compiler Explorer

FWIW, I think it should be retained. If we are already retaining loads of type sugar nodes, which like ignored attributes have no effect on semantics, why not just make it a policy to keep a representation of all written syntax in the AST?

In fact, I would even go so far as to say macro expansions should be represented in the AST, or at least any with balanced delimiters — we could have Decl & Stmt & Expr "sugar" nodes akin to what a TypedefType is for Types. But I grant that would be a bigger step.

I believe device_builtin isn’t even retained by the AST: https://godbolt.org/z/r7Kahd. I also verified its absence using clang::Decl::attrs().

It’s interesting to see that the AST (at least in clang-11) retains both “device_builtin_surface_type” and “device_builtin_texture_type” but just not “device_builtin” in clang/Sema/AttrParsedAttrKinds.inc.

-Oliver

Yeah it confuses me a lot to see that almost all cuda attributes, such as “host”, “global”, “constant”, “device_builtin_surface_type” and “device_builtin_texture_type”, are retained by the AST, but just not “device_builtin”.

>
>>
>> Hi,
>>
>> I'd like to use the clang AST matcher to match the "__device_builtin__" attribute (ie, "__attribute__((device_builtin))") in pre-processed cuda code, but it seems that the matcher just ignores the attribute. (In clang/Sema/AttrParsedAttrKinds.inc, "AttributeCommonInfo::IgnoredAttribute" is returned upon __device_builtin__.)
>>
>> Can anyone please provide any information on how to customize the matcher to match the ignored attribute?
>
> I suspect you cannot match against the ignored attribute because
> ignored attributes are not typically retained by the AST:
> Compiler Explorer

FWIW, I think it should be retained. If we are already retaining loads of type sugar nodes, which like ignored attributes have no effect on semantics, why not just make it a policy to keep a representation of all written syntax in the AST?

I agree, up to a point. C++ and C-style attributes using [], where
appertainment is clear based on the syntactic location of the
attribute would be plausible to retain in the AST because we know what
AST node to attach the attribute to. However, for GNU-style
__attribute__ syntax, the picture is less clear because the syntactic
placement of the attribute is insufficient to determine what AST node
to attach the attribute to (some GNU attributes silently "slide" to a
"better" position based on the specific attribute).

In fact, I would even go so far as to say macro expansions should be represented in the AST, or at least any with balanced delimiters — we could have Decl & Stmt & Expr "sugar" nodes akin to what a TypedefType is for Types. But I grant that would be a bigger step.

The expanded form of the macro is what forms the AST nodes and we
retain information about whether a given source location is a result
of a macro expansion or not, so I think we already have this
information retained but without having to increase the size of the
AST to do so. But maybe I'm envisioning something differently than you
are here.

~Aaron

Yeah it confuses me a lot to see that almost all cuda attributes, such as "__host__", "__global__", "__constant__", "__device_builtin_surface_type__" and "__device_builtin_texture_type__", are retained by the AST, but just not "__device_builtin__".

That attribute came into existence as an ignored attribute in Clang:
⚙ D11690 [CUDA] Added stubs for new attributes used by CUDA headers. and it seems to have never been
updated with an implementation.

~Aaron

Hi,

I’d like to use the clang AST matcher to match the “device_builtin” attribute (ie, “attribute((device_builtin))”) in pre-processed cuda code, but it seems that the matcher just ignores the attribute. (In clang/Sema/AttrParsedAttrKinds.inc, “AttributeCommonInfo::IgnoredAttribute” is returned upon device_builtin.)

Can anyone please provide any information on how to customize the matcher to match the ignored attribute?

I suspect you cannot match against the ignored attribute because
ignored attributes are not typically retained by the AST:
https://godbolt.org/z/4v574o

Oliver, perhaps something like the following will work for the time being:

#define ATTRIBUTE_PLUS_ANNOTATION(x) attribute((x)) attribute((annotate(#x)))

Usage: ATTRIBUTE_PLUS_ANNOTATION(device_builtin)

Then match based on the annotation attribute string, since that is never ignored. That should be a general purpose solution to handle any potentially-ignored attributes, for now.

FWIW, I think it should be retained. If we are already retaining loads of type sugar nodes, which like ignored attributes have no effect on semantics, why not just make it a policy to keep a representation of all written syntax in the AST?

I agree, up to a point. C++ and C-style attributes using [], where
appertainment is clear based on the syntactic location of the
attribute would be plausible to retain in the AST because we know what
AST node to attach the attribute to. However, for GNU-style
attribute syntax, the picture is less clear because the syntactic
placement of the attribute is insufficient to determine what AST node
to attach the attribute to (some GNU attributes silently “slide” to a
“better” position based on the specific attribute).

That is tricky to handle, but in any such case I think we should be explicitly representing whatever we are doing in the AST: in that case, retain a “ghost” attribute attached to whatever declaration the user seemed to intend to attach it to syntactically, which then perhaps contains a pointer to the declaration to which the attribute was moved.

More generally: the clang AST should not only be measured by how many of its tests pass, but also a) how easily users can access the information it stores and b) how often users want to access valid information that is not stored at all. The latter is the more urgent issue: when written information isn’t stored at all, anywhere, it puts users in a real bind; their only option is to modify their source code, i.e. reduce its clarity, just to be able to reason about it in AST tools, a la my suggestion above. Too much of that and you start to lose the war on complexity. Better to stray on the side of retaining information during parsing and transformation.

In fact, I would even go so far as to say macro expansions should be represented in the AST, or at least any with balanced delimiters — we could have Decl & Stmt & Expr “sugar” nodes akin to what a TypedefType is for Types. But I grant that would be a bigger step.

The expanded form of the macro is what forms the AST nodes and we
retain information about whether a given source location is a result
of a macro expansion or not, so I think we already have this
information retained but without having to increase the size of the
AST to do so. But maybe I’m envisioning something differently than you
are here.

I don’t want to get too far off topic but I am envisioning something different, e.g. a MacroExpansionDecl which is semantically worthless but contains information about a) the called macro name and arguments and b) pointers to sibling Decls it expanded into. Then have the same for Stmts, and Exprs (a bit different). I know you can get information about the macro expansion from the SourceLocation but it seems tricky, and it takes some work to figure out all the e.g. declarations a macro expanded into, so it would be nice to have a more straightforward representation in the AST, at least for the common case of macro expansions containing balanced delimiters.

But I don’t want to get away from Oliver’s attribute needs, so we can leave it as something to reflect on for the future.

Thanks David! Nice suggestion on using the annotation attribute! I’ll go with it. :slight_smile:

Best,
Oliver