[XRay] RFC: Adding -fxray-{always, never}-instrument=... to Clang

TL;DR: Adding the [[clang::xray_{always,never}_instrument]] attribute in the declaration/definition of a function is cumbersome for guaranteed instrumentation (or non-instrumentation) of specific functions. We'd like to make the imbuing of this particular attribute a command-line controllable option.
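For readers unfamiliar with the attribute, here is a minimal sketch of the source-level annotation that the TL;DR calls cumbersome. The function names are hypothetical, and the command line in the comment is an assumption based on the flag spelling in the subject line, not a confirmed interface.

```cpp
// Hypothetical functions annotated by hand. Compilers other than Clang are
// expected to ignore the unknown clang:: attributes, possibly with a warning.

// Always emit XRay instrumentation for this function, whatever its size.
[[clang::xray_always_instrument]] int hot_path(int x) { return x * 2; }

// Never emit XRay instrumentation for this function.
[[clang::xray_never_instrument]] int tiny_helper(int x) { return x + 1; }

// With the proposed options the same intent would live outside the source.
// Assumption: each flag takes a file listing function names, e.g.
//   clang++ -fxray-instrument -fxray-always-instrument=always.txt \
//           -fxray-never-instrument=never.txt app.cc
```

The attributes do not change what the functions compute; they only influence whether instrumentation points are emitted.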

Background

Just as an update, I have a patch under review for this functionality: https://reviews.llvm.org/D30388

Thoughts would be most appreciated.

Cheers

This sounds totally reasonable.

I have vague memories that there was some demand for a generic way to slap attributes on decls without changing the source code, but I don’t remember who wanted that. I thought it had something to do with static analysis, so maybe Jordan or Anna know?

It feels premature to me to try to build out that functionality when we already have special case lists implemented for such a similar use case (sanitizer blacklists), so I’m in favor of moving forward with those.
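To make the special case list idea concrete, here is a rough sketch of how a list of function patterns might be parsed and matched. It assumes a simplified `fun:<pattern>` line format with `#` comments and `*` wildcards, loosely modelled on sanitizer blacklists; the helper names are hypothetical and this is not Clang's implementation.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: returns true if `text` matches `pattern`, where '*'
// in the pattern matches any (possibly empty) run of characters.
bool globMatch(const std::string &pattern, const std::string &text) {
  std::size_t p = 0, t = 0, star = std::string::npos, mark = 0;
  while (t < text.size()) {
    if (p < pattern.size() && pattern[p] == '*') {
      star = p++;  // remember the wildcard position
      mark = t;    // and where in the text it started matching
    } else if (p < pattern.size() && pattern[p] == text[t]) {
      ++p;
      ++t;
    } else if (star != std::string::npos) {
      p = star + 1;  // backtrack: let the last '*' absorb one more character
      t = ++mark;
    } else {
      return false;
    }
  }
  while (p < pattern.size() && pattern[p] == '*') ++p;
  return p == pattern.size();
}

// Collects the patterns from "fun:" lines; blank lines, '#' comments and
// other entry kinds (e.g. "src:") are skipped in this sketch.
std::vector<std::string> parseFunctionPatterns(const std::string &contents) {
  std::vector<std::string> patterns;
  std::istringstream in(contents);
  std::string line;
  const std::string prefix = "fun:";
  while (std::getline(in, line)) {
    if (line.empty() || line[0] == '#') continue;
    if (line.compare(0, prefix.size(), prefix) == 0)
      patterns.push_back(line.substr(prefix.size()));
  }
  return patterns;
}

// True if `name` matches any pattern in the list.
bool isListed(const std::vector<std::string> &patterns,
              const std::string &name) {
  for (const std::string &pattern : patterns)
    if (globMatch(pattern, name)) return true;
  return false;
}
```

A front-end following this approach would consult one such list for "always" and another for "never" when deciding whether to attach the attribute to a function.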

Correct. We have implemented such a system for clang. We call it API Notes.
It currently lives in the out-of-tree clang used by the Swift compiler
(http://github.com/apple/swift-clang). It is not complete; for example, it
lacks C++ support, but we've been using it in production for several years
now.

Doug has sent out an email about this and other out-of-tree changes, trying
to figure out whether the community is interested in these additions. Here
is the thread; it mainly discusses API Notes:

http://lists.llvm.org/pipermail/cfe-dev/2015-December/046335.html

Cheers,
Anna

Thanks Reid!

I agree, I think we can build upon this later on in case we want a more generic utility for imbuing attributes through a side channel.

-- Dean

Thanks for the pointer Anna -- I'll go have a read about this.

Maybe later on we can generalise from the Swift experience to inform the design and use cases, if this is ever done in clang.

Cheers

-- Dean

Just to be 100% clear, API Notes is a *clang feature*; it's used by the
Swift project to support interoperability between Swift and C/ObjC (for
example, to store knowledge about how C APIs should be imported into
Swift). The Swift project uses a clone of llvm/clang. The clone is
auto-synced with top-of-tree llvm/clang but contains several features that
are currently only used by the Swift project.

Here is the description of API Notes from Doug's email:

API notes solve a not-uncommon problem: we invent some new Clang
attribute that would be beneficial to add to some declarations in
system headers (e.g., adding a ‘noreturn’ attribute to the C ‘exit’
function), but we can’t go around and fix all of the system headers
everywhere. With API notes, we can write a separate YAML file that
states that we want to add ‘noreturn’ to the ‘exit’ function: when we
feed that YAML file into Clang as part of normal compilation (via a
command-line option), Clang will add ‘noreturn’ to the ‘exit’ function
when it parses the declaration of ‘exit’. Personally, I don’t like API
notes—even with our optimizations, it’s inefficient in compile time
and it takes the “truth” out of the headers—but I can see the wider
use cases. If the Clang community wants this feature, I can prepare a
proper proposal; if not, we’ll keep this code in the Swift clone of
Clang and delete it if Swift ever stops needing it.

I think API notes can be used by other clients in clang, such as the
static analyzer. It seems that it would be directly applicable to the
scenario you describe as well. If so, I would propose merging API
Notes into mainline clang.

Even though Doug mentioned (in Dec 2015) that the Swift clone might
delete this functionality in the future if API Notes are no longer
needed, in practice we see that the Swift project's use cases for API
Notes are expanding, not shrinking.

Cheers

I think this is what I meant -- if they're in the clone of clang used in the Swift project, if (or when?) it makes it back upstream we can think about maybe also using it. I think it's very similar to what the sanitisers and XRay need for imbuing attributes with the special case list, without having to use YAML or something more structured than the simple text files.

> Even though Doug mentioned (in Dec 2015) that the Swift clone might delete this functionality in the future if API Notes are not needed any more, in practice, we see that the Swift project use cases for API Notes are expanding not shrinking.

That's encouraging!

I could also imagine a pass at the LLVM level that imbues attributes based on information gained in other passes (or even externally). Some XRay users have asked whether it's possible to statically walk the call graph and instrument selectively, i.e. remove instrumentation from other parts of the binary and focus only on the code paths that we can prove statically will call a specific set of functions. This could work purely at the LLVM level, but it would also work in the front-end in cases where the choice between "always" and "never" instrument is being made at that level.
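As a rough illustration of the call-graph idea, the sketch below computes which functions can reach a given set of target functions, assuming a static call graph is already available as a caller-to-callees map. The type and function names are hypothetical and this is not an LLVM pass; it only shows the reachability step.

```cpp
#include <map>
#include <queue>
#include <set>
#include <string>
#include <vector>

// Hypothetical representation: each caller maps to the functions it calls.
using CallGraph = std::map<std::string, std::vector<std::string>>;

// Returns the targets plus every function with a static call path to at
// least one target. Functions outside this set are the candidates for
// having their instrumentation removed.
std::set<std::string> functionsReaching(const CallGraph &graph,
                                        const std::set<std::string> &targets) {
  // Invert the edges so the walk can go from callee back to callers.
  std::map<std::string, std::vector<std::string>> callers;
  for (const auto &entry : graph)
    for (const std::string &callee : entry.second)
      callers[callee].push_back(entry.first);

  std::set<std::string> keep(targets.begin(), targets.end());
  std::queue<std::string> work;
  for (const std::string &target : targets) work.push(target);

  // Breadth-first walk over the reversed graph.
  while (!work.empty()) {
    const std::string fn = work.front();
    work.pop();
    const auto it = callers.find(fn);
    if (it == callers.end()) continue;
    for (const std::string &caller : it->second)
      if (keep.insert(caller).second) work.push(caller);
  }
  return keep;
}
```

Indirect calls are not modelled here; a real analysis would have to treat them conservatively, since only statically provable paths are of interest.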

Anyway, API Notes sounds like an interesting approach for the generic case. I'm sure other projects would find interesting uses for it when it makes it into clang proper. :)

Cheers

-- Dean

This sounds very useful! For example, it could allow sanitizers or fuzzing
to focus on newly updated code or code viewed as important for other
reasons.

If/When you think API Notes will be useful for your cause, let us know. We
can upstream them at any time. One of the main reasons they are not
upstreamed already is that there are no users who are ready to immediately
utilize them. (The static analyzer could definitely benefit from them, but
no-one is actively working on using external annotations in the analyzer
right now.)

Cheers