As some of you may have heard, Swift has gone open-source over at swift.org. Swift makes heavy use of Clang for its (Objective-)C interoperability, including loading Clang modules to map (Objective-)C APIs into Swift via Swift’s “Clang importer” and using Clang’s CodeGen to handle C ABI issues (record layout, calling conventions) and use C inline functions directly from Swift [*].
That said, Swift’s clone of the Clang repository does have some content that isn’t in the llvm.org Clang repository. Here’s a quick summary of what that content is:
* There are several new attributes. We plan to propose these for inclusion into mainline Clang. They’re fairly small additions, some of which have wider applicability than Swift support:
  * ‘noescape’ attribute: indicates that the address provided by a particular function parameter of pointer/reference type won’t escape the function. At present, this is only used to map to Swift’s ‘noescape’ attribute, although we think it makes sense to use this for the LLVM IR “nocapture” parameter attribute as well.
  * ‘objc_subclassing_restricted’ attribute: indicates that a particular Objective-C class cannot be subclassed. Swift uses it in its generated Objective-C headers, but we are interested in making this a first-class Objective-C feature.
  * Swift-specific attributes (‘swift_error’, ‘swift_name’, ‘swift_private’): these attributes affect the mapping of (Objective-)C declarations into Swift.
  * ‘swift’ unavailability: the existing ‘availability’ attribute is extended with a ‘swift’ platform, so that one can mark something as unavailable in Swift.
* API Notes: This represents the bulk of the changes in the repository. API notes solve a not-uncommon problem: we invent some new Clang attribute that would be beneficial to add to some declarations in system headers (e.g., adding a ‘noreturn’ attribute to the C ‘exit’ function), but we can’t go around and fix all of the system headers everywhere. With API notes, we can write a separate YAML file that states that we want to add ‘noreturn’ to the ‘exit’ function: when we feed that YAML file into Clang as part of normal compilation (via a command-line option), Clang will add ‘noreturn’ to the ‘exit’ function when it parses the declaration of ‘exit’. Personally, I don’t like API notes (even with our optimizations, it’s inefficient in compile time and it takes the “truth” out of the headers), but I can see the wider use cases. If the Clang community wants this feature, I can prepare a proper proposal; if not, we’ll keep this code in the Swift clone of Clang and delete it if Swift ever stops needing it.
* SourceMgrAdapter: An adapter that translates diagnostics from an llvm::SourceMgr to clang::SourceManager. This is used by the API notes YAML compiler to translate its diagnostics into something that goes out through Clang’s SourceManager, but might be useful for other clients that are making use of llvm::SourceMgr for simple handling of source files. Unless API notes gets pulled into llvm.org Clang or someone else asks for it, I don’t feel like this is important to pull into llvm.org Clang by itself.
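For readers who have not used these attributes, here is a minimal C sketch of what direct header annotations look like. It assumes a compiler that understands GNU-style `__attribute__` syntax; the macro and function names (`EXAMPLE_NOESCAPE`, `EXAMPLE_NORETURN`, `sum_ints`, `fail_fast`) are made up for illustration, and on compilers that lack an attribute the macros expand to nothing. API notes would attach the same kind of attribute from a side file, without editing the header.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Compilers without __has_attribute get a harmless fallback. */
#ifndef __has_attribute
#define __has_attribute(x) 0
#endif

/* Hypothetical convenience macros; they are not part of any SDK. */
#if __has_attribute(noescape)
#define EXAMPLE_NOESCAPE __attribute__((noescape))
#else
#define EXAMPLE_NOESCAPE
#endif

#if __has_attribute(noreturn)
#define EXAMPLE_NORETURN __attribute__((noreturn))
#else
#define EXAMPLE_NORETURN
#endif

/* The pointer is only read during the call and never stored,
   so its address does not escape the function. */
int sum_ints(EXAMPLE_NOESCAPE const int *values, size_t count) {
    int total = 0;
    for (size_t i = 0; i < count; ++i)
        total += values[i];
    return total;
}

/* A function that never returns to its caller, annotated in source. */
EXAMPLE_NORETURN void fail_fast(int status) {
    exit(status);
}
```

The annotations change what the compiler knows about the declarations, not what the functions compute, which is why they can be supplied separately from the header.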
Any questions? Feel free to contact me!
Cheers,
Doug
[*] The actual ideas were discussed at the 2014 Developer Meeting in the “Skip the FFI” talk by Jordan Rose and John McCall (http://llvm.org/devmtg/2014-10/#talk18)
As an out-of-tree language front-end dependent on Clang, we have a clone
of the llvm.org Clang repository over on GitHub at
github.com/apple/swift-clang. We merge regularly and try to minimize our
differences with llvm.org's Clang—for more information on how we’re
handling this, see swift.org/contributing/#llvm-and-swift.
Internally I recently saw a situation that would have benefitted from this
sort of thing. Essentially Sean Eveson (CC'd) and his coworkers were
prototyping some static analyzer checks that required certain functions in
the SDK to be marked up with some info (really not much more than the moral
equivalent of a "printf" attribute). Obviously there's a catch-22 between
proving the checks are valuable and getting the corresponding APIs
officially marked up with such attributes (and getting such updated headers
out to all clients, etc.).
This sort of feature would help break that catch-22 and avoid the need for
ad-hoc hardcoded tables. Having all that PS4-specific data hardcoded was
actually the primary barrier to upstreaming the check (or at least getting
a proper upstream review of the idea), so having a way to decouple those
private annotations would have really been nice!
API Notes is great for adding annotations to declarations when/before changes to header files can be made. I can see how this could be used by future Clang static analyzer checks.
If people need a feature like this, I’m wondering if it would make sense to parse a special file containing normalish C/C++/ObjC declarations, and use those to replace or augment attributes from matching declarations in the normal parse. Requiring a special YAML and binary format reader/writer seems unfortunate, especially as it looks nowhere close to being generically useful, beyond exactly what Swift needed.
I must also admit to being a bit surprised that Apple needed to ship a feature like this at all, since I’d have thought you/they would have had full control over their own platform’s header files, and could’ve just committed the changes to that directly.
The use cases described by Sean and me have nothing to do with Swift. The static analyzer (as well as other bug finding tools) often need to have more information than what’s already in the code. The clang static analyzer already has a growing body of hardcoded APIs.
Sorry, I should have written more clearly. I meant that, looking at the current implementation of API Notes, the YAML/binary formats look very much specific to supporting only those attributes needed by Swift, and do not look generically usable. I can certainly see the concept in general being generically useful.
If people need a feature like this, I'm wondering if it would make sense
to parse a special file containing normalish C/C++/ObjC declarations, and
use those to replace or augment attributes from matching declarations in
the normal parse.
That's actually a really good idea.
Yesterday we were talking at the social briefly about this and actually it
became pretty clear that a YAML-type format wouldn't scale. A compelling
example from the internal patch I was reviewing is something like "if this
argument has this bit set, then treat this other argument as having this
meaning to the static analyzer"*; the bit itself is of course specified in
the source by `#define FOO_MAGIC_OPTION (1 << 7)` or `enum { ...,
FOO_MAGIC_OPTION = (1 << 7), ...}` or whatever, and to maintain sanity the
YAML file would of course want to use that symbolic constant. Realistically
C/C++ source code is the only fully general way to get the right name
lookup, macro expansion, constexpr evaluation, ...
So instead of a separate file, it could be something like, your side-table
of attributes is really just a C/C++ source file that contains "replacement
declarations" (marked with a pragma or attribute or something), that the
analyzer would parse first then substitute as appropriate (or something).
That would allow you to use the natural API `#include`'s to pick up the
symbolic constants.
* (For reference, what would ultimately be desired is something like:

    enum {
      ...
      FOO_MAGIC_BIT = (1 << 7),
      ...
    };
    // If (flags & FOO_MAGIC_BIT), then the analyzer should treat `p` as
    // having some special meaning.
    void foo(Bar *p, unsigned flags)
        __attribute__((arg_is_magical_when_bit_is_set(0 /* arg `p` */,
                                                      1 /* arg `flags` */,
                                                      FOO_MAGIC_BIT)));
)
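To make the name-lookup point concrete, here is a rough C sketch of a side table for such a conditional annotation. Everything in it is hypothetical (the `conditional_annotation` struct, `find_annotation`, `pointer_is_magical`, and the stand-in `FOO_MAGIC_BIT` definition, which would normally come from the SDK header via `#include`). Because the table is ordinary C source, the entry can refer to the symbolic constant instead of a raw number.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for a constant that would normally come from an SDK header. */
enum { FOO_MAGIC_BIT = (1 << 7) };

/* Hypothetical side-table entry: argument `pointer_arg` of `function`
   is special whenever (argument `flags_arg` & mask) is non-zero. */
struct conditional_annotation {
    const char *function;
    unsigned pointer_arg;
    unsigned flags_arg;
    unsigned mask;
};

/* The table uses the symbolic constant, not a hand-copied value. */
static const struct conditional_annotation annotations[] = {
    { "foo", 0, 1, FOO_MAGIC_BIT },
};

/* Look up the annotation for a function name; NULL if there is none. */
const struct conditional_annotation *find_annotation(const char *function) {
    size_t count = sizeof annotations / sizeof annotations[0];
    for (size_t i = 0; i < count; ++i) {
        if (strcmp(annotations[i].function, function) == 0)
            return &annotations[i];
    }
    return NULL;
}

/* Decide whether the annotated pointer argument is special for this call. */
bool pointer_is_magical(const struct conditional_annotation *a,
                        unsigned flags) {
    return a != NULL && (flags & a->mask) != 0;
}
```

A YAML table would have to spell the mask as 128 or reimplement macro and enum evaluation, which is the scaling problem described above.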
There are a few things that I’ve wanted to be able to put in some external store for language bridges (ownership semantics for pointers, array lengths). It’s far easier to provide external metadata than to modify a third-party library’s sources. Having some generic mechanism for doing this would be very useful.
More generally, I wonder if anyone on the Swift team would be interested in providing a stable external API for the bits of Clang’s CodeGen that you use? This is something that would be very valuable to other languages.
+1
API Notes is great for adding annotations to declarations when/before changes to header files can be made. I can see how this could be used by the future clang static analyzer checks.
Will this also support parameter annotations? Otherwise the feature is too crippled.
On the other hand, supporting parameters will cause issues with header/notes compatibility (imagine a function's interface has changed at some point and the YAML only knows about the latest version).
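One way to picture the compatibility concern is a note that records the shape of the declaration it was written against, so a mismatch can be detected instead of silently annotating the wrong parameter. The C sketch below is purely illustrative: the types and the `check_note` function are hypothetical and do not describe how API notes actually stores or validates anything.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical view of a parsed function declaration. */
struct declaration {
    const char *name;
    unsigned param_count;
};

/* Hypothetical per-parameter note from a side file. */
struct parameter_note {
    const char *function;      /* function the note targets */
    unsigned param_index;      /* zero-based parameter position */
    unsigned expected_params;  /* parameter count when the note was written */
    const char *annotation;    /* e.g. a nullability annotation */
};

enum note_status { NOTE_OK, NOTE_WRONG_FUNCTION, NOTE_STALE };

/* Reject notes whose recorded signature no longer matches the header. */
enum note_status check_note(const struct declaration *decl,
                            const struct parameter_note *note) {
    if (strcmp(decl->name, note->function) != 0)
        return NOTE_WRONG_FUNCTION;
    if (decl->param_count != note->expected_params ||
        note->param_index >= decl->param_count)
        return NOTE_STALE;
    return NOTE_OK;
}
```

A parameter count is only a coarse check; it would not catch two parameters swapping places, which is part of why this is a real design problem rather than a detail.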
Yes, parameter annotations are supported.
Currently, the format is only targeting a specific list of annotations such as nullability. However, it can be extended or generalized to cover more:
Good point, the format is limited to a small set of annotations and will need to be extended. That said, we do have a static analyzer checker for nullability, which is the primary annotation it supports.
Adding a file containing C/C++/ObjC declarations with additional attributes would be more flexible in some respects. However, this approach has many potential downsides.
Since there was no specific proposal on how this approach should be implemented, suppose we wanted to extend the experimental ModelInjector (http://reviews.llvm.org/D13731). ModelInjector was added to the analyzer so that we could feed it summaries of common functions, which are not a part of a TU otherwise. The models are injected very late, during static analysis, so other clients would not be able to use that approach as it stands right now.
Here are some of the problems with adding augmented C/C++/ObjC declarations:
* Compiling the extra declarations might lead to unpredictable errors depending on the code the compiled project contains.
* Common APIs often have slightly different declarations on different platforms. We could address those with ifdefs; that would be messy. Naming the “thing” in the APINotes file is much simpler.
* In order to make this usable by other clients, the injection has to be moved up the compilation pipeline. Most likely, we would need to be able to tell which header the declaration comes from as well as associate it with the corresponding model file. This approach would muddle the compilation process much more and will be more difficult to maintain.
* What about the distribution model? Suppose we need to use this method to suppress false positives in user code by annotating third-party APIs. Using the annotations-in-code approach, we would be redistributing those APIs.
* The binary YAML format is more succinct, which might matter more or less depending on how many supplementary annotations one has.
As Anna has kindly pointed out, APINotes is a possible replacement for my patch proposal at http://reviews.llvm.org/D13731, i.e. to provide the static analyzer with “fake” function definitions with extra attributes that get merged into the AST at the beginning of static analysis (similar idea to that described earlier by Sean S and James YK).
As APINotes sounds like a good solution, I would need to make sure this fits our requirements:
1. We need to support new attributes that make sense for static analysis, not just the ones currently used by the static analyzer.
2. We need to allow 3rd parties to easily create their own APINotes files.
3. We need to support C++.
4. We need to support an unlimited number of functions.
5. To have the flexibility of organizing one/many libraries into one or many APINotes files.
6. To have a corresponding attribute to each APINotes feature.
7. To have the capacity to support conditional attributes (as Sean S described earlier).
Do those requirements fit with the APINotes design?
Regards,
Pierre Gousseau
SN-Systems - Sony Computer Entertainment
As Anna has kindly pointed out, APINotes is a possible replacement for my patch proposal at http://reviews.llvm.org/D13731, i.e. to provide the static analyzer with “fake” function definitions with extra attributes that get merged into the AST at the beginning of static analysis (similar idea to that described earlier by Sean S and James YK).
As APINotes sounds like a good solution, I would need to make sure this fits our requirements:
We need to support new attributes that make sense for static analysis, not just the ones currently used by the static analyzer.
We need to allow 3rd parties to easily create their own APINotes files.
It should be quite easy to have third parties implement tools to generate the YAML files.
We need to support C++.
We need to support an unlimited number of functions.
To have the flexibility of organizing one/many libraries into one or many APINotes files.
To have a corresponding attribute to each APINotes feature.
To have the capacity to support conditional attributes (as Sean S described earlier).
I am not sure what exactly “conditional attributes” refers to. However, APINotes does not support using macros inside attributes.
Do those requirements fit with the APINotes design?
All of the above (except for #7) are in agreement with the design of APINotes. However, neither APINotes nor the ModelInjector supports all of these right now.
The major pieces that are missing from APINotes are support for new attributes and C++ support. It is possible to implement generic support for any new attribute, for example, by adding an attribute field that could be checked and parsed as code by the compiler. For a simplified and much less generic solution, we could design an attribute for static analysis (as I’ve described in Phabricator) and only extend the APINotes format to support that. APINotes currently supports ObjC and C and would need to be extended to handle C++, which is not trivial. (ModelInjector only supports C and will face the same issue.)
As I’ve mentioned in the previous email, there are many reasons to prefer APINotes. Also, we’ve used them in production on a very large scale.