I wanted to revive the discussion on adding the API notes feature to Clang.
Here is the description of the feature from the original discussion:
“API notes solve a not-uncommon problem: we invent some new Clang attribute that would be beneficial to add to some declarations in system headers (e.g., adding a ‘noreturn’ attribute to the C ‘exit’ function), but we can’t go around and fix all of the system headers everywhere. With API notes, we can write a separate YAML file that states that we want to add ‘noreturn’ to the ‘exit’ function: when we feed that YAML file into Clang as part of normal compilation (via a command-line option), Clang will add ‘noreturn’ to the ‘exit’ function when it parses the declaration of ‘exit’.”
The old discussion can be found here: http://lists.llvm.org/pipermail/cfe-dev/2015-December/046335.html
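To make the mechanism in the quoted description a bit more concrete, here is a toy C++ model of the idea: a side table keyed by function name, consulted at the point where a declaration is "parsed". This is only a sketch under stated assumptions. The names ToyDecl, ToyNotes and parseWithNotes are hypothetical and do not correspond to anything in Clang or in the actual API notes implementation; the real feature reads a YAML file and attaches real Clang attributes.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>

// Toy stand-in for a parsed function declaration (NOT Clang's FunctionDecl).
struct ToyDecl {
  std::string name;
  std::set<std::string> attrs;
};

// Toy stand-in for the contents of an API notes file:
// function name -> attributes to add (hypothetical representation).
using ToyNotes = std::map<std::string, std::set<std::string>>;

// Called when a declaration is "parsed": start from the declaration as
// written in the header (no attributes here), then merge in whatever the
// notes list for that name. Declarations without notes are left untouched.
ToyDecl parseWithNotes(const std::string &name, const ToyNotes &notes) {
  assert(!name.empty() && "a declaration needs a name");
  ToyDecl decl{name, {}};
  auto it = notes.find(name);
  if (it != notes.end())
    decl.attrs.insert(it->second.begin(), it->second.end());
  return decl;
}
```

With a notes table that maps "exit" to {"noreturn"}, parseWithNotes("exit", notes) yields a declaration carrying "noreturn", while any function not mentioned in the table comes back without extra attributes. The header itself is never modified.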
The summary of the discussion:
- It would be useful for the static analyzer to be able to attach additional information to functions.
- There have already been other attempts to get a similar feature working; see https://reviews.llvm.org/D13731
- Anna has a nice summary of the problems with augmented declarations: http://lists.llvm.org/pipermail/cfe-dev/2015-December/046378.html
- C++ support is requested by the community, but it is missing right now.
- Support for a wider range of annotations is also missing, and it is also requested by the community.
- Parameter annotations are supported.
- One of the concerns is performance.
The case for adding API Notes:
- Importing annotations from an external source looks to be an interesting feature for the community.
- Sean presented some API checkers that work based on special annotations: http://llvm.org/devmtg/2017-03//2017/02/20/accepted-sessions.html#29
- A similar feature could be used to provide fake implementations for functions. It could help to finish a half-done feature of the Static Analyzer that was implemented during a GSoC project: https://www.google-melange.com/archive/gsoc/2014/orgs/llvm/projects/xazax.html
- Performance might be crucial for regular compilation, but it is less of a problem for the Clang Static Analyzer, which tends to be slower than compilation and so is less likely to be bottlenecked by this phase.
- It would also be possible to import sanitizer/optimizer/codegen related annotations.
Commit the feature as is
Extend it with C++ support (namespaces, overloading, templates…)
Extend it with additional annotations, and attaching custom data (like fake function bodies)
What do you think? What are the main concerns with this feature?
I’m interested in having something like API notes available in Clang, definitely. I have a few questions though:
- Is YAML absolutely necessary?
It’s not strictly necessary, but it’s convenient. Swift relies heavily on the YAML form; it’s been deployed for quite a while, so if API notes eventually take some other form, we’ll still need a translation from YAML to that new form for backward compatibility.
What I think I don’t understand is what the actual data structure is that it will represent.
Here’s an example of API notes as used for Swift:
Is it a configuration for something that will actually imbue the attributes into the IR?
The entries in API notes are typically mapped to attributes in the Clang AST at the point where the corresponding entity is parsed. Some will affect warnings, some will affect the generated IR, etc.
Or is it a dump of an in-memory data structure, that specifies a side-table of attributes for the generated IR?
It’s not an in-memory data structure. It’s just a format that can describe declarations and apply additional information to them.
- What is the overlap with the sanitiser special case list, which provides regular expressions and a simple configuration language? Can one be ported to be built on top of the other? Is there a roadmap/plan for unifying those features?
The sanitizer/analyzer special case would involve adding more fields into the YAML, which then map to whatever internal data structure the sanitizer/analyzer needs.
- What is the programmatic API for imbuing the attributes before IR generation? Say I’m using clang as a library, is there a data structure (in-memory) that I can programmatically define so that the attributes are applied to the IR as a transformation? If so, should this be part of the LLVM back-end instead? Or should it be a stand-alone library that clang uses in the static analysis phase? I can certainly imagine this being useful in LTO or PGO, not necessarily tied to Clang.
Clang’s Sema is responsible for mapping API notes to the AST; see https://github.com/apple/swift-clang/blob/stable/lib/Sema/SemaAPINotes.cpp for the actual implementation. This is the Right Answer, because it adds attributes at the same point they would get added if the annotations were in the source code itself, so all downstream clients see the effects of the attributes.
To make it more custom, what you really need is a generalized attribute mechanism in Clang itself that makes it easy to, e.g., add your own attribute (programmatically?), map those attributes down to IR attributes, and have API notes be general enough to support those attributes as well.
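As a rough illustration of what such a generalized mechanism might look like, here is a toy C++ sketch of a registry that lets a client register its own attribute and say how it maps down to an IR-level attribute. Everything here is an assumption made for illustration: ToyAttrRegistry, registerAttr and lower are hypothetical names, the attribute strings are made up, and nothing like this interface is claimed to exist in Clang or LLVM.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical registry: source-level attribute name -> IR-level attribute
// name. A real design would deal with Clang Attr nodes and LLVM attributes;
// plain strings keep the sketch self-contained.
class ToyAttrRegistry {
public:
  // Register a custom attribute together with the IR attribute it lowers to.
  void registerAttr(const std::string &srcName, const std::string &irName) {
    assert(!srcName.empty() && !irName.empty());
    lowering_[srcName] = irName;
  }

  // Lower the attributes found on an AST declaration (whether written in
  // source or added from API notes) to IR-level names. Attributes nobody
  // registered are dropped rather than guessed at.
  std::vector<std::string>
  lower(const std::vector<std::string> &astAttrs) const {
    std::vector<std::string> out;
    for (const auto &attr : astAttrs) {
      auto it = lowering_.find(attr);
      if (it != lowering_.end())
        out.push_back(it->second);
    }
    return out;
  }

private:
  std::map<std::string, std::string> lowering_;
};
```

The point of the sketch is the ordering: attributes are attached to the AST first, and the mapping to IR is a separate, later step. That way it does not matter whether an attribute came from the source code or from API notes.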