Thinking about clang plugins

Mentally, I've divided the kinds of things that I think plugins would be good for into the following categories:

1. Passive observation of the build process--imagine something like a doxygen-ish plugin which builds output documentation as you compile code. All that is needed here are callbacks to the various output structures--the ASTConsumer, PPCallbacks, and DiagnosticConsumer are in my plugins patch already and should be sufficient to listen to everything that is necessary.

2. Static checking. In particular, something like <> should be easy to verify via a plugin. The primary things that plugins need here are:
- ability to emit diagnostics (CompilerInstance gives you the engine, and the engine already has custom-diagnostics hooks).
- ability to register attributes (C++11 style definitely and possibly gnu/declspec as well, although the latter is mostly in view of possibly moving all attribute handling to a unified system)
   These probably ought to show up in __has_attribute
- ability to register pragmas
- ability to register default preprocessor macros, or maybe just __has_attribute/__has_feature support would be sufficient
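
To make the attribute-registration idea a bit more concrete, here is a minimal sketch of what a registry might look like. It is plain C++ with no clang dependency; `ExampleAttributeRegistry`, its methods, and the `example::checked` attribute name are all hypothetical and do not exist in clang. The point is only that one table could serve both the handler lookup and a `__has_attribute`-style query.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <utility>

// Hypothetical registry; none of these names exist in clang.
class ExampleAttributeRegistry {
public:
  // A handler receives the name of the declaration carrying the attribute
  // and returns true if the attribute is acceptable there.
  using Handler = std::function<bool(const std::string &declName)>;

  // Returns false if an attribute with this name was already registered.
  bool registerAttribute(std::string name, Handler handler) {
    assert(handler != nullptr && "attribute needs a handler");
    return handlers_.emplace(std::move(name), std::move(handler)).second;
  }

  // What a __has_attribute-style query could consult: 1 if known, else 0.
  int hasAttribute(const std::string &name) const {
    return handlers_.count(name) ? 1 : 0;
  }

  // Runs the handler for a known attribute; unknown attributes are rejected.
  bool check(const std::string &name, const std::string &declName) const {
    auto it = handlers_.find(name);
    return it != handlers_.end() && it->second(declName);
  }

private:
  std::map<std::string, Handler> handlers_;
};
```

A plugin would call `registerAttribute` once per attribute at load time, and the host would consult `hasAttribute` when evaluating the preprocessor query.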

3. Run custom LLVM optimizations/transformations/etc. Strictly speaking, these would be LLVM plugins, but making it easy to run custom optimizations via clang would help the people developing them a lot, particularly when debugging. The hooks such plugins really ought to have:
- ability to manipulate the pass list (not just the optimization pass list but also codegen as well)
- ability to add in extra libraries to link against
- ability to add extra passes to LTO
- ability to do some job control (e.g., run library post-processing pass)
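
As a rough illustration of "manipulating the pass list", the sketch below models the list as an ordered vector of pass names and lets a plugin insert its pass relative to an existing one. This is an assumption-laden toy: `PassList` and `insert_pass_after` are hypothetical, the strings are placeholder labels, and LLVM's real pass managers are not used here.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical model: the pass pipeline as an ordered list of names.
using PassList = std::vector<std::string>;

// Inserts newPass directly after the first occurrence of anchor.
// Returns false (and leaves the list untouched) if anchor is not present.
bool insert_pass_after(PassList &passes, const std::string &anchor,
                       const std::string &newPass) {
  assert(!newPass.empty() && "pass needs a name");
  auto it = std::find(passes.begin(), passes.end(), anchor);
  if (it == passes.end())
    return false;
  passes.insert(it + 1, newPass);
  return true;
}
```

Reporting failure instead of silently appending matters here, since a plugin that expects to run after a particular pass probably wants to know when that pass is missing from the pipeline.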

4. AST-level instrumentation and modification. It's pretty self-explanatory what needs to be done here, but that doesn't make it easy. :)

Prototypes I've developed:
I've developed a prototype for registering custom LTO passes without needing to rebuild libLTO. It can't really be used in the same breath as clang plugins if it uses any clang hooks, because of unresolved clang symbols. It also uses a hack to get around gold loading plugins with RTLD_LOCAL instead of RTLD_GLOBAL, which otherwise prevents the LLVM pass manager builder global from working properly.

I also have the start of custom attribute support working, but getting the handlers from the plugin to Sema is impossible under the current plugin architecture, so I haven't been able to test it yet.

Naturally, I have the basics for decent clang plugin APIs working as well (see recent posts on cfe-commits).

API notes:
If all of these get implemented under the banner of "clang plugins", a plugin would need to be loaded in three different places:
* Clang driver (job control, extra link libraries)
* Linker (LTO)
* Clang -cc1 (everything else)

Using one library for all of these places is probably the most natural in terms of specifying things on the command line, but it could result in at least three different versions of all global objects. I don't purport to have a solution to this problem.

For now, I've been using a callbacks parameter in clang_plugin_begin_file to pass data back. That works well for the things mentioned in the first category, but not so well for things like attributes or pragmas, of which a plugin might register one or a hundred. An ArrayRef and more functions would probably work better for these than one function which registers everything.
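
The two styles can be sketched side by side. Everything below is hypothetical: the `Fake*` types stand in for clang's real classes, `example_plugin_begin_file` only imitates the shape of a begin-file hook (the actual signature of clang_plugin_begin_file is not shown in this post), and `AttributeList` is a hand-rolled pointer-plus-length view in the spirit of ArrayRef.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-ins; clang's real ASTConsumer/PPCallbacks are not used.
struct FakeASTConsumer { int id; };
struct FakePPCallbacks { int id; };

// Style 1: one out-parameter with a fixed slot per callback kind.
// Fine when there is at most one object of each kind.
struct PluginFileCallbacks {
  FakeASTConsumer *astConsumer = nullptr;
  FakePPCallbacks *ppCallbacks = nullptr;
};

// Style 2: a variable number of items returned as pointer + length.
struct AttributeSpec {
  const char *name;
  int numArgs;
};
struct AttributeList {
  const AttributeSpec *data;
  std::size_t size;
};

// --- plugin side ---
static FakeASTConsumer gConsumer{1};

void example_plugin_begin_file(const char *fileName,
                               PluginFileCallbacks *callbacks) {
  assert(fileName != nullptr && callbacks != nullptr);
  callbacks->astConsumer = &gConsumer;  // leave unused slots null
}

AttributeList example_plugin_get_attributes() {
  // Static storage so the view stays valid after the call returns.
  static const AttributeSpec attrs[] = {{"example::checked", 0},
                                        {"example::owner", 1}};
  return {attrs, sizeof(attrs) / sizeof(attrs[0])};
}

// --- host side ---
std::size_t count_attributes_with_args(AttributeList list) {
  std::size_t n = 0;
  for (std::size_t i = 0; i < list.size; ++i)
    if (list.data[i].numArgs > 0)
      ++n;
  return n;
}
```

The fixed-slot struct stays simple for per-file callbacks, while the separate getter scales to any number of attributes without the struct having to grow.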

So what this really boils down to is the following questions:
1. What API should plugins export? Should we prefer individual get_callback_objects or have a few methods that take a parameter on which the plugin sets the callback objects?
2. How to solve the problem of compiling via multiple processes?
3. Naming. I suck at this, by the way. :)
4. How much should we prefer POD-based APIs versus aggressively reusing the STL?
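
On the last question, one possible middle ground (a sketch, not a proposal that has been tested against clang) is to keep the types that cross the plugin boundary as PODs and convert to STL types only on the host side. All names below, including `example_plugin_info` and the version constant, are made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical POD types that could cross the plugin boundary.
struct example_pragma_info {
  const char *name_space;  // the "foo" in "#pragma foo bar"
  const char *name;        // the "bar" in "#pragma foo bar"
};
struct example_plugin_info {
  unsigned api_version;
  const example_pragma_info *pragmas;
  std::size_t num_pragmas;
};

const unsigned kExampleApiVersion = 1;

// Host-side convenience layer: STL types appear only after the boundary.
// Returns an empty list if the plugin was built against another API version.
std::vector<std::string> pragma_names(const example_plugin_info &info) {
  std::vector<std::string> names;
  if (info.api_version != kExampleApiVersion)
    return names;
  assert(info.pragmas != nullptr || info.num_pragmas == 0);
  for (std::size_t i = 0; i < info.num_pragmas; ++i)
    names.push_back(std::string(info.pragmas[i].name_space) + " " +
                    info.pragmas[i].name);
  return names;
}
```

The trade-off is more boilerplate on the host side in exchange for a boundary that does not depend on both sides agreeing on a standard library implementation.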