[RFC PATCH] making the static analyzer a PluginASTAction

Not sure how to post large changes; I squashed the changes into a
single patch attached here. Most of its size is from updating
test/Analysis/. I was working on it as a series of ten patches though,
if anyone wants a git branch for easier review.

This change mostly moves the option parsing code out of libFrontend
into libStaticAnalyzer. It can't compiler the analyzer as a separate
loadable module, yet.

Ultimately, this could remove the dependencies that libFrontendTool
has on all the static analyzer libraries, but can't quite yet because
the final change I tried to make (removing the RunAnalysis enum and
the related code) mysteriously caused "unable to find plugin
'analyzer'" regressions, so I left that change out.

Also allll the tests were updated from using `clang -cc1 -analyze` to
using `clang -cc1 -plugin analyzer`, with some copious
-plugin-arg-analyzer sprinkled in between to pass the arguments to the
plugin layer.

Somehow, this change exposed a problem in lib/Driver/Tools.cpp, which
was by default adding "-static-analyzer-checker=security" to the
analyzer. I'm not sure how it ever worked before, because after this
change I got the error "no analyzer checkers are associated with
'security'", which was true; changing it to "experimental.security"
fixed it.

There is a small hack in InitPreprocessor.cpp now, though, since I'm
not sure how to add a predefined macro ("__clang_analyzer__") from a
PluginASTAction.

0009-StaticAnalyzer-factor-into-an-AST-plugin.patch (173 KB)

Not sure how to post large changes; I squashed the changes into a
single patch attached here. Most of its size is from updating
test/Analysis/. I was working on it as a series of ten patches though,
if anyone wants a git branch for easier review.

This change mostly moves the option parsing code out of libFrontend
into libStaticAnalyzer. It can't compiler the analyzer as a separate
loadable module, yet.

Ultimately, this could remove the dependencies that libFrontendTool
has on all the static analyzer libraries, but can't quite yet because
the final change I tried to make (removing the RunAnalysis enum and
the related code) mysteriously caused "unable to find plugin
'analyzer'" regressions, so I left that change out.

Also allll the tests were updated from using `clang -cc1 -analyze` to
using `clang -cc1 -plugin analyzer`, with some copious
-plugin-arg-analyzer sprinkled in between to pass the arguments to the
plugin layer.

Somehow, this change exposed a problem in lib/Driver/Tools.cpp, which
was by default adding "-static-analyzer-checker=security" to the
analyzer. I'm not sure how it ever worked before, because after this
change I got the error "no analyzer checkers are associated with
'security'", which was true; changing it to "experimental.security"
fixed it.

There is a small hack in InitPreprocessor.cpp now, though, since I'm
not sure how to add a predefined macro ("__clang_analyzer__") from a
PluginASTAction.

This seems to be stuck in the moderation queue, which is just as well,
since the patch had some problems that I fixed now.

The current changes are in this git branch:
https://github.com/nobled/clang/tree/anal-plugin

Thanks for working on this. I'm sorry it looks like I missed your original email. I'm interested in this work, although I haven't had a chance to look at it yet to give you any technical feedback.

With holiday time away coming up, I may not get a chance to seriously review this work until the beginning of January. My apologies if I can't get to it sooner than that, but this would be a disruptive enough change that I really want to be confident with the design and direction before we move ahead with changes. I also want the other active developers on the static analyzer to get a chance to thoroughly review the work, and they may have similar bandwidth issues with the holiday season. I'll try and give you feedback as soon as possible.

Thanks for your patience!

Cheers,
Ted

Ah, I see it hadn't been moderated yet for approval to the list (per your comment).

Experimental checkers should not be added by default, so if there are no checkers in the security package, it can be dropped from the set of defaults.

Thanks for working on this!
Anna.

Happy that someone's working on this! Unfortunately I don't have too many cycles right now either, but here are some small comments based on a quick read-through of your patch.

My larger issue is how much people depend on the current argument set for clang -cc1. Anyone using -Xanalyzer like they "should" will be fine, but those who directly call -cc1 (or mistakenly use -Xclang) will see everything break. (Now I see why these have been kept separate!) For maximum compatibility we'd keep all the old options as aliases, but that seems silly if this is really the direction we're going. It might be nice to still accept -analyzer, though, even if all we do is emit a warning that says '-analyze has been disabled, use -plugin analyzer instead'.

Plugin arguments are really ugly right now. Maybe someday we can have some kind of argument grouping system, i.e. "all options in this section are intended for the 'foo' plugin". But that's not your problem. :slight_smile:

I can't comment on some of the larger design choices right now, like exactly /what/ the separation of duties should be between the driver and the plugins, and where OptTable and argument parsing should live. But I'm glad you're pushing this; it's a step towards being able to distribute the analyzer and the compiler separately, which could someday turn out to be useful. (...apart from shrinking download sizes, of course.)

Thank you!
Jordy

--- a/lib/Driver/Tools.cpp
+++ b/lib/Driver/Tools.cpp
@@ -1213,6 +1213,12 @@
static bool shouldUseFramePointer(const ArgList &Args,
   return true;
}

+#define addPluginArg(CmdArgs, plugin, argument) \
+do {\
+ CmdArgs.push_back("-plugin-arg-" plugin);\
+ CmdArgs.push_back(argument);\
+} while (0)

Happy that someone's working on this! Unfortunately I don't have too many cycles right now either, but here are some small comments based on a quick read-through of your patch.

My larger issue is how much people depend on the current argument set for clang -cc1. Anyone using -Xanalyzer like they "should" will be fine, but those who directly call -cc1 (or mistakenly use -Xclang) will see everything break. (Now I see why these have been kept separate!) For maximum compatibility we'd keep all the old options as aliases, but that seems silly if this is really the direction we're going. It might be nice to still accept -analyzer, though, even if all we do is emit a warning that says '-analyze has been disabled, use -plugin analyzer instead'.

Believe it or not, -Xanalyzer predates the generic -Xclang, so it's
just a lucky accident of history I had this handy abstraction already
in place to take advantage of. (and it was originally "-WA", by
analogy to '-Wl' and friends. shudder.)

Is -cc1 backwards compatibility actually important, though? It's
basically an implementation detail of clang, I thought.

Plugin arguments are really ugly right now. Maybe someday we can have some kind of argument grouping system, i.e. "all options in this section are intended for the 'foo' plugin". But that's not your problem. :slight_smile:

I can't comment on some of the larger design choices right now, like exactly /what/ the separation of duties should be between the driver and the plugins, and where OptTable and argument parsing should live. But I'm glad you're pushing this; it's a step towards being able to distribute the analyzer and the compiler separately, which could someday turn out to be useful. (...apart from shrinking download sizes, of course.)

Yeah, and hopefully, next we can dump the ARC Migration Tool into a
plugin too(seeing as how it's waaay more domain-specific than the
analyzer), but that requires a new API entirely for generic 'rewrite
the source file' plugin actions that can be chained up before a
'primary' action like codegen. (The current -add-plugin option
'multiplexes' ASTConsumers in parallel, but ARCMT needs to be chained
up sequentially.)

Thank you!
Jordy

--- a/lib/Driver/Tools.cpp
+++ b/lib/Driver/Tools.cpp
@@ -1213,6 +1213,12 @@
static bool shouldUseFramePointer(const ArgList &Args,
return true;
}

+#define addPluginArg(CmdArgs, plugin, argument) \
+do {\
+ CmdArgs.push_back("-plugin-arg-" plugin);\
+ CmdArgs.push_back(argument);\
+} while (0)
+

Minor thing: can we give this a more macro-y name? I see why it has to be a macro (for the string concatenation), but it seems like it's asking for trouble to give it a function name.

You mean like an ALL_CAPS name like ADD_PLUGIN_ARG()?

--- /dev/null
+++ b/lib/StaticAnalyzer/Frontend/AnalyzerOptions.cpp
+namespace {
+
+class AnalyzerOptTable : public driver::OptTable {
+public:
+ AnalyzerOptTable()
+ : OptTable(AnalyzerInfoTable,
+ sizeof(AnalyzerInfoTable) / sizeof(AnalyzerInfoTable[0])) {}
+};
+
+}
+
+driver::OptTable *clang::createAnalyzerOptTable() {
+ return new AnalyzerOptTable();
+}

Uh, why not just return "new driver::OptTable(...)"? Maybe someday we'll need a special class for this, but not now. (Not sure why the other option tables in Driver do this too.)

Yeah, I was just blindly following the existing style. Not sure why either.

--- a/lib/Frontend/CompilerInvocation.cpp
+++ b/lib/Frontend/CompilerInvocation.cpp
@@ -912,7 +819,6 @@
}

void CompilerInvocation::toArgs(std::vector<std::string> &Res) {
- AnalyzerOptsToArgs(getAnalyzerOpts(), Res);
CodeGenOptsToArgs(getCodeGenOpts(), Res);
DependencyOutputOptsToArgs(getDependencyOutputOpts(), Res);
DiagnosticOptsToArgs(getDiagnosticOpts(), Res);

You've moved this functionality to the plugin, but you haven't actually called it again from here. This is saved a bit by the fact that I think no one actually /uses/ CompilerInvocation::toArgs to record analyzer invocations, but if we're being proper CompilerInvocation should (eventually?) be able to toArgs and CreateFromArgs /any/ plugin's options, right?

I was thinking about this already actually, and it should be able to
just dump all the plugin arguments from the
ActionName/PluginArgs/AddPluginActions/AddPluginArgs members, exactly
as they were passed to it. Which only has the disadvantage of not
eliminating redundant or ignored options, but that's not really a huge
deal.

--- a/lib/StaticAnalyzer/Frontend/CMakeLists.txt
+++ b/lib/StaticAnalyzer/Frontend/CMakeLists.txt
@@ -13,6 +14,7 @@
add_dependencies(clangStaticAnalyzerFrontend
clangStaticAnalyzerCheckers
clangStaticAnalyzerCore
+ ClangStaticAnalyzerOptions
ClangAttrClasses
ClangAttrList
ClangDeclNodes

What is this? I don't see a "ClangStaticAnalyzerOptions" library anywhere.

That's the target for the new TableGen-generated header...

--- a/lib/Frontend/InitPreprocessor.cpp
+++ b/lib/Frontend/InitPreprocessor.cpp
@@ -530,7 +530,9 @@
Builder.defineMacro("__weak", "__attribute__((objc_gc(weak)))");

// Define a macro that exists only when using the static analyzer.
- if (FEOpts.ProgramAction == frontend::RunAnalysis)
+ // XXX: This is a total hack, this should be added from the plugin.
+ if (FEOpts.ProgramAction == frontend::PluginAction
+ && FEOpts.ActionName == "analyzer")
Builder.defineMacro("__clang_analyzer__");

How about "CI->getPreprocessorOpts().addMacroDef(...)"? I'm not sure where to put it, exactly, but PluginASTAction has a lot of access to the CompilerInvocation. (I believe BeginInvocation would work, but there ought to be a place where you only have to do it once, instead of once per file, right?)

During ParseArgs, I guess?

--- a/lib/StaticAnalyzer/Frontend/FrontendActions.cpp
+++ b/lib/StaticAnalyzer/Frontend/FrontendActions.cpp
@@ -17,7 +29,245 @@
return CreateAnalysisConsumer(CI.getPreprocessor(),
CI.getFrontendOpts().OutputFile,
- CI.getAnalyzerOpts(),
+ this->Opts,
CI.getFrontendOpts().Plugins);
}

Very small thing: convention in Clang is not to use 'this'. (You've done this several times in this file.)

+// TODO: parsing all the current arguments is gonna require factoring
+// the Arg* and Opt* classes out of lib/Driver into lib/Basic,
+// and splitting the -analyzer-* options out of Driver/CC1Options.td
+// into a new file, StaticAnalyzer/AnalyzerOptions.td.

...which you've now done already, right?

Yeah, that was from a way early draft; already deleted in the newer
version of the patch in the branch.

It turned out to not be strictly necessary for this work, but I did
finally finish a patch to move Arg-parsing into libBasic. I'll send
that out, too.

+ driver::OptTable *Table = createAnalyzerOptTable();
+ std::vector<const char*> cargs;
+ for (unsigned i = 0, e = args.size(); i != e; ++i)
+ cargs.push_back(args[i].c_str());
+
+ const char *const *ArgBegin = &cargs[0];
+ const char *const *ArgEnd = ArgBegin + cargs.size();
+ unsigned MissingArgIndex, MissingArgCount;
+ driver::InputArgList *Args = Table->ParseArgs(ArgBegin, ArgEnd,
+ MissingArgIndex, MissingArgCount);

Yuck. Can we tag this with a FIXME? ParseArgs should really take an ArrayRef anyway, and it ought to be able to handle std::vector<std::string> if we're going to use it that way.

Yeah, I was looking at ArrayRef-izing and StringRef-izing whatever I
could in libDriver after this (it's... painful). I'll add an explicit
FIXME.

+ bool success = Args && ParseAnalyzerArgs(this->Opts, *Args, D);

Personally I'd prefer early returns: if (!Args), then if (!ParseAnalyzerArgs).

Already fixed in the new patch (already switched to using OwningPtr).

+ // Honor -analyzer-checker-help.
+ // This should happen AFTER plugins have been loaded!
+ if (this->Opts.ShowCheckerHelp)
+ ento::printCheckerHelp(llvm::outs(), CI->getFrontendOpts().Plugins);

The old behavior of -analyzer-checker-help is to terminate after printing the help. Not sure if we want to keep this or not, but at the very least we shouldn't get a "no inputs" error. (Maybe I should write a simple test case for this...).

You're right. Not sure how to fix that, though. ParseArgs can only
return 'success' or 'error', and asking for -help shouldn't make the
program return an error code should it? (Unless it already does that.)
In the future maybe we should add a hook to the plugin class for a
standard 'help' option (so it's just `clang -cc1 -plugin analyzer
-plugin-help`) instead of writing a new one for every plugin.

Also, the comment about plugins is probably unnecessary once the analyzer is a plugin itself, and should be removed.

+ delete Args;
+ delete Table;

Please use LLVM's RTTI for this: llvm::OwningPtr. That way my preference for early returns won't cause a memory leak. :slight_smile:

Already fixed. Thanks for reviewing what you could!

This is my latest squashed-together version of the changes needed,
excluding the tedious test/ updates this time because this way it fits
under the mailing list size limit.

Still todo: add an equivalent to LLVMLinkInMCJIT(), like,
clang_LinkInStaticAnalyzer(), because like the JIT and MCJIT, it gets
registered via a static constructor, and I guess linkers optimize away
statically-linked libraries with no used symbols, even if they contain
static constructors with side effects like this. At least I think
that's what was causing the regressions I saw once I removed the last
dependency edge from libFrontendTool to the static analyzer code,
since the regressions don't happen with the CMake -DUSE_SHARED_LIBS
build.

0012-StaticAnalyzer-factor-into-an-AST-plugin.patch (76.4 KB)

Added clang_LinkInStaticAnalyzerPlugin(), so everything works in all
linking configurations now. It's not the ultimate solution for when
it's a loadable module, but it's good for now.

ping?

0013-StaticAnalyzer-factor-into-an-AST-plugin.patch (76.9 KB)

Attached an updated patch--it had to be rebased against recent changes
in the Driver/Tools.cpp code.

Also available as broken-down discrete steps on this git branch:
https://github.com/nobled/clang/tree/anal-plugin

0014-StaticAnalyzer-factor-into-an-AST-plugin.patch (77 KB)

Attached an updated patch--it had to be rebased against recent changes
in the Driver/Tools.cpp code.

Also available as broken-down discrete steps on this git branch:
https://github.com/nobled/clang/tree/anal-plugin

...and here are the corresponding updates to the test/ directory
(separated to fit just under the 128kb list limit). There is a ton of
churn in these test/Analysis/* files, and the diffs are really hard to
read because the RUN: lines are so long(especially after this patch),
so it'd be great to land this soon to end the tedious cycle of
rebasing and manually reinserting "-plugin-arg-analyzer" into whatever
tests were updated recently.

0015-update-all-Analysis-tests-to-new-plugin-interface.patch (120 KB)

A few additions to what I said before (some of which probably still need to be changed eventually):

AnalysisAction.h:
+ void toArgs(std::vector<std::string> &args);

Had a thought: since this isn't hooked up right now in CompilerInvocation anyway, why not make it a virtual method on PluginASTAction? Eventually we'll probably want to be able to serialize any plugin invocation anyway.

Or, the flip side: CompilerInvocation's FrontendOptsToArgs already blindly serializes all plugin options, so maybe we can just kill this.

FrontendActions.h:
+extern "C" void clang_LinkInStaticAnalyzerPlugin(void);

Yuck...but as far as I see, unavoidable in some form. This is a lot less brittle than relying on compiler options to make sure libClangStaticAnalyzer doesn't get left out.

FrontendActions.cpp:
+ // If -analyzer-checker-help was given, this should be a no-op.
+ if (Opts.ShowCheckerHelp)
+ return new ASTConsumer();

A few additions to what I said before (some of which probably still need to be changed eventually):

AnalysisAction.h:
+ void toArgs(std::vector<std::string> &args);

Had a thought: since this isn't hooked up right now in CompilerInvocation anyway, why not make it a virtual method on PluginASTAction? Eventually we'll probably want to be able to serialize any plugin invocation anyway.

Yeah, I figured I would add that method after this series so the
functionality would become available again.

Or, the flip side: CompilerInvocation's FrontendOptsToArgs already blindly serializes all plugin options, so maybe we can just kill this.

FrontendActions.h:
+extern "C" void clang_LinkInStaticAnalyzerPlugin(void);

Yuck...but as far as I see, unavoidable in some form. This is a lot less brittle than relying on compiler options to make sure libClangStaticAnalyzer doesn't get left out.

FrontendActions.cpp:
+ // If -analyzer-checker-help was given, this should be a no-op.
+ if (Opts.ShowCheckerHelp)
+ return new ASTConsumer();
+

This isn't really a no-op, since it means the compiler will still go through the whole parsing phase to build an AST. I *think* it's safe to return NULL here, but I haven't tried it.

Ah, you're right. I tested it out with my patch, and running this:

$ bin/clang -cc1 -plugin analyzer -plugin-arg-analyzer -analyzer-checker-help

will print the help, then hold, waiting for the input file on STDIN.

However, current behavior is broken in its own way already, as far as
I can tell:

$ clang --analyze -Xanalyzer -analyzer-checker-help
clang-3: error: no input files
$ clang -cc1 -analyze -analyzer-checker-help || echo returned nonzero!
<the normal -analyzer-checker-help spew>
returned nonzero!

...So if I wanted, I can just preserve the current behavior by
returning 'false' from ParseArgs, indicating failure, without actually
setting a diagnostic. Which is what the current
return-zero(false)-from-ExecuteCompilerInvocation code does,
effectively.

+ // TODO: CreateASTConsumer() is far too late to add a macro, and
+ // ParseArgs() is far too early for getCompilerInstance() to work yet,
+ // so we have to switch to a non-const CompilerInstance argument here
+ // in order to remove the plugin-specific hack from InitPreprocessor.cpp.
+ //CI.getPreprocessorOpts().addMacroDef("__clang_analyzer__");

How about BeginInvocation? Or BeginSourceFileAction? FrontendAction has those hooks specifically for this sort of thing. I haven't /tried/ it, but it's before the Preprocessor is created, and after the CompilerInstance is initialized with a file. Yes, it means we add the option for every file, but it's not like InitPreprocessor isn't doing the same work now.

(Sorry, the fixme here is really bugging me.)

Ah, of course. Done in BeginInvocation now. (I see that you mentioned
it before, but I got stuck on that not-per-file idea, doh.) There's
one small peculiarity with using addMacroDef() (the define gets listed
alongside all the driver/user-specified -D commandline options in the
preprocessed file instead of with the builtins), but it's not one that
should really matter.

+static void AnalyzerOptsToArgs(const AnalyzerOptions &Opts,
+ std::vector<std::string> &Res) {
+ if (Opts.ShowCheckerHelp)
+ Res.push_back("-analyzer-checker-help");

It's not so important (since this isn't ever called right now), but this function isn't correct anymore -- all of these options would need "-plugin-arg-analyzer" in front. But again, since plugin args are already blindly serialized by CompilerInvocation, fixing this function may not be worth it.

(Also, there's probably no reason anymore to have this outside of AnalysisAction::toArgs.)

Longer-term (can fix later): once all the analyzer plugin args go through -plugin-arg-analyzer, we can strip the 'analyzer' prefix off each option.

Yep, I had the same thought.

0016-StaticAnalyzer-factor-into-an-AST-plugin.patch (78.6 KB)

(And the tests/ patch again...)

0017-update-all-Analysis-tests-to-new-plugin-interface.patch (122 KB)

...ping?