Clang static analyzer: ignore library headers?

I’d want to enable analysis of project headers (-analyzer-opt-analyze-headers), but would like to ignore third-party library headers. Currently I have an ad hoc, in-checker, solution to this problem, but was wondering if there’s an easier way. Would be nice if there was a blacklist of directories or something.

Thanks,
Aemon

We don't have a list of known third-party headers anywhere, but having this seems generally useful.

Anna.

Hi Anna,

I would be happy with some way to specify a local blacklist. At the moment, once I enable the analyzer-header option, it doesn’t seem like I have any sway over which headers the checker is run over. I suspect my checker is doing a lot of extra work grinding through library headers.

(it may help to know that this is an intraprocedural analysis, so I don’t care about chasing arguments into libraries)

Thanks, Aemon

Are you mostly concerned abut performance implications of analyzing third party headers?
One might also not want to show issues reported in those, in which case the black list might also be useful.

You could add another option to the analyzer that enables checking only in the project headers. This would be similar to how !AnalyzeAll option works - we just skip the functions defined in headers in AnalysisConsumer. (Note, that if you just wanted to suppress the issues, you could most likely do it from the checker.)

If you are to add another option, note that we are switching to a new method of defining options - you should not need to modify CompilerInviocation, for example, see “shouldReportIssuesInMainSourceFile” .

Anna.

Yes, I’m mostly interested in hiding issues from those headers (performance is a side benefit).

I’ll look into the approaches you suggested.

Thanks very much,
Aemon

Yes, I’m mostly interested in hiding issues from those headers (performance is a side benefit).

One thing to remember is that even if you don’t check the functions from the headers as the main entry points, you might still get issues from those files reported; specifically, while analyzing the functions from your source files that call them. Possibly, the best solution is to not analyze the extra functions(for performance) and suppress reports coming from the third party libraries as well. There is one issue though. Since the analyzer bugs correspond to paths, you might have a path that spans through your code and the third party code. There is usually no way of identifying which code actually causes the problem…

I think it would be generally valuable to have a mode where issues coming from black listed libraries would be hidden. You could do it from the checker as I’ve mentioned, but you could also add an experimental (off by default) option to the analyzer and suppress issues inside BugReporter. This would make it work for all checkers. shouldReportIssuesInMainSourceFile is a very good example of how you would do that.

Determining them based on -I vs -isystem would seem to be a sensible
distinction.

At least, that's what I use to include third party libraries when I want
to ignore warnings they produce. It's entirely possible I'm abusing
that feature.

Ben

It’s a bit of an abuse but likely not a harmful one. Huh, we don’t actually have that mode now. Regular mode runs path-sensitive checks on the main source file and non-path-sensitive checks on the main source file + non-system headers. -analyzer-opt-analyze-headers mode runs path-sensitive checks everywhere. Patches welcome?

As for an explicit blacklist, we’ve gotten that request before, but only for main source files: PR2706 and rdar://problem/9063762. A header file filter would be slightly different.

Jordan

Determining them based on -I vs -isystem would seem to be a sensible
distinction.

At least, that's what I use to include third party libraries when I want
to ignore warnings they produce. It's entirely possible I'm abusing
that feature.

It's a bit of an abuse but likely not a harmful one.

So, something like: -analyzer-opt-ignore-system-headers ?
This would fix my situation, provided I'm willing to go through my build
command and replace several instances of -I with -isystem for the purposes
of analysis. I don't think that would break the semantics of my build, but
I'm not sure you can say that in general. Changing from -I to -isystem will
cause the directory to be searched after all -I directories, so if your
build is order sensitive, that might cause problems.

Logistical question: does clang or llvm contain a good data-structure for efficient string lookup by prefix? I’m sketching out a blacklist patch with a vector at the moment.
Thanks

To be more specific: I want to compare a file path (possibly relative), to a list of strings and determine if any of the strings are a prefix of the path.

Aemon Cannon <aemoncannon@gmail.com>
writes:

To be more specific: I want to compare a file path (possibly relative), to
a list of strings and determine if any of the strings are a prefix of the
path.

Logistical question: does clang or llvm contain a good data-structure for
efficient string lookup by prefix? I'm sketching out a blacklist patch
with a vector<string> at the moment.

Not by prefix, but you can insert the paths and its parents on a
StringMap ($LLVM/include/llvm/ADT/StringMap.h) then decompose your file
path and lookup the resulting prefix paths.

But there is no need for fancy data structures: your vector<string>
(sorted) should be fast enough.

Normalizing the paths first would make things easier. Also, keep in
mind the case-insensitive file systems.

I have a work-in-progress patch I’d like to share. Feedback appreciated.

Goal:
I want to be able to analyze the code in my headers while ignoring code from third-party headers that are included in my sources. Adds a new analyzer option, analyzer-ignore-headers, for explicitly blacklisting headers by prefix.

TODO:

  • Silence warnings from code in blacklisted headers when that code is inlined into sources.
  • Check case-sensitivity of the local filesystem and choose string Compare implementation accordingly.

Misc:

  • I decided against the -isystem option because I wanted something more targeted that doesn’t require modifying the build command.
  • I’m not doing any path expansion or normalization, just a dumb string comparison. Is this sufficient?

Thanks!

diff --git a/include/clang/Driver/CC1Options.td b/include/clang/Driver/CC1Options.td
index 9acbd48…cc64c34 100644
— a/include/clang/Driver/CC1Options.td
+++ b/include/clang/Driver/CC1Options.td
@@ -111,6 +111,9 @@ def analyzer_checker_help : Flag<["-"], “analyzer-checker-help”>,
def analyzer_config : Separate<["-"], “analyzer-config”>,
HelpText<“Choose analyzer options to enable”>;

+def analyzer_ignore_headers : Separate<["-"], “analyzer-ignore-headers”>,

  • HelpText<“Ignore headers by prefix.”>;

Hi Aemon,

Below are some high-level comments.

Thanks for working on this!
Anna.

I have a work-in-progress patch I’d like to share. Feedback appreciated.

Generally, we send patches to cfe-dev.

Goal:
I want to be able to analyze the code in my headers while ignoring code from third-party headers that are included in my sources. Adds a new analyzer option, analyzer-ignore-headers, for explicitly blacklisting headers by prefix.

TODO:

  • Silence warnings from code in blacklisted headers when that code is inlined into sources.
  • Check case-sensitivity of the local filesystem and choose string Compare implementation accordingly.

Misc:

  • I decided against the -isystem option because I wanted something more targeted that doesn’t require modifying the build command.
  • I’m not doing any path expansion or normalization, just a dumb string comparison. Is this sufficient?

Thanks!

diff --git a/include/clang/Driver/CC1Options.td b/include/clang/Driver/CC1Options.td
index 9acbd48…cc64c34 100644
— a/include/clang/Driver/CC1Options.td
+++ b/include/clang/Driver/CC1Options.td
@@ -111,6 +111,9 @@ def analyzer_checker_help : Flag<["-"], “analyzer-checker-help”>,
def analyzer_config : Separate<["-"], “analyzer-config”>,
HelpText<“Choose analyzer options to enable”>;

+def analyzer_ignore_headers : Separate<["-"], “analyzer-ignore-headers”>,

  • HelpText<“Ignore headers by prefix.”>;

We are moving toward not exposing/implementing analyzer options support in the clang Driver code, but rather keeping it within the analyzer. You can see that some of the options are implemented that way (for example, see AnalyzerOptions::mayInlineCXXMemberFunction). Would it be possible to implement “analyzer-ignore-headers” in the same way?

//===----------------------------------------------------------------------===//
// Migrator Options
//===----------------------------------------------------------------------===//
diff --git a/include/clang/StaticAnalyzer/Core/AnalyzerOptions.h b/include/clang/StaticAnalyzer/Core/AnalyzerOptions.h
index 618782e…45e75d1 100644
— a/include/clang/StaticAnalyzer/Core/AnalyzerOptions.h
+++ b/include/clang/StaticAnalyzer/Core/AnalyzerOptions.h
@@ -122,6 +122,7 @@ public:

/// \brief Pair of checker name and enable/disable.
std::vector<std::pair<std::string, bool> > CheckersControlList;

/// \brief A key-value table of use-specified configuration values.
ConfigTable Config;
@@ -267,6 +268,8 @@ public:
/// \sa CXXMemberInliningMode
bool mayInlineCXXMemberFunction(CXXInlineableMemberKind K);

  • bool headerIsBlacklisted(StringRef IncludePath);

/// Returns true if ObjectiveC inlining is enabled, false otherwise.
bool mayInlineObjCMethod();

diff --git a/lib/Frontend/CompilerInvocation.cpp b/lib/Frontend/CompilerInvocation.cpp
index b0d7a0a…6ed2abc 100644
— a/lib/Frontend/CompilerInvocation.cpp
+++ b/lib/Frontend/CompilerInvocation.cpp
@@ -280,6 +280,28 @@ static bool ParseAnalyzerArgs(AnalyzerOptions &Opts, ArgList &Args,
}
}

  • // Go through the analyzer header blacklist.
  • for (arg_iterator it = Args.filtered_begin(OPT_analyzer_ignore_headers),
  • ie = Args.filtered_end(); it != ie; ++it) {
  • const Arg *A = *it;
  • A->claim();
  • StringRef headerList = A->getValue();
  • SmallVector<StringRef, 4> headerVals;
  • headerList.split(headerVals, “,”);
  • for (unsigned i = 0, e = headerVals.size(); i != e; ++i) {
  • StringRef val = headerVals[i];
  • if (val.empty()) {
  • // TODO(aemon): properly report problem
  • // Diags.Report(SourceLocation(),
  • // diag::err_analyzer_header_no_value) << headerVals[i];
  • Success = false;
  • break;
  • }
  • Opts.HeaderBlacklist.push_back(val.str());
  • }
  • }
  • std::sort(Opts.HeaderBlacklist.begin(), Opts.HeaderBlacklist.end());

return Success;
}

diff --git a/lib/StaticAnalyzer/Checkers/CMakeLists.txt b/lib/StaticAnalyzer/Checkers/CMakeLists.txt
index 392995e…123e84f 100644
— a/lib/StaticAnalyzer/Checkers/CMakeLists.txt
+++ b/lib/StaticAnalyzer/Checkers/CMakeLists.txt
@@ -50,6 +50,8 @@ add_clang_library(clangStaticAnalyzerCheckers
ObjCMissingSuperCallChecker.cpp
ObjCSelfInitChecker.cpp
ObjCUnusedIVarsChecker.cpp

  • OwnershipChecker.cpp
  • TracingChecker.cpp
    PointerArithChecker.cpp
    PointerSubChecker.cpp
    PthreadLockChecker.cpp
    diff --git a/lib/StaticAnalyzer/Core/AnalyzerOptions.cpp b/lib/StaticAnalyzer/Core/AnalyzerOptions.cpp
    index 9dcf58b…3cfde27 100644
    — a/lib/StaticAnalyzer/Core/AnalyzerOptions.cpp
    +++ b/lib/StaticAnalyzer/Core/AnalyzerOptions.cpp
    @@ -96,6 +96,16 @@ AnalyzerOptions::mayInlineCXXMemberFunction(CXXInlineableMemberKind K) {
    return CXXMemberInliningMode >= K;
    }

+bool AnalyzerOptions::headerIsBlacklisted(StringRef IncludePath) {

  • if (HeaderBlacklist.size() == 0) return false;
  • std::vectorstd::string::const_iterator SI =
  • lower_bound(HeaderBlacklist.begin(), HeaderBlacklist.end(),
  • IncludePath);
  • if (SI != HeaderBlacklist.end() && *SI == IncludePath) return true;
  • if (SI == HeaderBlacklist.begin()) return false;
  • return IncludePath.startswith(*(–SI));
    +}

static StringRef toString(bool b) { return b ? “true” : “false”; }

bool AnalyzerOptions::getBooleanOption(StringRef Name, bool DefaultVal) {
diff --git a/lib/StaticAnalyzer/Frontend/AnalysisConsumer.cpp b/lib/StaticAnalyzer/Frontend/AnalysisConsumer.cpp
index 9efe997…0a13906 100644
— a/lib/StaticAnalyzer/Frontend/AnalysisConsumer.cpp
+++ b/lib/StaticAnalyzer/Frontend/AnalysisConsumer.cpp
@@ -414,13 +414,23 @@ void AnalysisConsumer::HandleTopLevelDeclInObjCContainer(DeclGroupRef DG) {
}

void AnalysisConsumer::storeTopLevelDecls(DeclGroupRef DG) {

  • SourceManager &SM = Ctx->getSourceManager();
    for (DeclGroupRef::iterator I = DG.begin(), E = DG.end(); I != E; ++I) {

// Skip ObjCMethodDecl, wait for the objc container to avoid
// analyzing twice.
if (isa(*I))
continue;

I’d add this to AnalysisConsumer::getModeForDecl(), where the analysis mode/policy is defined, instead of AnalysisConsumer::storeTopLevelDecls().

  • SourceLocation SL = (*I)->getLocation();
  • if (!SM.isInMainFile(SL) && SL.isFileID()) {
  • if (const char* IncludePath = SM.getBufferName(SL)) {
  • StringRef S(IncludePath);
  • if (Opts->headerIsBlacklisted(S)) {
  • continue;
  • }
  • }
  • }

LocalTUDecls.push_back(*I);
}
}

There is no test for the command line case.

Hi Aemon,

Below are some high-level comments.

Thanks for working on this!
Anna.

I have a work-in-progress patch I’d like to share. Feedback appreciated.

Generally, we send patches to cfe-dev.

I ment to say “we send patches to cfe-commits mailing list (http://clang.llvm.org/hacking.html)”.