[analyzer] Simple Example produces an inconsistent result

Hello Scott,

You have touched a very sensitive moment.

TL;DR: The reason warning does not appear is that function is not analyzed out of context if it was inlined before. Functions are analyzed in topological order. In your case, inlining of 'foo' does not touch (len < 10 == true) branch so it will never be analyzed.

But, more thoroughly, CSA has some issues with topological sorting. First, it is not clear if topological sorting is really required since it throws some possible execution paths away (as in Scott's example). Second, it does not work as expected because of some call graph issues (I made some attempts to resolve them and hope to contribute these patches).

However, the assumption that function is analyzed only in given contexts can reduce amount of false positives.

In our opinion, we should analyze all the functions and skip out-of-context analysis of functions that are:
1. not externally visible or
2. are private class members.

But this is a subject for discussion.

+1, currently we loose lots of interesting paths for e.g. shared libs.

-Y

Hello Scott,

You have touched a very sensitive moment.

TL;DR: The reason warning does not appear is that function is not analyzed out of context if it was inlined before. Functions are analyzed in topological order. In your case, inlining of ‘foo’ does not touch (len < 10 == true) branch so it will never be analyzed.

But, more thoroughly, CSA has some issues with topological sorting. First, it is not clear if topological sorting is really required since it throws some possible execution paths away (as in Scott’s example).

The analyzer throws away the execution paths that cannot be triggered by the current translation unit mainly as a performance heuristic. Topological order just ensures that we maximize the effect of this heuristic. If we were to inline leaf functions first, it would not get triggered at all. This can also reduce a number of false positives since the called function might assume some knowledge about its callers.

Second, it does not work as expected because of some call graph issues (I made some attempts to resolve them and hope to contribute these patches).

I cannot comment on this since I do not know what they are.

However, the assumption that function is analyzed only in given contexts can reduce amount of false positives.

Yes, that is one of the advantages of the current approach. (And would definitely be a very a good reason for keeping this mode for cross translation unit analysis.)

In our opinion, we should analyze all the functions and skip out-of-context analysis of functions that are:

  1. not externally visible or
  2. are private class members.

This can definitely be done behind a flag. I expect the performance impact to be significant.

But this is a subject for discussion.

Hi All,

Given the following code:

// test.cpp
int foo(int len) {
int j = 0;
if (len < 10)
j = 42 / j;
return j;
}

the command

clang --analyze test.cpp

issues the bug report

tu.cpp:6:10: warning: Division by zero
j = 42 / j;


However, it seems that merely introducing another function which calls
foo() with an argument that would not trigger a division by zero nullifies
the bug report. For instance, analyzing

// test.cpp
int foo(int len) {
int j = 0;
if (len < 10)
j = 42 / j;
return j;
}

void bar() {
int m = 12;
foo(m);
}

in the same way will NOT issue a bug report. Isn't this a bug in the static
analyzer?

What you see here is the following heuristic being triggered:

void AnalysisConsumer::HandleDeclsCallGraph(const unsigned LocalTUDeclsSize) {
// Build the Call Graph by adding all the top level declarations to the graph.
// Note: CallGraph can trigger deserialization of more items from a pch
// (though HandleInterestingDecl); triggering additions to LocalTUDecls.
// We rely on random access to add the initially processed Decls to CG.
CallGraph CG;
for (unsigned i = 0 ; i < LocalTUDeclsSize ; ++i) {
CG.addToCallGraph(LocalTUDecls[i]);
}

// Walk over all of the call graph nodes in topological order, so that we
// analyze parents before the children. Skip the functions inlined into
// the previously processed functions. Use external Visited set to identify
// inlined functions. The topological order allows the “do not reanalyze
// previously inlined function” performance heuristic to be triggered more
// often.

And also some more details on this here:

static bool shouldSkipFunction(const Decl *D,
const SetOfConstDecls &Visited,
const SetOfConstDecls &VisitedAsTopLevel) {
if (VisitedAsTopLevel.count(D))
return true;

// We want to re-analyse the functions as top level in the following cases:
// - The ‘init’ methods should be reanalyzed because
// ObjCNonNilReturnValueChecker assumes that ‘[super init]’ never returns
// ‘nil’ and unless we analyze the ‘init’ functions as top level, we will
// not catch errors within defensive code.
// - We want to reanalyze all ObjC methods as top level to report Retain
// Count naming convention errors more aggressively.
if (isa(D))
return false;

// Otherwise, if we visited the function before, do not reanalyze it.
return Visited.count(D);
}