clang static analyzer checker for unreachable code

Dear cfe-dev,

I am testing out the alpha.deadcode.UnreachableCode checker and I have found an interesting case where it fails to find an unreachable block. Here is the minified version of the code:

void f()
{
  int *i = new int;
  if (!i) {
    return; // this code is unreachable ("new" throws an exception if not enough memory)
  }
  for (int j = 0; j < 4; ++j) {
  }
}

I’m running the analyzer with “clang -cc1 -analyze -analyzer-checker=alpha.deadcode.UnreachableCode f.cpp”.

If I decrease the loop boundary as follows:

void f()
{
  int *i = new int;
  if (!i) {
    return;
  }
  for (int j = 0; j < 3; ++j) { // NOTE: changed 4 to 3
  }
}

then the CSA reports the unreachable code correctly:

f.cpp:5:5: warning: This statement is never executed
    return;
    ^~~~~~
1 warning generated.

I suspect that the problem is due to the default unrolling depth of loops during analysis, but I don’t understand how exactly this unrolling interacts with the unreachable code checker.

Could anyone confirm if this is the expected behavior (known limitation) of the checker or if this is a bug?

Best,
Stefan

Finding dead code is an "all-paths" kind of problem: in order to find dead code, we need to see all possible execution paths in the program and observe that none of them actually goes through the code.

Symbolic execution - the method of the analyzer - is best suited for "one-path" problems, where finding one problematic path is enough (e.g. null dereferences, memory leaks, various kinds of use-after-check). Note, however, that check-after-use, as in TestAfterDivZeroChecker, is an "all-paths" problem: if the check follows the use on some paths but not on others, the check makes sense and the code cannot be immediately flagged as buggy.
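To make the check-after-use point concrete, here is a small sketch. The function divThenCheck and its parameters are hypothetical and are not taken from the analyzer or its tests. The check on d follows a division by d on one path only, so the check is not redundant and a one-path argument is not enough to flag it.

```cpp
// Hypothetical function, for illustration only.
// Returns x / d when the division runs, and -1 when d is zero.
int divThenCheck(int x, int d, bool skipDivision) {
  int r = 0;
  if (!skipDivision)
    r = x / d;   // "use": d is a divisor, but only on this path
  if (d == 0)    // "check": it follows the use on the path above only
    return -1;   // still meaningful when the division was skipped
  return r;
}
```

On the path where skipDivision is true, the test of d is the only thing guarding the caller, so a warning would be justified only if every path reaching the check had already divided by d.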

The experimental deadcode checker tries to utilize symbolic execution to solve this all-paths problem with the following straightforward trick: only report warnings when we believe that all possible execution paths were covered during the current analysis. Otherwise we might have missed the path on which the code is live.
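The trick can be sketched with a toy model. This is not the analyzer's actual implementation: Block, Exploration, explore, deadBlocks and the step budget are all assumptions made up for illustration. Edges in the toy graph stand for feasible transitions only, and the budget stands in for whatever limits make the real engine stop early.

```cpp
#include <cstddef>
#include <set>
#include <vector>

// Hypothetical toy model, not the analyzer's real data structures.
struct Block {
  std::vector<int> succs; // feasible successor blocks, by index
};

struct Exploration {
  std::set<int> visited;  // blocks seen on at least one explored path
  bool exhausted = false; // true if the step budget ran out
};

// Walk every path starting at `block`, spending one step per block visit.
static void explore(const std::vector<Block> &cfg, int block, int &budget,
                    Exploration &res) {
  if (budget == 0) {
    res.exhausted = true; // gave up: some path was not followed to its end
    return;
  }
  --budget;
  res.visited.insert(block);
  for (int s : cfg[block].succs)
    explore(cfg, s, budget, res);
}

// Return the blocks to report as dead. Stay silent (empty result) unless
// the exploration from block 0 covered all paths within the budget.
std::vector<int> deadBlocks(const std::vector<Block> &cfg, int budget) {
  Exploration res;
  explore(cfg, 0, budget, res);
  std::vector<int> dead;
  if (res.exhausted)
    return dead; // we might have missed the path on which the code is live
  for (std::size_t b = 0; b < cfg.size(); ++b)
    if (!res.visited.count(static_cast<int>(b)))
      dead.push_back(static_cast<int>(b));
  return dead;
}
```

With an acyclic graph and a large enough budget, an unvisited block is reported. With a cycle (a loop that the toy never finishes unrolling) or a small budget, the exploration is marked as exhausted and nothing is reported, even though the same block is still unvisited.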

It's still nice that we rely on the single symbolic execution engine to model the semantics of various statements (e.g. we automatically know that operator new() doesn't return null). However, when we don't cover all paths, we cannot be certain of any results; in these cases a more straightforward data flow analysis would be better.

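In your example with the loop bound of 4, we give up on unrolling the loop, so we cannot be sure that all paths were covered, and the checker stays silent. With the bound of 3 the loop is fully unrolled, all paths are covered, and the warning is reported.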

Overall, the analyzer doesn't guarantee that any part of the code, or any path through it, will be reliably covered (even though we do fight for better coverage), so this checker will never be fully reliable. It may still find stuff, though, and maybe even some stuff that other approaches don't find.

Perhaps the current GSoC project on loop widening might be able to help with this, though it isn't the focus.

Because you're actively using this experimental feature, may I briefly ask about your experience with it? Do you appreciate the reports from this checker overall? Are such reports numerous, and are there many false positives? Do you think it's worth enabling for users by default?