[RFC][StaticAnalyzer] Fix false negative

Hello!

I am investigating a false negative for this code:

    void f(unsigned long a) {
      char s[10];
      if (a >= 20)
        s[a] = 'a'; // <- array index out of bounds
    }

This is how I run Clang:

$ clang -cc1 -analyze -analyzer-checker=alpha 1.c

When I set a breakpoint in ArrayBoundChecker::checkLocation() and look at ProgramState, it looks incomplete. here is the dump output:

    Store (direct and default bindings), 0x0 :

    Expressions:
     (0x75af750,0x75569b0) s[a] : &element{s,reg_$0<a>,char}
     (0x75af750,0x75569f0) 'a' : 97 S8b
     (0x75af750,0x75a40c0) s[a] = 'a' : 97 S8b
    Ranges are empty.

Before ArrayBoundChecker::checkLocation() is executed, dead symbols are removed from the ProgramState.
An ugly hack to fix the FN is to comment out the code in RangeConstraintManager::removeDeadBindings. Then the ProgramState dump output will become:

    Store (direct and default bindings), 0x0 :

    Expressions:
     (0x75ae780,0x75559d0) s[a] : &element{s,reg_$0<a>,char}
     (0x75ae780,0x7555a10) 'a' : 97 S8b
     (0x75ae780,0x75a30f0) s[a] = 'a' : 97 S8b
    Ranges of symbol values:
     reg_$0<a> : { [20, 18446744073709551615] }

With that ProgramState the bug is found by ArrayBoundChecker.

Now I would like to fix this FN properly. I am thinking about modifying below ExprEngine::ProcessStmt(). Existing code:

    void ExprEngine::ProcessStmt(const CFGStmt S,
                                 ExplodedNode *Pred) {
      // Reclaim any unnecessary nodes in the ExplodedGraph.
      G.reclaimRecentlyAllocatedNodes();

      const Stmt *currStmt = S.getStmt();
      PrettyStackTraceLoc CrashInfo(getContext().getSourceManager(),
                                    currStmt->getLocStart(),
                                    "Error evaluating statement");

      // Remove dead bindings and symbols.
      ExplodedNodeSet CleanedStates;
      if (shouldRemoveDeadBindings(AMgr, S, Pred, Pred->getLocationContext())){
        removeDead(Pred, CleanedStates, currStmt, Pred->getLocationContext());
      } else
        CleanedStates.Add(Pred);

      // Visit the statement.
      ExplodedNodeSet Dst;
      for (ExplodedNodeSet::iterator I = CleanedStates.begin(),
                                     E = CleanedStates.end(); I != E; ++I) {
        ExplodedNodeSet DstI;
        // Visit the statement.
        Visit(currStmt, *I, DstI);
        Dst.insert(DstI);
      }

      // Enqueue the new nodes onto the work list.
      Engine.enqueue(Dst, currBldrCtx->getBlock(), currStmtIdx);
    }

The reg_$0<a> symbol should be removed after the statement has been visited. I am thinking about:
* Call the RangeConstraintManager::removeDeadBindings after the for loop. The CleanedStates must still be populated properly before the for loop.
* Change so symbols used in currStmt are not seen as dead.

Do you have any comments?

Best regards,
Daniel Marjamäki

..................................................................................................................
Daniel Marjamäki Senior Engineer
Evidente ES East AB Warfvinges väg 34 SE-112 51 Stockholm Sweden

Mobile: +46 (0)709 12 42 62
E-mail: Daniel.Marjamaki@evidente.se

www.evidente.se

Hello Daniel,

> Expressions:
> (0x75af750,0x75569b0) s[a] : &element{s,reg_$0<a>,char}

> Ranges of symbol values:
> reg_$0<a> : { [20, 18446744073709551615] }

> Before ArrayBoundChecker::checkLocation() is executed, dead symbols are removed from the ProgramState.

I think i had recently made a patch for this issue: the range is removed from the constraint manager because the only reference to the symbol is from the element region index [in the environment value], which isn't accounted for during garbage collection. The review is http://reviews.llvm.org/D12726, you can see if this patch fixes the problem.

Whoops, sorry for the accidental comma in the URL. The correct link is http://reviews.llvm.org/D12726 - and i also checked that your problem actually gets fixed this way :slight_smile:

Hello Artem!

I think i had recently made a patch for this issue: the range is removed
from the constraint manager because the only reference to the symbol is
from the element region index [in the environment value], which isn't
accounted for during garbage collection. The review is
http://reviews.llvm.org/D12726, you can see if this patch fixes the problem.

Great! That patch fixes my false negative also. :slight_smile:

Best regards,
Daniel Marjamäki

..................................................................................................................
Daniel Marjamäki Senior Engineer
Evidente ES East AB Warfvinges väg 34 SE-112 51 Stockholm Sweden

Mobile: +46 (0)709 12 42 62
E-mail: Daniel.Marjamaki@evidente.se

www.evidente.se