Evaluation of Symbolic Expressions?


I have some trouble with symbolic expressions. Either I expect too much from the Static Analyzer, or – more probable – I use the wrong functions.

I have two different problems where I do not get what I expect.

The first problem seems very simple: I want to compare two symbolic expressions, one of them is something like {conj_1}, the other one is {conj_1} + 1. Using SValBuilder::evalBinOpNN() I get UnknownVal for comparison with e.g. BO_GE or BO_LT. I thought that these very simple expressions are easy to evaluate. I only get concrete integer result (1) if the two sides of the comparison are exactly the same (e.g. {conj_1} >= {conj_1}).

I also tried subtracting them from each other. Using SValBuilder::evalBinOpNN() I also get UnknownVal for subtraction, and for Symbolmanager::getSymSymExpr() I get symbolic expression ({conj_1} + 1) - {conj_1}, and then for the comparison to zero using SValBuilder::evalBinOpNN() I get symbolic expression ({conj_1} + 1) - {conj_1} > 0 instead of a concrete integer (1). This method is worse since it does not work for expressions like {conj_1} - {conj_1} >= 0 either.

The second problem seems also simple: I make an assumption for the state, like {conj_1} == {conj_2}. If I dump the state, I see the range of {conj_1} == {conj_2} [1 … max]. Later, I have comparison where I check e.g. {conj_1} >= {conj_2}, but I get UnknownVal. Checking {conj_1} == {conj_2} does not help either. I also tried the subtraction here, thus to store {cont_1} – {cont_2} in the state (range was either [0…0] or [-max…-1], [1…max]), but it did not help either.

So, how to do it correctly? I assume that such basic symbolic calculations are part of some checkers which work for a while now. Any ideas?

Thanks in advance!




The analyzer's knowledge of algebra is actually extremely minimal - it handles "x > 1 && x < 3" pretty well, but not "x > x + 1" or "x == y". This is because a full-featured *constraint solver* would be slow even for a few equations, and we'd often throw thousands of equations at him during analysis of a single file. So we have a simplified solver, called RangeConstraintManager, which handles only simple equations, but does it very quickly.

Currently we are discussing a patch that would provide the analyzer with a full-featured solver (Microsoft's Z3, see https://reviews.llvm.org/D28952), but we have very little idea of how practical that would be.

Whenever you're seeing an UnknownVal, it indicates that we failed to support an operation for some reason. Most checkers react to this by skipping the analysis, deciding that the code seems to be too complex to analyze, and it's more important to avoid false positives than to try find a bug in a code we don't understand.

Even though a SymSymExpr of desired shape can be constructed manually, it would not be of much use because the constraint manager would fail to handle it properly anyway.

If you definitely need to solve complicated constraints in your checker, you might want to cover your cases in the constraint manager, which is easily extensible. Or you may want to maintain some extra information on the checker side, like "this iterator object is to the left from that iterator object" - you could terminate the analysis when an iterator is both to the left and to the right from another iterator at the same time.

P.S. By the way, "x > x + 1" is quite possible, say for x == INT_MAX. That's quite different from "x - (x + 1)", which may be equal to 1 with defined overflow semantics (in C standard, signed integer overflow is undefined behavior, but in analyzer we assume defined overflow semantics, because a lot of people rely on them anyway on their platforms, and we shouldn't tell them to fix code that works).