evaluating a LAnd expr in the static checker

I am trying to construct a constraint that says 0 <= Offset <= Size, for two SVals Offset and Size. Here’s what I did:

ASTContext &ACtx = ChCtx.getASTContext();

SValBuilder &SVBldr = St->getStateManager().getSValBuilder();

SVal Zero = SVBldr.makeZeroVal(ACtx.getSizeType());

SVal OffsetIsNonNeg = SVBldr.evalBinOpNN(St, BO_LE, Zero.castAs(), Offset.castAs(), ACtx.BoolTy);

SVal OffsetIsWithinSize = SVBldr.evalBinOpNN(St, BO_LE, Offset.castAs(), BuffSize.castAs(), ACtx.BoolTy);

SVal OffsetIsWithinBounds = SVBldr.evalBinOpNN(St, BO_LAnd, OffsetIsNonNeg.castAs(), OffsetIsWithinSize.castAs(), ACtx.BoolTy);

When I run this code, it fails with an error: Assertion `false && “Invalid Opcode.”’ failed. This is coming from evalAPSInt in BasicValueFactory.cpp. This function contains a comment that says “Land, LOr, Comma are handled specially by higher-level logic.” Land does not seem to be handled by higher-level logic in this case. I intend to work around this by asserting these as two separate constraints instead of combining them into one. But this looks like a bug to me, so thought I should raise the issue here.

Hello Raymond,

As I understand, you need to assume OffsetIsWithinBounds later. I suggest you to use another approach. Instead of making a "logical and", you can try to just "assume" operands of LAnd sequentially.

So, the first assume will be for OffsetIsNonNeg, and the next is for OffsetIsWithinSize.

04.06.2016 00:54, McDowell, Raymond C. via cfe-dev пишет:

For the kind of constraint you specified, ProgramStateRef <http://clang.llvm.org/doxygen/namespaceclang_1_1ento.html#a4e45a121820f0d80d8910093dd33a1df&gt;::assumeInBound\(\) method may also work well. <http://clang.llvm.org/doxygen/classclang_1_1ento_1_1ProgramState.html#a046c4b1b381d1f5fed26eda3f0646efb&gt;

04.06.2016 20:39, Alexey Sidorin via cfe-dev пишет:

Logical And and Logical Or have consequences for control flow (they are short-circuiting) and so are represented in the control flow graph. You can see how they are evaluated in ExprEngine::VisitLogicalExpr(). This is the “higher-level logic” referred to in the comment.

As you noted, you can apply a constraint that is a logical conjunction of OffsetIsNonNeg and OffsetIsWithinSize by using ConstraintManager::assume() to first apply the assumption that OffsetIsNonNeg is true to the current state. Then you can call assume() again on the resultant state to add the second assumption that OffsetIsWithinSize is true.

This also reflects an overall design decision in the analyzer to keep constraints relatively simple so they can be reasoned about very quickly. Rather than track complex, logical symbolic constraints within a single state, the analyzer will eagerly split into multiple states — each of which has simpler, range-based constraints.

For example, if you have:

void foo(int a, int b, int c) {
  assert((a == 1 || b == 2) || c == 3);
  // (*)
}

Rather than produce a single state at (*) with the complex constraint “ ((‘$a’ equals 1 OR ‘$b’ equals ‘2’) OR ‘$c’ equals ‘3’)“ it will produce three separate states:

1)
$a: [1,1]

2)
$a: [INT_MIN, 0], [2,INT_MAX]
$b: [2,2]

3)
$a: [INT_MIN, 0], [2,INT_MAX]
$b: [INT_MIN, 1], [3,INT_MAX]
$c: [3,3]

The analyzer will then explore the path from these three states separately. In effect the analyzer has moved the disjunction for the complex constraint out to the state level.

This approach keeps the constraint manager simple and *very* fast, at the cost of eager state splits and imprecision when reasoning about relationships. In particular, the analyzer currently has no way to reason about symbolic constraints involving relationships between unknowns like "a <= b < c” — which would very useful for things like array-bounds checking!

Ryan Govostes and Bhargava Shastry have been working on improving the analyzer’s reasoning about symbolic relational constraints with an SMT solver (STP, in this case). You can see their work at <http://reviews.llvm.org/D15609&gt;\.

Devin

Hello Devin,

I think your explanation is great, and that's what i thought when i first stepped on this constraint myself. But i also think that we could provide some useful features and improve user experience here.

I mean, it might be possible that the user (checker author) has a couple of SVal's and wants to conduct a "mathematical" logical-and/or on them, and doesn't care that in some programming languages (even if it's all the languages we support) there's a short-circuit rule. And we can support the user - constant-fold the resulting value if it constant-folds, perhaps construct a symbolic expression if it doesn't, wait until this expression gets alpha-renamed to something more sensible (when things start getting alpha-renamed). Probably, if the user constructs an LAnd (unlike LOr), and pass it into assume(), we could perform two sequential assumes under the hood. We just need to make sure we don't do that in ExprEngine.

We may change the name for the operation to make the distinction more clear and warn the user that what he's doing is different from his normal LAnd/LOr expression (in fact, we could provide a separate enum for operation kinds, instead of re-using the BO_*, and that'd demonstrate explicitly what operations we do support and how operations on symbols are different from AST expressions).

In fact, i've got a few more ideas about custom operations, eg. "bignum-arithmetic-less-than" (so that assumption "$x + 1 <= $x" resulted in a null state without overflows, but $x still mapped into a sensible integral type in the AST context) - sounds pretty questionable, but such things make vague sense when i try to imagine how summary-ipa might be automated with ghost-variable-based mutex-stacks from pthread lock checker (there's no problem to work around lack of such operations, but it might be a lines-of-code-saving feature).