Path constraints at function call in checker

Hello,

Is there a simple way to enumerate all of the constrained symbolic values (and constraints) required to reach a function call (where the call is outside the current translation unit) via symbolic execution in a Clang checker? For example, I'd like to get information about x, y and z at in this case:

int x = read();
int y = read();
int z = read();
if (x > 1 && y > 2 && z > 3) {
  foo(x);
}

The desired information would be something roughly analogous to { (x, [2, 2147483647]), (y, [3, 2147483647]), (z, [4, 2147483647]) }. I've tried a few different methods to elicit this information in the PreCall and PostCall hooks, but it seems that the ConstraintManager only is tracking constraints on the arguments to `foo`.

I've reviewed the available documentation and re-watched the "Building a Checker in 24 Hours" talk (which was great BTW). I've also taken a look at some of the existing checkers to see how I might inform the execution to track constraints on these values, but nothing is jumping out at me.

I know there are some advertised limitations on symbolic execution in the checker framework but it seems like this should be possible. I'm new to clang internals so pardon if I missed something obvious.

Thanks,
Jon

The desired information would be something

> roughly analogous to { (x, [2, 2147483647]),
> (y, [3, 2147483647]), (z, [4, 2147483647]) }.

The easiest way to *have a look* at these constraints is to dump() the program state and have them all printed out.

On the other hand, no matter how natural it is for the symbolic execution paradigm to be all about symbols and constraints, the analyzer doesn't provide a way to enumerate and act upon these constraints directly from, say, checker code.

The reason for this limitation is that the exact shape and contents of these constraints are defined by a replacable module - that implements the ConstraintManager interface - and different implementations of a ConstraintManager would store different information about symbols. The default implementation is called RangeConstraintManager. It can probably be replaced with a full-featured SAT solver, which would present the symbol range information in a completely different manner.

So if you want to obtain such information, then you probably have no choice but to add a pure-virtual method into ConstraintManager for obtaining all the info you need, and then implement it in RangeConstraintManager. Such method would be similar to print(). Ranges stored directly in RangeConstraintManager's program state trait are exactly what you need. If you're looking into printing the information to the user in a friendly manner, SValExplainer may come out handy for pretty-printing the symbols themselves.

I think the issue here is that ‘y’ and ‘z’ are dead at the call to foo(). The analyzer is very aggressive about removing “dead” symbols that are no longer reachable. Because y and z are not read after the guard condition, the analyzer will remove both their bindings in the store and the constraints on the symbols they are bound to before the call to foo(). You should still be able to reconstruct the path condition by traversing the exploded node graph backward and looking for changes in constraints between a node and one if its immediate predecessors. There is code in BugReporterVisitors.cpp that already does something similar for the purpose of adding helpful path notes to diagnostics (TrackConstraintBRVisitor::VisitNode) — you may be able to adapt it for your needs.

Devin