I am working on static symbolic analysis and came across Clang Static Analyzer.
However, I am interested only in getting the symbolic expressions statically and not the bug finding logic of static analyzer.
So far, I am able to view the graph (which show the value ranges for symbolic variables) using the following option:
clang -cc1 -analyze -analyzer-checker=core -analyzer-viz-egraph-graphviz test.c
Now, I want to get the actual symbolic expressions for these variables. How can I get it?
As I won’t be doing any bug analysis should I still write a checker just to access the analysis data, or is there any other way I can get the path symbolic expressions for all paths?
Also, what kind of solver does clang static analyzer use for path satisfiability?
Hello, Shweta. Currently, the static analyzer infrastructure isn’t set up to run without any bug reporting—the “core” checkers which check for things like “division by zero” and “null pointer dereference” are used both for generating warnings and for halting analysis along a certain path.
I’m not sure what you mean by “getting symbolic expressions statically”. The analyzer works in terms of symbolic and concrete values (SVals); the symbolic values have a typed representation with a basic hierarchical structure (SymExpr and SymbolData). In most parts of the analyzer, you don’t ask for the current value of a variable but rather construct an expression involving the variable and ask the current state (ProgramState) to evaluate it.
Our current constraint solver is fairly simple, keeping sets of possible (scalar) values for symbols and applying a set-union each time a new constraint is added. The implementation uses sets of ranges as a fairly compact way to represent an accumulation of constraints on integers. However, constraints on symbols are not the only way a path can be deemed infeasible; any checker can mark the path as such. This is useful for, say, exceptions, which involve a control flow transfer that we can’t follow without whole-program analysis.
Anna Zaks and I gave a talk at last year’s LLVM Developer Meeting, which you can watch here. It’s more aimed at checker writers, but the first part does contain a discussion of the high-level design of the analyzer and some of the important classes, including the ones I mentioned above.
Please feel free to e-mail us if you have specific questions about the static analyzer.
Thanks for your reply and sharing the talk details. It helped to get a better understanding of the static analyzer design.
In the talk it is mentioned that static analyzer collects the constraints on symbolic values along each path while building a graph of reachable program states.
I am trying to collect all the program points where a new constraint is added. In the end, I want to track where and what new constraints are being added for all variables in program. By this, I can construct a symbolic expression for symbolic variable.
To do this, I am considering the possibility of implementing a checker – just to access the program point and program state data stored in the node. This checker will not update the node state or generate a sink. Is it feasible? If a sink node is not added, will the analysis terminate?
Although the constraint solver doesn’t handle bitwise operations and multiple symbols, is there a way to just generate these constraints and then use a better solver, perhaps STP?
There is a checker callback for when constraints are added: checkAssume. However, the form in which you receive the assumption. will be the form it’s constructed in the source, which may not be the most conducive to actually tracking constraints being added. (Example: “x + 7 > 42”.) Deconstructing this properly would require duplicating a lot of the work inside the existing constraint solver.
One possible alternative would be to implement your own constraint solver, which would wrap or share an implementation with RangeConstraintManager. I’m not sure whether this will be useful for your purposes, however.
It’s interesting that you bring up STP in particular, because Ryan Govostes was working on a patch to add STP as an optional alternate constraint solver. Though it has some rough edges, it was basically working, though not quite integrated into the build system. It hasn’t gone in yet because Ryan’s been busy, and because we’ve been too busy to help in review. One consequence of this is that STP is much slower than our range-based constraint manager, so the cost of the analyzer’s upper-bound-exponential algorithm is much worse. Still, we’d like to get that in eventually for people who are willing to throw a lot of time at a problem in order to get better results.
Thanks for the reply!
I will use the checkAssume callback and check if I can manage to get what I want.
It will be great if STP is made available as an alternative in cases where precise results is the priority.
For now, I would like to check if the slowdown due to Ryan’s patch is significant or trivial for my purposes.
Can I get the patch for checking the slowdown on my tests?
Please find attached a preliminary version of my patch. I am curious to hear from you how well it works.
The STP-based constraint manager is implemented as a plug-in which lives in clang/tools/smt-constraint-manager/. There is a README file in that directory that explains how to build and use the plug-in. I have only tested it on OS X.
The README also details some caveats that will affect both accuracy and performance.
Note that with this patch applied, the static analyzer will perform worse even when the plug-in is not active, because I had to disable an optimization in SValBuilder::makeSymExprValNN. I also hope to solve this before submitting the patch, but please keep this in mind if you are planning on running benchmarks.
ento-stp-shweta.diff (39.6 KB)
Thanks a lot for the patch! I will report back after checking how it works for my analysis and benchmarks.
I have been trying to get your Clang SA STP patch to work on a local
fork of Clang (3.7, commit 3457cd5). After making a few changes, I got
it to compile on my Unix system (Ubuntu 12.04) which is great because
(according to the README) it hasn't been tested on non-Mac systems.
However, I run into issues when I invoke your plug in, like so:
clang -Xclang -load -Xclang
-analyzer-constraints=smt --analyze tmp.c
: CommandLine Error: Option 'debug' registered more than once!
fatal error: error in backend: inconsistency in registered CommandLine
clang-3.7: error: clang frontend command failed with exit code 70 (use
-v to see invocation)
clang version 3.7.0 (http://llvm.org/git/clang.git
Thread model: posix
clang-3.7: note: diagnostic msg: PLEASE submit a bug report to
http://llvm.org/bugs/ and include the crash backtrace, preprocessed
source, and associated run script.
clang-3.7: note: diagnostic msg: Error generating preprocessed source(s).
Any ideas what might be happening here. My patch attached. I am happy to
contribute to a future stp patch to clang sa!
0001-Experimental-stp-commit-that-compiles-but-doesn-t-ru.patch (41.4 KB)
I realized the previous patch is too big for review. The relevant diff
to get your STP patch to work on top of (clang 3457cd) is attached. Hope
clang3.7-smt.patch (10.7 KB)
Update: I got it to work in a hacky way. Biggest problem was figuring
out a build configuration in which minisat and stp libraries could be
statically linked into the SMTConstraintMgr plugin.