Also I found this paper, “A Memory Model for Static Analysis of C Programs”.
We’ve recently gave another talk on the static analyzer. See “Building a checker in 24 hours” at http://llvm.org/devmtg/2012-11/ The talk is targeted at someone writing a new check extension to the analyzer so it only has a very high level description of the internals; but I am sure you will find it useful.
The Front-end and the memory model are quite attractive, so I want to build
our work based on them.
The website says the engine do symbolic execution. I think it can be
modified for abstract interpretation.
Hence, I need some advice on where should I hack to achieve followings.
(i) Apply another algorithm to traverse CFG for finding “fixed-point”.
The ExprEngine/CoreEngine are responsible for walking the CFG. We unroll loops several times by default, but that is configurable. We stop when reaching the fixed point, by caching out on a previously seen state.
Take a look at AnalysesConsumer.cpp, which initiates the traversal.
(ii) Specify “abstract domain” which to my knowledge is similar to symbolic
value and constraints.
ConstraintManager/SimpleConstraintManager/RangeConstraintManager is the module responsible for keeping track of constraints on the symbolic values. I would suggest taking a look at it to see if you could implement an abstract domain by replacing/creating your own the ConstraintManager. (It’s hard to say if that would be enough/relevant without the details about the abstract domain.)
(iii) Define state and transfer function based on the domain.
If you are only interested in domain specific transfer functions, you should be able to implement them by writing a “checker”. (See the talk, which I’ve referenced above.) That would be the least intrusive way of using/interacting with the analyzer. I’d recommend writing a checker to get a feeling of how the analyzer works as well.
While elements of the symbolic execution employ concepts from abstract interpretation, the analyzer engine itself does not do a fix point computation to compute the set of possible abstract states at a given program point. Building that kind of solver is completely doable, but a very different approach than the one we tool. If you are interested in just local analysis on a single CFG, the elements of the CFG can serve as program points, and a worklist algorithm can be used to do the fix point analysis. Of course you will need to define widening operations and the like, but that depends completely on the domain that you want to define.
These papers are still relevant, although they there exact methodology is slightly different from what the analyzer does. The idea of path-sensitive dataflow analysis is a fairly generic one, and it can be tackled with different specific techniques, but they all boil down to trying to compute a set of reachable program states.
Another good paper is:
A System and Language for Building System-Specific, Static Analyses (Hallem et al)
and the more theoretical paper that talks about path-sensitive analysis as a graph reachability problem (not specific to finding bugs) is:
Precise interprocedural dataflow analysis via graph reachability (Reps et al)
There are tons of other papers in this area, and the analyzer is inspired by many of them.