I'm interested in looking at detecting "known bad" patterns, for example:
(<expr> & 0) == 0 (this example is stolen from FindBugs)
Quick question. Did you literally want to look for this syntax in the AST (i.e., the integer literal '0') or did you want to use the dataflow analysis to see cases when an expression evaluates to '0' and then is bitwise-ANDed with <expr>? Is the motivation of this check is that it is trivially true?
if(<non-boolean value>) (cause of recent OpenSSL vuln)
This seems a little too vague to me. People check non-boolean values all the time in 'if' conditions. Were you thinking about cases when <non boolean value> was negative? Positive numbers seem completely legit to me (e.g., some flag was set in a bitmask, etc.).
In either case, both of these seem like "auditor" checks. For the first check if all you wanted to do is grep the syntax, we can easily add another check that gets called by AnalysisConsumer. Check out the functions "ActionXXX" in AnalysisConsumer.cpp to see all the places where the various checks are called. An ActionXXX can get called for each function, an entire Objective-C class, or the entire translation unit. For example:
static void ActionWarnDeadStores(AnalysisManager& mgr) {
if (LiveVariables* L = mgr.getLiveVariables()) {
BugReporter BR(mgr);
CheckDeadStores(*L, BR);
}
}
This action gets called for each function/method declaration. The action queries the AnalysisManager (the thing that runs the checks) for the live variables analysis result on the current function (which gets computed on-the-fly). The function 'CheckDeadStores' is then called with (a) the results of the live variables analysis and (b) and BugReporter object which is used to issue bug diagnostics to the user. Analyses get registered in the Analyses.def file, where the "scope" of the analysis is declaratively specified. For example:
ANALYSIS(WarnDeadStores, "warn-dead-stores",
"Warn about stores to dead variables", Code)
Here "Code" means "code declaration"; that is a FunctionDecl or an ObjCMethodDecl.
A syntactic check can just walk the AST. A good example of a syntactic check is CheckObjCDealloc.cpp. Although this check will one day be revamped to do more real analysis, you can see how it walks the AST using iterators and does various pattern matching. Syntactic checks are fairly easy to implement in this way and hook up to AnalysisConsumer. The down-side is that they can be fairly dumb if the property you are checking requires any real smarts.
Another possible way to implement an "auditor" is to use the GRAuditor interface defined in GRAuditor.h:
template <typename STATE>
class GRAuditor {
public:
typedef ExplodedNode<STATE> NodeTy;
typedef typename STATE::ManagerTy ManagerTy;
virtual ~GRAuditor() {}
virtual bool Audit(NodeTy* N, ManagerTy& M) = 0;
};
Checks that subclass GRAuditor<const GRState*> can be registered with a GRExprEngine and have their "Audit" method called after the path engine evaluates all expressions of a given type. The Auditor is then free to inspect the analysis state at that point and flag errors. A good example of this kind of check is in BasicObjCFoundationChecks.cpp. The class in question is AuditCFNumberCreate, which flags misuses of the function CFNumberCreate (http://developer.apple.com/documentation/CoreFOundation/Reference/CFNumberRef/Reference/reference.html). That function just interposes on the path engine at every CallExpr and checks the arguments in the function call for illegal values.
This is probably enough for one email. I'm more than happy to dig into more details depending on what direction you decide to go in. Also, these interfaces are obviously evolving. We haven't written many checks, so any idea of how to refactor the interface, make it better, etc., are welcome.