Data flow analysis in Clang

What APIs are available for dataflow analysis in Clang? I want to create a
standalone tool (using libTooling) to analyze C source code.

You basically have the static analyzer’s CFG and Clang’s AST. With the AST you can do limited data flow analysis (as long as you don’t need path or control flow sensitivity); with the CFG you can do anything you want (at least in C), given enough computing power :)

AST access is well integrated in libTooling, CFG access less so (I haven’t written a tool that uses the CFG, but given that the CFG is used for Clang’s diagnostics, I’d imagine it’s not too hard to use from a libTooling-based tool).

Looping in Anna & Jordan for more info on what’s possible with the CFG / static analyzer.

Cheers,
/Manuel

So assume that I want to find all live variables at the end of a block. How easy is that to do?

There’s already an analysis, LiveVariables.cpp, that does this. It works on top of the CFG, and does a reverse dataflow analysis (using a worklist algorithm) to compute liveness information for variables. This liveness information is currently consumed by the static analyzer to prune out redundant information from the path state.
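
For readers unfamiliar with the technique, here is a minimal sketch of a backward liveness analysis driven by a worklist. It is not the code from LiveVariables.cpp and does not use any Clang API: the Block struct and the computeLiveOut function are hypothetical names made up for illustration, and the "CFG" is just a vector of blocks with precomputed use/def sets.

```cpp
#include <cstddef>
#include <deque>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Toy basic block (hypothetical type, NOT clang::CFGBlock).
struct Block {
  std::set<std::string> use;       // variables read before being written in the block
  std::set<std::string> def;       // variables written in the block
  std::vector<std::size_t> succs;  // indices of successor blocks
};

// Backward liveness with a worklist:
//   liveOut[b] = union of liveIn[s] over all successors s of b
//   liveIn[b]  = use[b] + (liveOut[b] - def[b])
// Returns the set of variables live at the end of each block.
std::vector<std::set<std::string>> computeLiveOut(const std::vector<Block>& cfg) {
  const std::size_t n = cfg.size();
  std::vector<std::set<std::string>> liveIn(n), liveOut(n);

  // Predecessor lists, needed to propagate changes backwards.
  std::vector<std::vector<std::size_t>> preds(n);
  for (std::size_t b = 0; b < n; ++b)
    for (std::size_t s : cfg[b].succs)
      preds[s].push_back(b);

  // Seed the worklist with every block, last block first.
  std::deque<std::size_t> worklist;
  std::vector<bool> queued(n, true);
  for (std::size_t b = n; b-- > 0;)
    worklist.push_back(b);

  while (!worklist.empty()) {
    const std::size_t b = worklist.front();
    worklist.pop_front();
    queued[b] = false;

    // Recompute live-out from the successors' current live-in sets.
    std::set<std::string> out;
    for (std::size_t s : cfg[b].succs)
      out.insert(liveIn[s].begin(), liveIn[s].end());

    // Apply the transfer function of the block.
    std::set<std::string> in = cfg[b].use;
    for (const std::string& v : out)
      if (cfg[b].def.count(v) == 0)
        in.insert(v);

    liveOut[b] = std::move(out);

    // If live-in changed, the predecessors have to be revisited.
    if (in != liveIn[b]) {
      liveIn[b] = std::move(in);
      for (std::size_t p : preds[b]) {
        if (!queued[p]) {
          queued[p] = true;
          worklist.push_back(p);
        }
      }
    }
  }
  return liveOut;
}
```

To try it, fill a std::vector<Block> with the use/def sets and successor indices of each block and call computeLiveOut; element b of the result is the live-out set of block b. A real tool would derive the use/def sets from the statements inside each CFG block instead of writing them by hand.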

Will the application programming interface be improved for the efficient and safe analysis
of control flow graphs?

Regards,
Markus

Can you use this in the context of libTooling?

You can already do this, although it’s not made particularly easy yet. Generally, the static analyzer is just a bunch of frontend actions; you can look at what ClangTidy does and how it interfaces with static analyzer checks (it basically glues all the stuff together).

Yep. Look at clang-tidy to see how you can hook up static analyzer checks to a Clang tool (clang-tidy is written on top of libTooling).

The simplest thing for now is probably either to make your check a static analyzer check directly and use clang-tidy as a driver to run it, or to implement a clang-tidy check (where you have access to the CFG, just like Clang’s diagnostics, but not to the path-sensitive analysis).

What are the chances of making the corresponding data exchange easier?
Will any more dedicated base classes become available?

Regards,
Markus

Data exchange about what?

I am still missing interface descriptions around control flow graphs.
http://clang.llvm.org/extra/doxygen/annotated.html
http://clang.llvm.org/docs/LibTooling.html

Am I overlooking any documentation (besides the source code of the ClangTidy tool)?

Regards,
Markus

There is not much documentation about this apart from what the static analyzer itself has. For most of it look at the code:
http://reviews.llvm.org/diffusion/L/browse/cfe/trunk/include/clang/Analysis/CFG.h

Cheers,
/Manuel

The Clang CFG class and the related analyses are defined in /include/clang/Analysis/. Note that these are not stable APIs; they are not guaranteed to stay unchanged (though this is true for most of Clang/LLVM). There are a few users of these APIs: LiveVariables and CFGReachabilityAnalysis are good examples of flow-sensitive analyses implemented on top of the Clang CFG. The Clang static analyzer is another consumer of the CFG; however, it performs path-sensitive static analysis.

Cheers,
Anna.
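
As a rough illustration of the kind of question a CFG reachability analysis answers, the sketch below marks every block from which a given destination block can be reached by walking predecessor edges backwards. It is a toy under stated assumptions and is not taken from Clang's implementation: the ToyCfg alias and the blocksReaching function are hypothetical names, and the graph is a plain adjacency list, not a clang::CFG.

```cpp
#include <cstddef>
#include <vector>

// Toy CFG as an adjacency list (hypothetical, NOT clang::CFG):
// succs[b] lists the indices of the successors of block b.
using ToyCfg = std::vector<std::vector<std::size_t>>;

// Returns a vector where element b is true if some path with at least
// one edge leads from block b to block dst.
std::vector<bool> blocksReaching(const ToyCfg& succs, std::size_t dst) {
  const std::size_t n = succs.size();

  // Invert the edges so the search can run backwards from dst.
  std::vector<std::vector<std::size_t>> preds(n);
  for (std::size_t b = 0; b < n; ++b)
    for (std::size_t s : succs[b])
      preds[s].push_back(b);

  std::vector<bool> reach(n, false);
  std::vector<std::size_t> stack;
  stack.push_back(dst);

  // Depth-first walk over predecessor edges; each block is marked once.
  while (!stack.empty()) {
    const std::size_t b = stack.back();
    stack.pop_back();
    for (std::size_t p : preds[b]) {
      if (!reach[p]) {
        reach[p] = true;
        stack.push_back(p);
      }
    }
  }
  return reach;
}
```

Note that dst itself is only marked when it lies on a cycle, since a path needs at least one edge. Like the liveness case, this is flow-sensitive in the sense that it follows CFG edges, but it does not track path conditions the way the static analyzer does.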