> since I don't have any prior knowledge about clang ,
> do i need to go through any other tutorials to
> completely understand the code of various experimental
> checkers and also to write one of my own?
Hmm, no: we don't yet have a single good tutorial that covers everything. Some useful reading:
- lib/StaticAnalyzer/README.txt is a very brief introduction.
- The link [2] from lib/StaticAnalyzer/README.txt is a good detailed description of the memory model (MemRegion class hierarchy).
- See docs/analyzer/IPA.txt for a quick introduction to how inter-procedural analysis works.
- See docs/analyzer/RegionStore.txt for a shorter introduction to the memory model, with some implementation caveats.
You may want to get familiar with the clang abstract syntax tree, just to know how clang represents types etc.; there is a good video here: http://clang.llvm.org/docs/IntroductionToTheClangAST.html .
Also, checker code is usually relatively simple, and the API is relatively easy and intuitive, well, in most places. Just dump things often, or read the exploded graphs, and try to understand what's going on. Learning by example is what everybody does, I guess, even though not all examples are as good as I wish they were.
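To make "dump things often" a bit more concrete, here is a rough sketch of the commands I'd reach for. The file name test.c is just a placeholder, and the set of debug checkers varies between clang versions, so treat the checker names as assumptions to verify with 'clang -cc1 -analyzer-checker-help':

```shell
# Dump the AST of a translation unit (test.c is a placeholder file name).
clang -Xclang -ast-dump -fsyntax-only test.c

# Print the CFG that the analyzer walks for each function.
clang -cc1 -analyze -analyzer-checker=debug.DumpCFG test.c

# Open the exploded graph in a Graphviz viewer. As far as I know this
# needs Graphviz installed and a clang built with assertions enabled.
clang -cc1 -analyze -analyzer-checker=core,debug.ViewExplodedGraph test.c
```

When invoking -cc1 directly, the usual system include paths may not be set up, so a small test file without system headers is the easiest thing to start with.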
> Another specific question I have is that , suppose i
> have a statement var = read_value() . can I directly
> add read_value function to be one of the taint sources
> by adding a line in addSourcesPost function of
> GenericTaintChecker ?
It should work. Though if you want to share your work later, it would probably be inconvenient to have very specific functions in the generic taint checker, and we'd have to think about how to separate them.
> And after changing the file , do i need to
> necessarily run 'make clang' inside build directory
> or is there any simple way to reflect the changes
> ,since the former takes way too much time.
You do. There are some usual tricks to speed up compilation: use the shared libraries option, use a faster compiler (clang?), use a faster linker (gold?), and maybe use a release build if you don't need a debugger. Also try to reduce the number of linker jobs running in parallel, otherwise they may eat up all the RAM and start swapping.
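As a sketch of how those tricks could map onto a CMake-based build: this assumes you configure with CMake and the Ninja generator rather than plain make, that clang and gold are already installed, and that ../llvm is where your sources live (a placeholder path). The link job count of 2 is an arbitrary example; pick it based on your RAM.

```shell
# Run from an empty build directory; ../llvm is a placeholder source path.
# Options, in order: release build, shared libraries, clang as the host
# compiler, gold as the linker, and at most two link jobs at a time
# (the link job limit only takes effect with the Ninja generator).
cmake -G Ninja ../llvm \
  -DCMAKE_BUILD_TYPE=Release \
  -DBUILD_SHARED_LIBS=ON \
  -DCMAKE_C_COMPILER=clang \
  -DCMAKE_CXX_COMPILER=clang++ \
  -DLLVM_USE_LINKER=gold \
  -DLLVM_PARALLEL_LINK_JOBS=2

# Rebuild only the clang binary after editing a checker.
ninja clang
```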
For developing new analyzer checkers, there's one more option: load them as a clang plugin (e.g. 'clang -cc1 -load checker.so <...>'); see examples/analyzer-plugin/ for an example. In this case you don't need to rebuild clang, just the checker, but running it becomes a bit more tricky: I'm not sure if, say, the scan-build script supports this method.
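For illustration, loading a plugin and enabling its checker might look roughly like this. The library name SampleAnalyzerPlugin.so and the checker name example.MainCallChecker are what I recall from examples/analyzer-plugin/, so double-check them against your tree; test.c is a placeholder:

```shell
# Load the plugin into the frontend and enable the checker it registers.
# Library and checker names are assumptions taken from the example plugin.
clang -cc1 -load ./SampleAnalyzerPlugin.so \
  -analyze -analyzer-checker=example.MainCallChecker test.c
```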
So probably it's a good idea for you to copy GenericTaintChecker, change it to a plugin, and go ahead extending it.