Getting started with source code analysis using Clang

Hello,

I'm looking at various bits of Clang code, trying to figure out the very first steps needed to:

1) parse a source file into an internal representation suitable for static code analysis.
2) navigate that representation (via CIndex, as Doug suggested).

Can anyone give me some hints as to what specific API to use for this?

Is there some documentation for this that I have overlooked?

Thanks,

         Stefan

Hi Stefan,

The Parser class is probably what you want; it produces an AST. You can look at the existing Frontend to see how it's constructed and called. You can also run clang on a trivial C program under a debugger and set breakpoints to find out which code paths are actually taken.

As far as I can tell, there isn't much documentation outside of the code itself. But the code is very well written, so it should be easy to figure out.

Best,
Martin

> Hello,
>
> I'm looking at various bits of Clang code, trying to figure out the very
> first steps needed to:
>
> 1) parse a source file into an internal representation suitable for
> static code analysis.
> 2) navigate that representation (via CIndex, as Doug suggested).

CIndex can do both of these. clang_createTranslationUnitFromSourceFile(), part of the CIndex library, makes it easy to parse a source file into an AST.

> Can anyone give me some hints as to what specific API to use for this?

clang_createTranslationUnitFromSourceFile() is the main entry point; clang_visitChildren() will let you walk the AST. That's most of it!

> Is there some documentation for this that I have overlooked?

CIndex is documented here:

  http://clang.llvm.org/doxygen/group__CINDEX.html

There's a little c-index-test program in Clang's source tree that shows how to load a source file and walk the AST, printing some information about the entities reached.

  - Doug