I'm looking for a skeleton C parser. I've done C parsers before and I know that to do a real parse, you need a symbol table so that you can keep track of typedefs.

I want to write a tool that will take existing C source (after the preprocessor) and, for example, tell me where a particular function is called (what source line and from within what function) and what arguments are passed to it. I don't need a full cross reference. In fact, my particular needs today are just a sample. I'm looking for a parser that is easy to tie in to.

With the standard Ruby source, there is something called "ripper" which they call an "event-based style parser" (for Ruby). It has a number of hooks that are called as the source is parsed -- e.g. start of function, start of statement, etc. This is really easy to tie into. This is one possibility. The other possibility is for the parser to give me back a parse tree (or sequence of trees perhaps).

I'm willing to use almost any language but assumed it would be C-based. I'm comfortable using Ruby, Python, Perl, etc. if necessary. I can also work with YACC / Bison if necessary.

I'm hoping this project would have either exactly what I'm looking for or major pieces to help me build what I'm looking for.

Using clang's libraries, you can generate a complete AST for C code,
find all the call expressions, and do whatever analysis of them is
necessary. See http://clang.llvm.org/ . Please direct further
questions to the cfe-dev mailing list.
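
As a rough illustration of "find all the call expressions", here is a minimal Python sketch that walks a Clang AST in JSON form, which recent Clang versions can emit with -Xclang -ast-dump=json. The helper names find_calls and callee_name are hypothetical, and SAMPLE is a hand-written, heavily simplified tree, not real Clang output. Real dumps carry many more fields, and the field names used here ("kind", "name", "inner", "referencedDecl") should be checked against your Clang version. Source locations are left out to keep the sketch short.

```python
import json

def callee_name(node):
    """Follow implicit casts and parentheses down to the referenced function name."""
    while node.get("kind") in ("ImplicitCastExpr", "ParenExpr"):
        inner = node.get("inner", [])
        if not inner:
            return None
        node = inner[0]
    if node.get("kind") == "DeclRefExpr":
        return node.get("referencedDecl", {}).get("name")
    return None  # indirect call (e.g. through a function pointer expression)

def find_calls(node, target, enclosing=None, found=None):
    """Collect (enclosing function, argument count) for every call to `target`."""
    if found is None:
        found = []
    kind = node.get("kind")
    if kind == "FunctionDecl":
        enclosing = node.get("name")
    children = node.get("inner", [])
    # Assumption: the first child of a CallExpr is the callee, the rest are arguments.
    if kind == "CallExpr" and children and callee_name(children[0]) == target:
        found.append((enclosing, len(children) - 1))
    for child in children:
        find_calls(child, target, enclosing, found)
    return found

# Hand-written stand-in for: int main() { foo(1, 2); }
SAMPLE = json.loads("""
{"kind": "TranslationUnitDecl", "inner": [
  {"kind": "FunctionDecl", "name": "main", "inner": [
    {"kind": "CompoundStmt", "inner": [
      {"kind": "CallExpr", "inner": [
        {"kind": "ImplicitCastExpr", "inner": [
          {"kind": "DeclRefExpr",
           "referencedDecl": {"kind": "FunctionDecl", "name": "foo"}}]},
        {"kind": "IntegerLiteral", "value": "1"},
        {"kind": "IntegerLiteral", "value": "2"}]}]}]}]}
""")

print(find_calls(SAMPLE, "foo"))  # [('main', 2)]
```

To try it on real code, load the JSON that Clang writes to standard output with json.load instead of using SAMPLE.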


I got clang working. I put it on my Mac. I see the -S option and I see that -emit-ast spews out some binary stuff. I've also found a page describing how to read it back in, but I'm assuming that somewhere there is a utility to pretty print the AST (or perhaps it is built right into clang).

Try writing a Clang plugin. There’s an example plugin that prints all function names in examples/PrintFunctionNames.

You will want to be familiar with what Clang’s AST looks like. For a view of the inheritance hierarchy for the AST nodes, see here:

btw: Clang’s AST hierarchy has two roots, Decl and Stmt (that confused me at first).

try clang -cc1 main.c -ast-dump

No, that won’t find includes, because invoking -cc1 directly bypasses the driver that sets up the default header search paths. It will be more useful to run
clang main.c -Xclang -ast-dump

The -Xclang option passes the next argument through to the “-cc1” invocation of clang. (Warning: the dump will probably be huge, but you can grep and such to get just the parts you want.) Btw I didn’t know that you could trigger the AST dump from the command line; awesome!
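
For the "grep and such" step, a small Python filter over the textual dump might look like the sketch below. It only pulls out the source range printed on each CallExpr line. The function name call_expr_ranges is hypothetical, and SAMPLE_DUMP is hand-written to resemble the dump layout, which differs between Clang versions, so treat the regular expression as an assumption to adjust against your own output.

```python
import re

# Assumed line shape: "<tree prefix>CallExpr 0x<address> <source range> 'type'"
CALL_LINE = re.compile(r"CallExpr\s+0x[0-9a-f]+\s+<([^>]*)>")

def call_expr_ranges(dump_text):
    """Return the source-range text of every CallExpr line in an AST dump."""
    ranges = []
    for line in dump_text.splitlines():
        match = CALL_LINE.search(line)
        if match:
            ranges.append(match.group(1))
    return ranges

# Hand-written sample, not captured from a real clang run.
SAMPLE_DUMP = """\
`-FunctionDecl 0x7f8a1 <main.c:3:1, line:6:1> line:3:5 main 'int ()'
  `-CompoundStmt 0x7f8b2 <col:12, line:6:1>
    |-CallExpr 0x7f8c3 <line:4:3, col:11> 'int'
    | |-ImplicitCastExpr 0x7f8d4 <col:3> 'int (*)(int, int)' <FunctionToPointerDecay>
    | | `-DeclRefExpr 0x7f8e5 <col:3> 'int (int, int)' Function 0x7f8f6 'foo' 'int (int, int)'
"""

print(call_expr_ranges(SAMPLE_DUMP))  # ['line:4:3, col:11']
```

Note that locations in the dump are printed relative to the previous one (a bare "col:" or "line:" without a file name), so a filter that wants absolute positions has to remember the last file and line it saw.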

