Dependency Info from AST

Hi,
Can anyone help with the following? Suppose we have multiple C files and want to analyze their ASTs to find any dependencies between them (for example, a function defined in one C file and used in another, or an external variable). How can we do this at the AST level? Is there an automated tool or flag for it in LLVM? If anyone has any ideas, please let me know.
Sincerely,
Siddharth

(Apologies for the re-send Siddharth, I failed to cc the list)

There is no existing tool that I'm aware of which performs this analysis on the AST. It is possible to do on the AST, though: you would just need to write an AST Visitor that finds declarations, definitions, and uses of functions.

Is there a reason you need to do this at the AST level? With C code this analysis can be trivially performed with `nm` on the object files, so if you don't have a strong reason for needing to do this on the AST I'd just write a script to wrap `nm` rather than writing an AST Visitor.

-Chris

Hi Chris,
Thanks for the suggestion. I was trying to use the patch https://reviews.llvm.org/D30691 and the ASTImporter concept for cross-file analysis. Can you explain the approach you suggested in detail? What is nm on the object files? Can you suggest an approach for getting started on this cross-file analysis tool?
Thanks,
Siddharth

Hi Chris,
Thanks for the suggestion. I was trying to use the patch https://reviews.llvm.org/D30691 and the ASTImporter concept for cross-file analysis. Can you explain the approach you suggested in detail?

There is a tutorial on ASTVisitors here:
http://clang.llvm.org/docs/RAVFrontendAction.html

What is nm on the object files?

nm is a standard Unix tool:
https://en.wikipedia.org/wiki/Nm_%28Unix%29

-Chris

It really depends on what you are actually trying to achieve, but nm will give you a list of the function and variable names that are defined in, or external to, a particular object file.

If I do nm on a simple test-file here, I get:

U fflush
0000000000000000 T main
0000000000000000 D max_source
U printf
U puts
U stdout

This tells me that this file defines a function called main (T = Text = Code = Function) and a variable called max_source (D = Data = Variable). It also uses fflush, printf, puts and stdout from somewhere else (in this case, I happen to know these are all in the C library).
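As a rough illustration of reading that output mechanically, here is a minimal Python sketch. The names parse_nm and SAMPLE are made up for this example, and SAMPLE is just the listing above pasted into a string. It assumes the default nm line layout (optional address, one type letter, symbol name) and treats only upper-case type letters as globally visible definitions.

```python
# Sketch only: split nm's default output into defined and undefined symbols.
# SAMPLE is the listing from the mail above; parse_nm is a hypothetical helper.
SAMPLE = """\
U fflush
0000000000000000 T main
0000000000000000 D max_source
U printf
U puts
U stdout
"""

def parse_nm(text):
    """Return (defined, undefined).

    defined maps symbol name -> nm type letter (e.g. 'T', 'D');
    undefined is the set of names marked 'U'.
    """
    defined, undefined = {}, set()
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue  # blank line or a "file.o:" header
        kind, name = fields[-2], fields[-1]
        if kind == "U":
            undefined.add(name)
        elif len(kind) == 1 and kind.isupper():
            defined[name] = kind  # global symbol defined in this file
        # lower-case letters are local symbols; ignored in this sketch
    return defined, undefined

defined, undefined = parse_nm(SAMPLE)
print("defined:", defined)
print("undefined:", sorted(undefined))
```

For the sample above this reports main and max_source as defined and the four C library names as undefined.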

Let’s say my file also had a call to func_1, and nm showed it with U in front. That would mean func_1 is defined in a different file (and not in the standard library, since func_1 is not the name of any standard library function). You may then have to run nm func.o to find that this function is defined in func.c (assuming that’s what produced func.o, of course).

To get C++ names into human-readable form, you probably want to pipe the output from nm into c++filt. It demangles the C++ mangled name into something like func_1(int) instead of _Z6func_1i, or whatever the mangled form of func_1 would be.

To put this together and show which source file provides which functions and variables, you’d want a bit of Python code using a dictionary (or write it in C++ and use std::map if you prefer; my Python skills are low, so I may well have opted for that method).

The point is that you can put this together with about 15-20 lines of code, rather than a rather complex ASTVisitor that does about the same thing. Of course, if you want to build a proper cross-referencing tool, one that reports that func_1 is defined on line 4231 in func.c and that it references the global variable x 13 times (5 writes and 8 reads), then you can’t do that using nm.
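For what it's worth, a sketch of the dictionary approach might look like the following. This is only an assumption about how one could wire it up, not a finished tool: nm_symbols and cross_file_deps are hypothetical names, it shells out to whatever nm is on the PATH, and it only looks at upper-case (global) symbol types.

```python
import subprocess
import sys

def nm_symbols(obj_path):
    """Run nm on one object file; return (defined, undefined) sets of names."""
    out = subprocess.run(["nm", obj_path], capture_output=True,
                         text=True, check=True).stdout
    defined, undefined = set(), set()
    for line in out.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank lines
        kind, name = fields[-2], fields[-1]
        if kind == "U":
            undefined.add(name)
        elif len(kind) == 1 and kind.isupper():
            defined.add(name)  # global definition; local symbols are skipped
    return defined, undefined

def cross_file_deps(tables):
    """tables: {file: (defined, undefined)} -> {file: {symbol: defining file}}."""
    owner = {}  # symbol name -> first file seen defining it
    for path, (defined, _) in tables.items():
        for name in defined:
            owner.setdefault(name, path)
    # Keep only undefined symbols that some other given file defines;
    # anything else (e.g. printf) comes from a library we were not given.
    return {path: {name: owner[name] for name in undefined if name in owner}
            for path, (_, undefined) in tables.items()}

if __name__ == "__main__":
    tables = {path: nm_symbols(path) for path in sys.argv[1:]}
    for path, uses in cross_file_deps(tables).items():
        for name, src in sorted(uses.items()):
            print(f"{path}: {name} is defined in {src}")
```

Saved under some name such as deps.py (also made up), it would be run as python deps.py main.o func.o and print one line per symbol that one object file takes from another.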

We are primarily focused on improving the Clang Static Analyzer's (clang SA) capability to do cross-file analysis. Can the nm analysis of object files be combined with clang SA for that purpose? Our aim is to find bugs or security vulnerabilities across files with the help of clang SA.
Thanks,
Siddharth