Can anyone help? Suppose we have multiple C files and want to analyze their ASTs to find dependencies between them (for example, a function defined in one C file and used in another, or an external variable). How can we do this at the AST level? Is there an automated tool or flag for it in LLVM? If anyone has any ideas, please let me know.
(Apologies for the re-send Siddharth, I failed to cc the list)
There is no existing tool that I'm aware of which performs this analysis on the AST. It is possible to do on an AST. You would just need to write an AST Visitor that finds declarations, definitions, and uses of functions.
Is there a reason you need to do this at the AST level? With C code this analysis can be trivially performed with `nm` on the object files, so if you don't have a strong reason for needing to do this on the AST I'd just write a script to wrap `nm` rather than writing an AST Visitor.
Thanks for the suggestion. I was trying to use the patch https://reviews.llvm.org/D30691 and the ASTImporter concept for cross-file analysis. Can you explain in detail the approach you suggested? What is `nm` on the object files? Can you suggest an approach for getting started on this cross-file analysis tool?
There is a tutorial on ASTVisitors here:
What is `nm` on the object files?
nm is a standard unix tool:
It really depends on what you are actually trying to achieve, but `nm` will give you a list of the function and variable names that are defined in, or external to, a particular object file.
If I run `nm` on a simple test file here, I get:
0000000000000000 T main
0000000000000000 D max_source
This tells me that this file defines a function called `main` (T = Text = Code = Function) and a variable (D = Data = Variable) called `max_source`. It also uses `stdout` from somewhere else (in this case, I happen to know these are all in the C library).
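To make the reading of those type letters concrete, here is a minimal Python sketch that splits default-format `nm` lines into address, type letter and name. The `Symbol` tuple and `parse_nm_line` helper are names made up for this illustration, and the `U stdout` line in the sample is an assumption about how the undefined reference would be printed (no address column, a `U` type letter); it is not copied from the output above.

```python
from typing import NamedTuple, Optional


class Symbol(NamedTuple):
    address: Optional[int]  # None for undefined symbols, which have no address
    kind: str               # single-letter nm type code, e.g. T, D or U
    name: str


def parse_nm_line(line: str) -> Optional[Symbol]:
    """Parse one line of default-format nm output; return None if it does not fit."""
    parts = line.split()
    if len(parts) == 3:
        addr, kind, name = parts
        return Symbol(int(addr, 16), kind, name)
    if len(parts) == 2 and len(parts[0]) == 1:
        # Undefined symbols are printed without an address column.
        kind, name = parts
        return Symbol(None, kind, name)
    return None


# The first two lines are from the example; the "U stdout" line is assumed.
sample = """\
0000000000000000 T main
0000000000000000 D max_source
                 U stdout
"""

symbols = [s for s in map(parse_nm_line, sample.splitlines()) if s is not None]
defined = [s.name for s in symbols if s.kind != "U"]
undefined = [s.name for s in symbols if s.kind == "U"]
print("defined here:", defined)
print("needed from elsewhere:", undefined)
```

Anything with a `U` ends up in the second list, which is the set of names you would then go looking for in the other object files.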
Let’s say my file also had a call to `func_1`. It would be listed with U (undefined) in front, which means that the function (not a standard library one, since `func_1` is not the name of any standard library function) is defined in a different file. You may then have to run `nm func.o` to find that this function is defined in `func.c` (assuming that’s what produced `func.o`, of course).
To get C++ names into human-readable form, you probably want to run the output through `c++filt` (it demangles the C++ mangled name into something like `func_1(int)` instead of `_Z6func_1i`, or whatever the mangled form of `func_1` would be).
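As a rough sketch of that step, the Python helper below pipes a list of names through `c++filt` on standard input. The `demangle` function is a name invented for this example, and it assumes `c++filt` is on the PATH; if it is not, the names are returned unchanged rather than raising an error.

```python
import shutil
import subprocess


def demangle(names):
    """Run names through c++filt; return them unchanged if the tool is missing."""
    names = list(names)
    tool = shutil.which("c++filt")
    if tool is None or not names:
        return names
    result = subprocess.run(
        [tool],
        input="\n".join(names) + "\n",  # c++filt reads names from stdin, one per line
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.splitlines()


# Plain C names such as main pass through untouched.
print(demangle(["_Z6func_1i", "main"]))
```

With a typical binutils `c++filt` I would expect the first name to come back as `func_1(int)`, but the exact behaviour can vary between platforms and tool versions.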
Putting this together to show which source produces which functions and variables, you’d want a bit of Python code, using a dictionary (or write it in C++ and use std::map if you prefer - my Python skills are low, so I may well have opted for that method).
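A minimal sketch of what that Python script could look like is below. The function names (`run_nm`, `cross_reference`) and the sample object files are invented for the illustration, and it makes a simplifying assumption: any uppercase type letter other than `U` is treated as a global definition, and lowercase (local) symbols are ignored.

```python
import subprocess
from collections import defaultdict


def run_nm(obj_path):
    """Return the text output of `nm` for one object file (needs nm on PATH)."""
    return subprocess.run(
        ["nm", obj_path], capture_output=True, text=True, check=True
    ).stdout


def cross_reference(nm_outputs):
    """Map each undefined symbol to (defining object file or None, list of users).

    nm_outputs is a dict of object file name -> nm output text.
    """
    defined_in = {}
    used_by = defaultdict(list)
    for obj, text in nm_outputs.items():
        for line in text.splitlines():
            parts = line.split()
            if len(parts) < 2:
                continue
            kind, name = parts[-2], parts[-1]
            if kind == "U":
                used_by[name].append(obj)
            elif kind.isupper():
                # Assumption: uppercase letters other than U are global definitions.
                defined_in[name] = obj
    return {name: (defined_in.get(name), users) for name, users in used_by.items()}


# Made-up nm output for two hypothetical object files.
sample = {
    "main.o": "0000000000000000 T main\n                 U func_1\n                 U stdout\n",
    "func.o": "0000000000000000 T func_1\n",
}

for name, (definer, users) in sorted(cross_reference(sample).items()):
    where = definer or "<not in these files>"
    print(f"{name}: defined in {where}, used by {', '.join(users)}")
```

To run it against real files you would build the input with something like `{path: run_nm(path) for path in sys.argv[1:]}` (after `import sys`) instead of the hard-coded sample. Symbols reported as not defined in any of the files, such as `stdout` here, usually come from a library.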
The point is that you can put this together with about 15-20 lines of code, rather than a rather complex ASTVisitor that does about the same thing. Of course, if you want to build a proper cross-referencing tool that reports that `func_1` is declared on line 4231 in `func.c`, and that it references the global variable `x` 13 times (5 of which are writes, 8 of which are reads), then you can’t do that using
So we are primarily focusing on improving the capability of the Clang Static Analyzer (clang SA) to do cross-file analysis. Can the `nm` analysis on object files be combined with clang SA for that? Our aim is to find bugs or security vulnerabilities across files with the help of clang SA.