How to run a Clang front-end AST visitor to extract information from source code without compiling the source code first

Hello all,

For my project I have developed a recursive AST visitor which creates the AST from a given source file and its included header files and extracts information from the source code, such as declared function names and types, declared variable names and types, and function calls. It takes a .cpp source file as input, parses it, and finds all the information I need.

The thing is, if I try to analyse a project such as LIBTIFF or FFMPEG, I get all sorts of errors, such as undeclared identifiers and missing header files.

Is there a way to run this front-end tool without making it compile the code first? I assume the code is syntactically correct, though. My front-end tool is invoked this way:

// parse the command-line args passed to your code
CommonOptionsParser op(argc, argv);

// create a new Clang Tool instance (a LibTooling environment)
ClangTool Tool(op.getCompilations(), op.getSourcePathList());

// run the Clang Tool, creating a new FrontendAction for each file
// (MyFrontendAction stands in for the tool's own FrontendAction subclass)
int result = Tool.run(newFrontendActionFactory<MyFrontendAction>().get());

Is there a way? Maybe a compiler argument or something?

You should be using a compilation database with your tooling application to retrieve and use the matching compilation commands for your project; those should include any header search paths, etc. CMake can generate such a database, so if the project you're trying to analyse has CMake support, this might be an option. Otherwise you may need to teach the build system about it.

(Probably the most generic thing someone could build to help here would be a scan-build-like build-interposition tool that makes a compilation database from any project/build system. Not the most efficient, but it would be handy as a fallback.)

Hi all,

As Miklos pointed out, I am now using Bear to create a compilation database to analyse my source code. This led to much better results.

I still have some problems, though. Does anyone know why this happens?

When I run the tool with the compilation database, I get this error on certain files:

Executing ASTvisitor …/ffmpeg-0.6/libavcodec/libxvid_rc.c
Skipping /home/andreas/Desktop/test_analysis/…/ffmpeg-0.6/libavcodec/libxvid_rc.c. Command line not found.

And it does not extract anything from the file.

When I run my AST visitor on the exact same file but without the compilation database, it gives me the error:

‘xvid.h’ file not found

but it analyses the file and the results are mostly correct. The visitor skips anything that has to do with variable declarations from the xvid.h header.
How can the “Command line not found” error be fixed?

Thank you all.

@Georgiou, when you generate a compilation database, it's better to validate it with other tools. You can run clang-check or clang-format just to see whether it's valid. If it is and your tool still fails with include problems, the cause might be the one mentioned in the tooling doc. The other thing I noticed is that you use a relative path to the module (instead of ../ffmpeg-0.6/libavcodec/libxvid_rc.c try /home/andreas/Desktop/ffmpeg-0.6/libavcodec/libxvid_rc.c).

@David, the build-interposition command is already in the Clang source repo. It's in tools/scan-build-py/bin and is called intercept-build.
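The relative-versus-absolute path suggestion can also be applied inside the tool rather than on the command line. The following is a minimal sketch, assuming C++17 and POSIX-style paths; `toAbsolute` is a hypothetical helper name and is not part of LibTooling. It normalizes paths purely lexically, so symbolic links are not resolved.

```cpp
#include <filesystem>
#include <string>

namespace fs = std::filesystem;

// Hypothetical helper: turn a possibly relative source path into an absolute
// one by resolving it against `base`, then collapse "." and ".." components.
// The normalization is lexical only: the file does not have to exist and
// symbolic links are not followed.
std::string toAbsolute(const std::string &path, const fs::path &base) {
  fs::path p(path);
  if (p.is_relative())
    p = base / p;
  return p.lexically_normal().string();
}
```

In a real tool, `base` would typically be `fs::current_path()`, and the result would be passed on in place of the path given on the command line.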

regards,

Laszlo

Hi all,

I did what you said: using the same compilation database I ran clang-check, and the tool gave me the same error.
Is it just trying to say that there is no “rule” for the specific file in the compilation database?

Furthermore, I noticed that some of my files are being analysed twice by my AST visitor, and this is because of the compilation database. Is there something I can do about it?
Something, for example, to avoid parsing the same file twice?

hi Andreas,

Can you try to create (or modify the existing) compilation database so that it makes the tools happy (both clang-check and your tool)? The project documentation has a chapter that explains what that file should look like.

And if there is any difference between the generated one and your hand-written one, please post your findings to the Bear issue tracker.

A double AST visit can be caused by duplicate entries in the compilation database (you can check the file to see whether that is the case). It's a bug in Bear if there are duplicated entries in your compilation database.
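Checking the database for missing or duplicated entries can be automated. The sketch below assumes the entries of compile_commands.json have already been parsed into a plain struct; `Entry`, `countEntriesPerFile` and `filesWithDuplicates` are hypothetical names used only for illustration, and the JSON parsing itself is left out. A file with a count of zero has no compile command at all, and a file with a count above one would be visited more than once.

```cpp
#include <map>
#include <string>
#include <vector>

// One record of a compilation database, mirroring the "directory", "file"
// and "command" keys of compile_commands.json (illustrative model only).
struct Entry {
  std::string directory;
  std::string file;
  std::string command;
};

// Count how many entries the database holds for each source file.
std::map<std::string, int> countEntriesPerFile(const std::vector<Entry> &db) {
  std::map<std::string, int> counts;
  for (const Entry &e : db)
    ++counts[e.file];
  return counts;
}

// List the files that have more than one entry, in sorted order.
std::vector<std::string> filesWithDuplicates(const std::vector<Entry> &db) {
  std::vector<std::string> result;
  const std::map<std::string, int> counts = countEntriesPerFile(db);
  for (const auto &kv : counts)
    if (kv.second > 1)
      result.push_back(kv.first);
  return result;
}
```

The comparison is on the exact spelling of the path, so two entries that name the same file differently (for example one relative and one absolute) would not be detected as duplicates.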

regards,

Laszlo

Dear Laszlo,

Thank you for the information. I will try it and report back to you all and also to the Bear issue tracker.

Hi all,

I have read the project documentation about compilation databases, and this is exactly my case:

There can be multiple command objects for the same file, for example if the same source file is compiled with different configurations.

Apart from deleting one of the two entries, is there anything else I can do? I just want semantic analysis of the source file and I don't really care about the way it will be compiled.
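One possible direction, short of editing the file by hand, is to filter the entries programmatically before the tool uses them. This is only a sketch under stated assumptions: the entries are modelled with a hypothetical `Entry` struct, `keepFirstPerFile` is an illustrative name, and "first entry wins" is an arbitrary choice that is only reasonable when the build configuration does not matter for the analysis.

```cpp
#include <string>
#include <unordered_set>
#include <vector>

// One record of a compilation database, mirroring the "directory", "file"
// and "command" keys of compile_commands.json (illustrative model only).
struct Entry {
  std::string directory;
  std::string file;
  std::string command;
};

// Keep only the first entry seen for each file, preserving the original order.
std::vector<Entry> keepFirstPerFile(const std::vector<Entry> &db) {
  std::unordered_set<std::string> seen;
  std::vector<Entry> unique;
  for (const Entry &e : db) {
    // insert() reports true in .second only the first time a file is added.
    if (seen.insert(e.file).second)
      unique.push_back(e);
  }
  return unique;
}
```

The filtered list would then have to be written back out as a compilation database, or otherwise handed to the tool, which is not shown here.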