Clang-extdef-mapping and AST files

Hi,

I am working on integrating static analysis, including CTU analysis, into our product. To do this properly we first have to dump the ASTs of the sources with -emit-ast and then run clang-extdef-mapping on the source files before we can start the analysis. Both of these steps can be pretty slow, so I am looking for a way to eliminate or speed up one of them.
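To make the two per-file steps concrete, here is a minimal sketch that builds the argument vectors for them. The helper names and the `compileFlags` parameter are hypothetical; `-emit-ast` is the flag mentioned above, and `-p <build-dir>` is assumed to be the usual clang tooling option pointing at the directory with compile_commands.json.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Step 1: dump the AST of one source file.
// `compileFlags` stands in for whatever flags the real build passes
// (hypothetical parameter, not something clang requires by that name).
std::vector<std::string> emitAstCommand(
    const std::string &source, const std::string &astOut,
    const std::vector<std::string> &compileFlags) {
  std::vector<std::string> cmd = {"clang++", "-emit-ast"};
  cmd.insert(cmd.end(), compileFlags.begin(), compileFlags.end());
  cmd.push_back("-o");
  cmd.push_back(astOut);
  cmd.push_back(source);
  return cmd;
}

// Step 2: generate the external-definition mapping for the same source.
// This parses the source file a second time, which is the cost in question.
std::vector<std::string> extDefMappingCommand(const std::string &source,
                                              const std::string &buildDir) {
  return {"clang-extdef-mapping", "-p", buildDir, source};
}
```

Both commands take the .cpp file as input, so every translation unit goes through the front-end twice.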

I have two thoughts:

  • Maybe clang-extdef-mapping could process the AST files directly. I think that would mean less processing, since the front-end has already done most of the heavy parsing. It would also be a natural way to handle the file name in the mapping file (i.e. we would not have to replace .cpp with .ast).
  • Maybe clang could output the mapping file while creating the .ast file. All parsing is already done at that point, so we would not have to do it again.
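For context, the renaming step that the first idea would make unnecessary can be sketched as below. It assumes that each mapping entry ends with the path of the source file, and that the .ast file sits next to the source with only the extension swapped. The function name and the sample entry in the usage note are made up for illustration.

```cpp
#include <array>
#include <cassert>
#include <string>

// Replace a trailing C/C++ source extension on a mapping entry with ".ast".
// Only the end of the line is inspected, so the lookup-name part of the
// entry is never touched.
std::string pointEntryAtAst(const std::string &line) {
  static const std::array<std::string, 4> exts = {".cpp", ".cxx", ".cc", ".c"};
  for (const std::string &ext : exts) {
    if (line.size() > ext.size() &&
        line.compare(line.size() - ext.size(), ext.size(), ext) == 0) {
      return line.substr(0, line.size() - ext.size()) + ".ast";
    }
  }
  return line; // no recognised source extension; leave the entry unchanged
}
```

With this, an illustrative entry such as `c:@F@foo# /src/foo.cpp` becomes `c:@F@foo# /src/foo.ast`, while an entry that already ends in .ast is returned as-is.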

Any thoughts on these changes? I’d also appreciate any pointers to the APIs for opening the .ast files and getting the AST loaded.

Btw - I know about CodeChecker, and I’ll use the reporting tool of that, but I can’t integrate that into our build setup.

Hi Tobias!

I believe you could fuse/chain the different AST consuming parts.
I’m not really an ASTImporter expert, but from my point of view, each of these tools is just an ASTConsumer.
Here are a couple of files that should be interesting for you:

  • clang/lib/StaticAnalyzer/Frontend/AnalysisConsumer.cpp, especially the CrossTranslationUnitContext member. HandleTopLevelDecl is AFAIK the callback provided by the ASTConsumer class which is likely important as well.
  • clang/include/clang/AST/ASTConsumer.h
  • clang/include/clang/CrossTU/CrossTranslationUnit.h
  • clang/tools/clang-extdef-mapping/ClangExtDefMapGen.cpp
  • clang/include/clang/Frontend/FrontendActions.h
  • clang/include/clang/Frontend/MultiplexConsumer.h: this one looks really interesting; grep for usages to see if it does the right thing. GeneratePCHAction is one example.
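The fuse/chain idea can be illustrated with a toy model: one pass over the declarations, fanned out to several consumers. This sketch deliberately does not use the clang API. Every type and name in it (ToyDecl, ToyConsumer, FanOutConsumer and so on) is a hypothetical stand-in, so check the headers above for the real interfaces.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for a top-level declaration; not a clang type.
struct ToyDecl {
  std::string name;
  bool hasBody; // true for definitions, false for mere declarations
};

// Toy stand-in for a consumer interface with a per-declaration callback.
struct ToyConsumer {
  virtual ~ToyConsumer() = default;
  virtual void handleTopLevelDecl(const ToyDecl &d) = 0;
};

// Collects "name path" mapping entries, but only for definitions.
struct MappingConsumer : ToyConsumer {
  std::string astPath;
  std::vector<std::string> entries;
  explicit MappingConsumer(std::string path) : astPath(std::move(path)) {}
  void handleTopLevelDecl(const ToyDecl &d) override {
    if (d.hasBody) entries.push_back(d.name + " " + astPath);
  }
};

// Stands in for whatever serialises the AST; it just records the names.
struct DumpConsumer : ToyConsumer {
  std::vector<std::string> dumped;
  void handleTopLevelDecl(const ToyDecl &d) override {
    dumped.push_back(d.name);
  }
};

// Forwards each declaration to every registered consumer, so the input
// is traversed once no matter how many consumers are attached.
struct FanOutConsumer : ToyConsumer {
  std::vector<ToyConsumer *> consumers; // non-owning pointers
  void handleTopLevelDecl(const ToyDecl &d) override {
    for (ToyConsumer *c : consumers) c->handleTopLevelDecl(d);
  }
};
```

Feeding a definition `f` and a declaration-only `g` through a FanOutConsumer that holds a DumpConsumer and a MappingConsumer("foo.ast") records both names in the dump and leaves a single entry, `f foo.ast`, in the mapping.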

Btw - I know about CodeChecker, and I’ll use the reporting tool of that, but I can’t integrate that into our build setup.

Sad. What’s the blocker for that? Folks there might be interested in fixing that.

Hi @steakhal ! Thanks for the pointers.

I was able to make the clang-extdef-mapping binary take a .ast file as input with some not-so-nice hacks, and it seems to be very beneficial. I tested this on llvm/lib/AsmParser/Parser.cpp (just a random, pretty big file): running clang-extdef-mapping on the cpp file took around 5.5s, while running it on the generated .ast file took just 2s.

I think it would be even faster to generate the AST and the mapping file at the same time, but that would require adding a new option to the clang frontend.

In any case, I think generating the mapping from AST files is something we could accept, since it’s pretty natural to run the tool on the .ast file and it’s so much faster. I will prepare a patch during the week and upload it. Are there any other reviewers I should add to that diff besides you?

Regarding CodeChecker - it’s very broken on Windows. I have had to patch it locally just to be able to upload reports to a Linux server. It makes a lot of assumptions about paths that don’t hold on Windows.

Uploaded a diff here: D128704 [clang-extdef-mapping] Directly process .ast files (llvm.org)