Saving+loading the Clang AST

It’s a long story, but I’d like to dump the ASTs for all of the C files in a build and then run a post-processing program on all of the ASTs at once. In the end it’s sort of like a libTooling tool, except 1) it runs on multiple files at once, and 2) ideally, for performance and consistency with the build process, it would delegate the parsing to the initial dumping step rather than re-parsing every time we run the post-processing program.

My approach has been to instrument the build process with a custom clang plugin that uses a Serialization/ASTWriter to dump the AST to a file. This process seems to be working well, but I’ve had a lot of issues trying to read the AST back out.

Namely, the serialization code seems to be geared towards the needs of the PCH save+load process and is pretty hard to use outside of it. I’ve done the work of setting up a special CompilerInstance just for loading the AST, but I keep getting “configuration mismatch” errors. I assume I need to separately persist the command line flags and be more careful to set up the loading CompilerInstance the same way the dumping plugin’s CompilerInstance was set up, but it’s already been a lot of source diving and I’m not sure this is even a supported use of these classes.
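To make the "persist the command line flags" idea concrete, here is a rough sketch of what I have in mind. It uses only the standard library and none of clang's API; the helper names and the one-flag-per-line sidecar file are my own assumptions, not anything clang provides, and it would break on flags that contain newlines.

```cpp
#include <fstream>
#include <string>
#include <vector>

// Hypothetical helper: write the compile flags, one per line, to a sidecar
// file stored next to the dumped AST. Assumes no flag contains a newline.
bool saveFlags(const std::string &path, const std::vector<std::string> &flags) {
  std::ofstream out(path);
  if (!out)
    return false;
  for (const auto &flag : flags)
    out << flag << '\n';
  out.flush();
  return static_cast<bool>(out);
}

// Hypothetical helper: read the flags back in their original order.
// Returns an empty vector if the file cannot be opened.
std::vector<std::string> loadFlags(const std::string &path) {
  std::vector<std::string> flags;
  std::ifstream in(path);
  std::string line;
  while (std::getline(in, line))
    flags.push_back(line);
  return flags;
}
```

The dumping plugin would call saveFlags with the arguments it was invoked with, and the post-processing program would call loadFlags and use the result to configure its CompilerInstance before reading the AST file.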

My question is – is this the right way to be saving+loading the clang AST? It’s looking like it’s going to be easier to define a new, pared-down format than to use ASTWriter and ASTReader (and it would be nicer for consumers, who wouldn’t have to pull in much of clang), but I’ve read that you’re supposed to be able to use the clang AST like this.

Alternatively, if I should use a different approach, that would be good information too. All of the projects I’ve seen parse things on demand, and I assume they then have trouble making sure all the flags are correct.

Thanks,
Kevin