Serialize/Deserialize AST without original headers.

I am implementing a large set of API functions and overloads, around 10k instructions, in a library. To minimize the parsing cost, my approach is to pregenerate an AST of the library using clang:

clang -x c++ -emit-ast mylibrary.h -o redistributable.ast

I plan to redistribute this binary file along with my application.


When I load the file on a different machine, it no longer works. I think this is because the AST binary is linked to the original sources, which are not redistributed. Is there any way to avoid this link?

Any suggestions or pointers on how to do this would be appreciated.

Below is the code I am trying.

This is how I currently load the AST; it works on the same machine.

#include "" // this has the ast in static char ast[]
CompilerInstance ci;
ci.createDiagnostics(new TextDiagnosticPrinter(m_Context.GetDiag(), new DiagnosticOptions()), true);

StringRef theString(ast, sizeof(ast));

// Import AST from file.
llvm::MemoryBuffer *theBuffer = llvm::MemoryBuffer::getMemBuffer(theString, "precompiled.ast", false);
llvm::OwningPtr<clang::ASTUnit> CUnit(ASTUnit::LoadFromASTMemoryBuffer("hlsl.ast", &ci.getDiagnostics(), ci.getFileSystemOpts(), theBuffer));

assert(CUnit.get() != nullptr && "Failed to load precomputed AST");

ASTImporter Importer(ci.getASTContext(), ci.getFileManager(), CUnit->getASTContext(), CUnit->getFileManager(), false);

TranslationUnitDecl *TU = CUnit->getASTContext().getTranslationUnitDecl();
for (DeclContext::decl_iterator D = TU->decls_begin(), DEnd = TU->decls_end(); D != DEnd; ++D) {
    // Don't re-import __va_list_tag, __builtin_va_list.
    if (NamedDecl *ND = dyn_cast<NamedDecl>(*D))
        if (IdentifierInfo *II = ND->getIdentifier())
            if (II->isStr("__va_list_tag") || II->isStr("__builtin_va_list"))
                continue;
    // Import every other top-level declaration into the destination context.
    Importer.Import(*D);
}
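
The loader above reads the AST from a header that defines it as a static array. As a rough illustration of how such a header could be produced from the redistributable.ast file, here is a minimal sketch. The helper name emitByteArray is hypothetical and not part of clang, and the sketch emits unsigned char rather than char, on the assumption that byte values above 0x7f would otherwise be rejected as narrowing conversions inside a braced initializer.

```cpp
#include <cstddef>
#include <iomanip>
#include <istream>
#include <ostream>
#include <string>

// Hypothetical helper (not a clang API): reads every byte from `in` and writes
// a C++ array definition called `name` to `out`, 12 bytes per line.
// Returns the number of bytes written. Note that an empty input produces an
// empty initializer list, which is not a valid array definition.
inline std::size_t emitByteArray(std::istream &in, std::ostream &out,
                                 const std::string &name) {
  // Remember the stream formatting state so it can be restored afterwards.
  const std::ios_base::fmtflags savedFlags = out.flags();
  const char savedFill = out.fill();

  out << "static const unsigned char " << name << "[] = {";
  std::size_t count = 0;
  char c;
  while (in.get(c)) {
    // Start a new indented line every 12 bytes, otherwise separate by a space.
    out << (count % 12 == 0 ? "\n  " : " ");
    out << "0x" << std::hex << std::setw(2) << std::setfill('0')
        << static_cast<unsigned>(static_cast<unsigned char>(c)) << ",";
    ++count;
  }
  out << "\n};\n";

  out.flags(savedFlags);
  out.fill(savedFill);
  return count;
}
```

To use it, open the .ast file with std::ifstream in std::ios::binary mode, pass it as `in`, and write the result to the generated header. With an unsigned char array, the StringRef in the loader would need to be constructed as StringRef(reinterpret_cast<const char *>(ast), sizeof(ast)).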


I am experimenting with the code below to write the AST myself, but its output is not recognized by my load method. The load method above only works for the AST generated by the clang command line.

CompilerInstance ci;

ci.createDiagnostics(new TextDiagnosticPrinter(m_Context.GetDiag(), new DiagnosticOptions()), true);
ci.getLangOpts().C11 = true;
ci.getLangOpts().CPlusPlus11 = true;
ci.getLangOpts().CPlusPlus = true;

ci.getTargetOpts().Triple = llvm::sys::getDefaultTargetTriple();
ci.setTarget(TargetInfo::CreateTargetInfo(ci.getDiagnostics(), &ci.getTargetOpts()));
ci.getPreprocessor().getBuiltinInfo().InitializeBuiltins(ci.getPreprocessor().getIdentifierTable(), ci.getLangOpts());

llvm::StringRef Code(m_Context.pSource, m_Context.SourceSize);

llvm::MemoryBuffer *pSourceBuffer = llvm::MemoryBuffer::getMemBufferCopy(Code);


clang::CodeGenOptions codeGenOptions;

llvm::OwningPtr<clang::CodeGenerator> CG;
assert(CG.get() != nullptr && "could not create CodeGenerator");


ci.createSema(clang::TranslationUnitKind::TU_Complete, nullptr);

ci.getDiagnosticClient().BeginSourceFile(ci.getPreprocessor().getLangOpts(), &ci.getPreprocessor());

clang::ParseAST(ci.getPreprocessor(), &ci.getASTConsumer(), ci.getASTContext());

llvm::BitstreamWriter streamWriter(astBytes);
ASTWriter astWriter(streamWriter);

astWriter.WriteAST(ci.getSema(), std::string(), nullptr, "");

This is by design and will probably be very hard to change. Not only
does Clang want to print diagnostics that reference lines from the
original header file (obviously), but, as far as I remember, it
sometimes needs to re-tokenize something even when there are no errors.