Reusing CompilerInstance (parsing separate expressions)

Hi. I'm using clang as a C header importer for my own programming
language. The problem is that C programmers tend to use the
preprocessor to define constants in their libraries instead of enums,
so to import that information from a C header I need to parse each
macro definition and see whether it is a constant expression that can
be evaluated to a value.

That's easy when the macro is something simple like:
  #define BYTE_MAX 0xFF

in that case I can simply take that single token and feed it to
NumericLiteralParser, but when it comes to bit masks and various
complex expressions like:
  #define BIT1 (1 << 0)
  #define BIT2 (1 << 1)
  #define BIT3 (1 << 2)

I need a complete parser. So I simply mock up a function containing a
single expression statement and feed it to ParseAST using a previously
created CompilerInstance. Then, using a custom ASTConsumer, I find my
expression and call Evaluate on it.

(All of that is done from PPCallbacks, of course.)
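To make the mock-up step concrete, here is a minimal standalone sketch of how the text fed to the parser is built. makeMockUnit is a hypothetical name used only for illustration (the real hook uses its own cppsprintf helper), and the sketch assumes the macro body is already available as a string:

```cpp
#include <string>

// Hypothetical helper: wrap a macro body in a dummy function so that a
// full C parser sees it as a single expression statement.
std::string makeMockUnit(const std::string &macroBody) {
    return "void foo() { " + macroBody + "; }\n";
}
```

For the body of BIT1 above, this yields the one-line unit "void foo() { (1 << 0); }".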

It all works just fine. I tested it on SDL/SDL.h, which contains about
700 numeric definitions (including all the garbage like HAVE_STDIO_H),
and about 350 of them are complex (more than one token).

Anyway, the problem is that it is slow. Creating a new CompilerInstance
each time just to parse a single expression is inefficient, I guess. It
takes about 400ms for SDL/SDL.h (without the constant macro extractor
it takes about 60ms for SDL/SDL.h and all its declarations). Applying a
hack that checks whether the macro consists of a single token of kind
tok::numeric_constant brings that down to about 260ms for a full
SDL/SDL.h parse.
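The single-token shortcut boils down to recognising a lone integer literal. Below is a rough standalone sketch of that check. It assumes plain C integer literals with optional u/l suffixes. parseIntLiteral is a made-up helper, not clang's NumericLiteralParser, and it deliberately rejects floats, character literals and anything with more than one token:

```cpp
#include <cerrno>
#include <cstdlib>
#include <string>

// Hypothetical helper: parse a C integer literal such as "0xFF", "42u"
// or "010". Returns false for anything else (floats, expressions, ...).
bool parseIntLiteral(const std::string &text, unsigned long long &out) {
    // A numeric literal must start with a digit; this also rejects
    // leading whitespace and signs, which strtoull would accept.
    if (text.empty() || text[0] < '0' || text[0] > '9')
        return false;

    errno = 0;
    char *end = nullptr;
    // Base 0 lets strtoull pick hex (0x...), octal (0...) or decimal.
    unsigned long long value = std::strtoull(text.c_str(), &end, 0);
    if (end == text.c_str() || errno == ERANGE)
        return false;

    // Whatever follows the digits must be an integer suffix (u, U, l, L).
    for (const char *p = end; *p != '\0'; ++p) {
        if (*p != 'u' && *p != 'U' && *p != 'l' && *p != 'L')
            return false;
    }
    out = value;
    return true;
}
```

Anything this helper rejects would still have to go through the slow path with the full parser.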

The question is: is it possible to reuse CompilerInstance in that
scenario? Or maybe there is another good way to accomplish what I'm
trying to do?
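One direction that might be worth measuring is to collect all candidate macro bodies first and parse them in a single mock translation unit, so the setup cost is paid once instead of once per macro. This is only an untested idea, not something I have tried against clang. The sketch below shows just the text-building part. buildBatchedUnit and the macro_N naming scheme are hypothetical; the numbering is there so a consumer could map each function back to its macro:

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: emit one dummy function per (name, body) pair so
// that a single parse covers every candidate macro. The index in the
// function name identifies the macro it came from.
std::string buildBatchedUnit(
    const std::vector<std::pair<std::string, std::string>> &macros) {
    std::string unit;
    for (std::size_t i = 0; i < macros.size(); ++i) {
        unit += "void macro_" + std::to_string(i) + "() { " +
                macros[i].second + "; }\n";
    }
    return unit;
}
```

Whether a parse error in one body would spoil the neighbouring functions is something I can't tell without trying it.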

This is what my MacroDefined hook looks like:

void MacroDefined(const Token &name, const MacroInfo *mi)
{
  // we support only zero arg macros, which are constant
  // definitions, most likely
  if (mi->getNumArgs() != 0)
    return;

  if (mi->tokens_empty())
    return;

  // for some reason mi->isBuiltinMacro() doesn't work
  if (strcmp("<built-in>", srcm->getPresumedLoc(mi->getDefinitionLoc()).getFilename()) == 0)
    return;

  if (mi->getNumTokens() == 1 && mi->tokens_begin()->getKind() == tok::numeric_constant) {
    // TODO: use NumericLiteralParser
    const Token *tok = mi->tokens_begin();
    llvm::StringRef nm = name.getIdentifierInfo()->getName();
    // %.*s expects an int precision, hence the casts
    printf("const %.*s = %.*s;\n", (int)nm.size(), nm.data(),
           (int)tok->getLength(), tok->getLiteralData());
    return;
  }

  MacroInfo::tokens_iterator first, last;
  first = mi->tokens_begin();
  last = mi->tokens_end() - 1;

  SourceLocation beg = first->getLocation();
  SourceLocation end = last->getLocation();

  const char *s = srcm->getCharacterData(beg);
  const char *e = srcm->getCharacterData(end) + last->getLength();

  // %.*s expects an int precision, so cast the pointer difference
  cppsprintf(&tunit, "void foo() { %.*s; }\n", (int)(e - s), s);

  CompilerInstance ci;
  ci.getTargetOpts().Triple = llvm::sys::getHostTriple();
  ci.createDiagnostics(0, 0);


  llvm::MemoryBuffer *mb = llvm::MemoryBuffer::getMemBuffer(tunit, "macrodef.c");

  ConstantExprExtractor consumer;
  consumer.ctx = &ci.getASTContext();

  ParseAST(ci.getPreprocessor(), &consumer, ci.getASTContext());

  llvm::SmallVector<char, 128> tmp;
  // done, let's see what we got here
  APValue &v =;
  switch (v.getKind()) {
  case APValue::Int:
  case APValue::Float:

  if (!tmp.empty()) {
    llvm::StringRef nm = name.getIdentifierInfo()->getName();
    printf("const %.*s = %.*s;\n", (int)nm.size(), nm.data(), (int)tmp.size(), &tmp[0]);