Reusing CompilerInstance (parsing separate expressions)

Hi. I'm using clang as a C header importer for my own programming
language. The problem is that C programmers tend to use the
preprocessor to define constants in their libraries instead of enums,
so in order to import that information from a C header I need to parse
each macro definition and see whether it is a constant expression that
can be evaluated to a value.

This is easy if the macro is something simple like:
  #define BYTE_MAX 0xFF

In that case I can simply take that single token and feed it to
NumericLiteralParser. But when it comes to bit masks and other
complex expressions like:
  #define BIT1 (1 << 0)
  #define BIT2 (1 << 1)
  #define BIT3 (1 << 2)

I need a complete parser. So I simply mock up a function containing a
single expression statement and feed it to ParseAST, using a
CompilerInstance created just for that purpose. Then, with a custom
ASTConsumer, I find my expression and call Evaluate on it.

(All of that is done from PPCallbacks, of course.)

It all works just fine. I tested it on SDL/SDL.h, which contains about
700 numeric definitions (including garbage like HAVE_STDIO_H). About
350 of them are complex (more than one token).

Anyway, the problem is that it is slow. Creating a new
CompilerInstance each time just to parse a single expression is
inefficient, I guess. It takes about 400ms for SDL/SDL.h (without the
constant macro extractor it takes about 60ms for SDL/SDL.h and all its
declarations). Applying a hack that checks whether the macro consists
of a single token of kind tok::numeric_constant brings that down to
about 260ms for full SDL/SDL.h parsing.

The question is: is it possible to reuse a CompilerInstance in that
scenario? Or is there maybe another good way to accomplish what I'm
trying to do?

This is what my MacroDefined hook looks like:

void MacroDefined(const Token &name, const MacroInfo *mi)
{
  // we support only object-like macros, which are most likely
  // constant definitions (this also skips zero-arg macros like FOO())
  if (mi->isFunctionLike())
    return;

  if (mi->tokens_empty())
    return;

  // for some reason mi->isBuiltinMacro() doesn't work
  if (strcmp("<built-in>", srcm->getPresumedLoc(mi->getDefinitionLoc()).getFilename()) == 0)
    return;

  if (mi->getNumTokens() == 1 && mi->tokens_begin()->getKind() == tok::numeric_constant) {
    // TODO: use NumericLiteralParser
    const Token *tok = mi->tokens_begin();
    llvm::StringRef nm = name.getIdentifierInfo()->getName();
    // the '*' precision in %.*s expects an int, not a size_t
    printf("const %.*s = %.*s;\n", (int)nm.size(), nm.data(),
           (int)tok->getLength(), tok->getLiteralData());
    return;
  }

  MacroInfo::tokens_iterator first, last;
  first = mi->tokens_begin();
  last = mi->tokens_end() - 1;

  SourceLocation beg = first->getLocation();
  SourceLocation end = last->getLocation();

  const char *s = srcm->getCharacterData(beg);
  const char *e = srcm->getCharacterData(end) + last->getLength();

  tunit.clear();
  cppsprintf(&tunit, "void foo() { %.*s; }\n", (int)(e - s), s);

  CompilerInstance ci;
  ci.getTargetOpts().Triple = llvm::sys::getHostTriple();
  ci.createDiagnostics(0, 0);
  ci.setTarget(TargetInfo::CreateTargetInfo(ci.getDiagnostics(),
              ci.getTargetOpts()));
  ci.getDiagnostics().setSuppressAllDiagnostics();

  ci.createFileManager();
  ci.createSourceManager(ci.getFileManager());
  ci.createPreprocessor();
  ci.createASTContext();

  llvm::MemoryBuffer *mb = llvm::MemoryBuffer::getMemBuffer(tunit, "macrodef.c");
  ci.getSourceManager().createMainFileIDForMemBuffer(mb);

  ConstantExprExtractor consumer;
  consumer.ctx = &ci.getASTContext();

  ParseAST(ci.getPreprocessor(), &consumer, ci.getASTContext());

  llvm::SmallVector<char, 128> tmp;
  // done, let's see what we got here
  APValue &v = consumer.er.Val;
  switch (v.getKind()) {
  case APValue::Int:
    v.getInt().toString(tmp);
    break;
  case APValue::Float:
    v.getFloat().toString(tmp);
    break;
  default:
    break;
  }

  if (!tmp.empty()) {
    llvm::StringRef nm = name.getIdentifierInfo()->getName();
    printf("const %.*s = %.*s;\n", (int)nm.size(), nm.data(),
           (int)tmp.size(), &tmp[0]);
  }
}