Using Clang to parse headers, mangle names, lower C types

Hi cfe-dev,

I have a project where I’m interested in parsing header files to correlate their declarations with elements of compiled programs (which have no symbols). The programs are not necessarily built for Clang’s native/default target architecture (though I’m happy to bail out if Clang doesn’t know about the architecture at all).

I’m linking directly against the Clang static libraries instead of libclang for this (since libclang doesn’t leak LLVM and I’m interested in the LLVM-lowered representation of C types).

Looking at the libclang code, I was able to parse headers without significant problems. This is what I think is relevant from the code; you can essentially assume that anything not shown here only has default values:

const char* clangParameters = {

true, // only local declarations
true, // capture diagnostics
false, // remapped files keep original name
0, // do not precompile preamble
false, // cache completion results
false, // include comments in code completion
false, // allow PCH with errors
true, // skip function bodies
true, // user files are volatile
false)); // for serialization

Problems start with name mangling: I develop on OS X, and right now I’m interested in the mangled names of symbols on Linux. Not too surprisingly, this very vanilla Clang invocation mangles names for OS X (it prefixes C names with an underscore). I create the mangler with index::CodegenNameGenerator mangler(result->tu->getASTContext()) and then get the name with mangler.getName(fn). I tried adding the arguments “-triple”, “x86_64-pc-linux” to the invocation (without changing anything else), but I still get OS X mangling. Any pointers?
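
To make the difference concrete, here is a minimal sketch of the prefixing behaviour I mean. It does not use Clang at all: ObjectFormat and mangleCName are hypothetical names made up for this illustration, and it only models the leading underscore that Mach-O targets put on C symbols, not C++ mangling.

```cpp
#include <string>

// Hypothetical stand-in for the object-file format implied by a target triple.
enum class ObjectFormat { MachO, ELF };

// Hypothetical helper (not a Clang API): applies the platform's global
// symbol prefix to a plain C name.
// Mach-O targets (OS X) prepend '_' to C symbols; ELF targets (Linux) do not.
std::string mangleCName(const std::string &name, ObjectFormat format) {
    if (format == ObjectFormat::MachO)
        return "_" + name;
    return name;
}
```

With this, mangleCName("puts", ObjectFormat::MachO) gives "_puts", which is what I currently get, while mangleCName("puts", ObjectFormat::ELF) gives "puts", which is what I want for Linux.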

Finally, I’ll eventually want to lower Clang types to LLVM types. It looks like the CodeGenTypes class does that. Unfortunately, this class is not exposed in library headers, which leads me to believe that I’m not exactly supposed to go that route directly. Is there a good way to do it?