LLVM IR Question

I've just stumbled upon clang and have been reading the documentation
and such and I have a couple of questions that I'm hoping someone can
answer and save me some time.

I am playing around with converting C/C++ files into a format that is
very similar to the LLVM IR output; however, I still need some
information that I believe is only found in the AST. Can clang give me
the ability to iterate through the LLVM IR instructions and still
reference the original source lines in the AST?

If that is possible, how exactly do I access this information? I've
looked at the clang-interpreter example, and it is similar to what I had
in mind, except that my interpreter would output my own file format.

Do I have to create my own CodeGenAction or something else? I've only
just started to look at the code.

Thanks,

Kyle

> I am playing around with converting C/C++ files into a format that is
> very similar to the LLVM IR output; however, I still need some
> information that I believe is only found in the AST. Can clang give me
> the ability to iterate through the LLVM IR instructions and still
> reference the original source lines in the AST?

No, not really. However, you can add your own code to the Clang "codegen" library, which is the part responsible for building LLVM IR from the AST. This gives you access to both at the same time.

> If that is possible, how exactly do I access this information? I've
> looked at the clang-interpreter example, and it is similar to what I had
> in mind, except that my interpreter would output my own file format.

When debug information is enabled (for example, with -g), clang attaches debug metadata to the instructions it generates. This only gets you file, line, and column information, though, not a full mapping back to the AST.
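As a rough illustration of how far the debug metadata gets you, here is a minimal sketch that reads textual IR (the kind of output `clang -g -S -emit-llvm` produces) and pairs each instruction with the line and column from its `!dbg` attachment. The sample IR is trimmed by hand, and the helper name `source_locations` is made up for this example. A real tool would more likely walk the module through the LLVM C++ API than use regular expressions on the text.

```python
import re

# Hand-trimmed sample of textual IR with debug info (an assumption for this
# sketch; real clang output carries many more metadata nodes).
SAMPLE_IR = '''\
define i32 @add(i32 %a, i32 %b) !dbg !4 {
entry:
  %sum = add nsw i32 %a, %b, !dbg !7
  ret i32 %sum, !dbg !8
}

!4 = distinct !DISubprogram(name: "add", scope: !1, file: !1, line: 3, unit: !0)
!7 = !DILocation(line: 4, column: 14, scope: !4)
!8 = !DILocation(line: 4, column: 3, scope: !4)
'''

# "!7 = !DILocation(line: 4, column: 14, ...)"; the column may be omitted.
LOCATION = re.compile(r'^!(\d+) = !DILocation\(line: (\d+)(?:, column: (\d+))?')
# An indented instruction line ending in ", !dbg !N".
ATTACHMENT = re.compile(r'^\s+(.*?), !dbg !(\d+)')


def source_locations(ir_text):
    """Return [(instruction_text, (line, column)), ...] in IR order."""
    # First pass: collect every DILocation node by its metadata number.
    locations = {}
    for line in ir_text.splitlines():
        m = LOCATION.match(line)
        if m:
            locations[int(m.group(1))] = (int(m.group(2)), int(m.group(3) or 0))

    # Second pass: resolve each instruction's !dbg reference.
    result = []
    for line in ir_text.splitlines():
        m = ATTACHMENT.match(line)
        if m and int(m.group(2)) in locations:
            result.append((m.group(1).strip(), locations[int(m.group(2))]))
    return result


if __name__ == "__main__":
    for instruction, (line, column) in source_locations(SAMPLE_IR):
        print(f"{line}:{column}  {instruction}")
```

Note that the result is only a position in the source file. Getting from that position back to a particular AST node is a separate step that the debug info does not cover.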

-Chris

To add to what Chris said:

LLVM IR now supports arbitrary metadata, so it is possible for you to modify Clang's CodeGen library to annotate particular bits of LLVM IR with information from the AST that isn't part of the debug info, if you want to.

In some cases, this information might be useful for other IR consumers. For example, we annotate GNU runtime ObjC message sends with various things that are used for optimisations later (e.g. whether it's a class message, the assumed class of the receiver). There are no general hooks for inserting this information, so you'll have to use a patched version of the codegen lib, or alternatively work directly with either the AST or IR.
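To make the consumer side of this concrete, here is a small sketch that reads a custom metadata attachment back out of textual IR. It assumes a patched CodeGen that tags instructions with a kind called `example.ast` whose node is a tuple of strings. That kind name, the tuple layout, and the helper `custom_annotations` are all hypothetical and chosen only for illustration. The patch that would emit the metadata is not shown.

```python
import re

# Hypothetical IR as a patched clang might emit it: the call carries an
# "example.ast" attachment naming the AST node kind and its source text.
SAMPLE_IR = '''\
define void @f(ptr %obj) {
entry:
  call void @g(ptr %obj), !example.ast !1
  ret void
}

!1 = !{!"CallExpr", !"g(obj)"}
'''

# ", !kind !N" attachments on an instruction line.
ATTACHMENT = re.compile(r', !([A-Za-z_][\w.]*) !(\d+)')
# "!N = !{ ... }" metadata tuples.
TUPLE = re.compile(r'^!(\d+) = !\{(.*)\}$')
# Metadata strings such as !"CallExpr" inside a tuple.
MD_STRING = re.compile(r'!"([^"]*)"')


def custom_annotations(ir_text, kind):
    """Return [(instruction_text, [strings...]), ...] for one metadata kind."""
    # Collect the string operands of every metadata tuple.
    tuples = {}
    for line in ir_text.splitlines():
        m = TUPLE.match(line)
        if m:
            tuples[int(m.group(1))] = MD_STRING.findall(m.group(2))

    found = []
    for line in ir_text.splitlines():
        if not line.startswith('  '):  # only indented instruction lines
            continue
        for name, node in ATTACHMENT.findall(line):
            if name == kind and int(node) in tuples:
                instruction = line.split(', !', 1)[0].strip()
                found.append((instruction, tuples[int(node)]))
    return found


if __name__ == "__main__":
    for instruction, fields in custom_annotations(SAMPLE_IR, "example.ast"):
        print(instruction, "->", fields)
```

Keeping the annotation as plain strings in a tuple means any IR consumer can read it without linking against clang, at the cost of having to decide up front which AST facts to serialise.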

David