LLVM bindings, scope of llvm-c

Hi all

Last week, I performed some experiments using the Python bindings to LLVM (http://code.google.com/p/llvm-py/). The goal of these experiments was to evaluate how usable LLVM's scripting language bindings are for code analysis and transformations.

I found that the current version of the Python bindings allows loading, generating, JIT compiling and executing LLVM IR. However, it seems to me that the functionality for analyzing and transforming LLVM IR is still very limited. For example, it is possible to iterate over all instructions in a basic block, but I cannot find a way to get information about the instructions themselves, such as the opcode or the operands.

Digging deeper into the bindings and into llvm-c, which is the base for the llvm-py bindings, revealed that this functionality is currently not present in llvm-c. Thus most of the functions in Instruction.h, User.h, etc. cannot be exposed to bindings based on llvm-c.

Evidently, llvm-c could be extended to expose more details of the LLVM IR which would allow more powerful program analysis and transformations. But maybe this negates the design goals for llvm-c.

- Is llvm-c meant to mirror a larger fraction of the C++ interface for language bindings?
- Or is the functionality provided by llvm-c kept intentionally simple for some reason?

If the latter is true, does this mean that llvm-c (and the scripting language bindings using this interface) is not meant for building sophisticated analysis and transformation functions?

I would appreciate it if someone could shed some light on the intended feature set for llvm-c.

Best regards,
  Christian

Disclaimer: I'm still new to LLVM and the scripting language interfaces in particular. It might be that I'm getting something wrong or that I'm talking nonsense.

There could just as easily be language bindings between C++ and
Python. The llvm-py author probably chose the C bindings just because
they were easier, but they are most certainly not required. I know that
is the case in a few other scripting languages as well.

llvm-c could be extended to expose more details of the LLVM IR which would allow more powerful program analysis and transformations.

Yes.

But maybe this negates the design goals for llvm-c.

No.

Is the functionality provided by llvm-c kept intentionally simple for some reason?

They are constrained only by contributions. Patches for additional bindings are welcome.

The only willful omission from the bindings is "detached" instructions. Instructions must be inserted into a basic block at creation time and can only be moved or erased (remove+delete). The same applies to globals, aliases, functions, parameters, basic blocks, etc. This greatly simplifies memory management for garbage collected languages.
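
To illustrate the constraint, here is a minimal toy model in Python. It is only a sketch and not the llvm-c or llvm-py API: the names BasicBlock, Instruction, build, erase and move are hypothetical. The assumption it illustrates is that every instruction is owned by exactly one block at all times, so a garbage collected wrapper never has to decide whether it is responsible for freeing an instruction.

```python
class Instruction:
    """Toy instruction; hypothetical, not part of any LLVM binding."""

    def __init__(self, opcode, parent):
        self.opcode = opcode
        self.parent = parent  # always a BasicBlock while the instruction is alive


class BasicBlock:
    """Toy basic block that owns its instructions."""

    def __init__(self, name):
        self.name = name
        self.instructions = []

    def build(self, opcode):
        # Creation and insertion happen in one step, so no detached
        # instruction ever exists.
        inst = Instruction(opcode, self)
        self.instructions.append(inst)
        return inst

    def erase(self, inst):
        # Remove and delete in one step; the handle must not be used afterwards.
        self.instructions.remove(inst)
        inst.parent = None
        inst.opcode = None


def move(inst, dest):
    # Moving transfers ownership directly from one block to another;
    # the instruction is never left without a parent.
    inst.parent.instructions.remove(inst)
    dest.instructions.append(inst)
    inst.parent = dest


entry = BasicBlock("entry")
exit_block = BasicBlock("exit")
add = entry.build("add")
ret = entry.build("ret")
move(ret, exit_block)   # ret now belongs to exit_block
entry.erase(add)        # add is removed and invalidated together
```

In this model there is no operation that returns an instruction without a parent, which is the property described above.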

-- Gordon

Is the functionality provided by llvm-c kept intentionally simple
for some reason?

They are constrained only by contributions. Patches for additional
bindings are welcome.

I see. In summary, the answer to my question seems to be that llvm-c is not restricted on purpose (with the exception of detached instructions), but functions are added on demand rather than in an attempt to mirror the whole C++ API in C.

Thanks to everyone for your answers. This will help me find the best way to continue my work.

Best regards,
  Christian