Python bindings?

Hi,

Are there Python bindings for LLVM?

Apparently there was one ~2005; has this been updated since? Is anyone
working on this?

Is the LLVM dev community interested in this?

Thanks & Regards,
-Mahadevan.

Mahadevan R wrote:

> Hi,
>
> Are there Python bindings for LLVM?
>
> Apparently there was one ~2005; has this been updated since? Is anyone
> working on this?
>
> Is the LLVM dev community interested in this?

I'm curious. What would you use this for?

Hi Nick,

>> Are there Python bindings for LLVM?
>
> I'm curious. What would you use this for?

Mainly because it would make it easier to play
around with the LLVM APIs and to create toy
languages like Kaleidoscope (from the tutorial).

For LLVM development itself, it could perhaps
also be used to write unit/regression test
scripts.

It's also a good way to learn LLVM :wink:

Regards,
-Mahadevan.

> Are there Python bindings for LLVM?

I'm not aware of any. The PyPy compiler pipes LLVM assembly to llc rather than building the C++ IR in memory.
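As a rough illustration of the "pipe LLVM assembly to llc" approach, here is a minimal Python sketch. It is not PyPy's actual code: the helper names `build_add_function_ir` and `compile_with_llc` are made up for this example, and it assumes an `llc` executable on the PATH that reads IR from standard input and writes assembly to standard output when given `-o -`.

```python
import subprocess


def build_add_function_ir() -> str:
    """Return textual LLVM IR for a function that adds two 32-bit integers."""
    lines = [
        "define i32 @add(i32 %a, i32 %b) {",
        "entry:",
        "  %sum = add i32 %a, %b",
        "  ret i32 %sum",
        "}",
    ]
    return "\n".join(lines) + "\n"


def compile_with_llc(ir_text: str, llc_path: str = "llc") -> str:
    """Pipe IR text to llc on stdin and return the assembly it prints.

    Assumption: llc is installed; with no input file it reads stdin,
    and "-o -" sends the result to stdout.
    """
    result = subprocess.run(
        [llc_path, "-o", "-"],
        input=ir_text,
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if llc rejects the IR
    )
    return result.stdout
```

Calling `compile_with_llc(build_add_function_ir())` should return native assembly on a machine with LLVM installed. The trade-off is that the front end only ever handles strings, so it needs no binding to the C++ IR classes at all.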

> Apparently there was one ~2005; has this been updated since? Is anyone working on this?
>
> Is the LLVM dev community interested in this?

Yes!

Note that C bindings have been introduced since 2005, so there may be a different route available than was taken then. Look in include/llvm-c. The intent of the C bindings is to enable high-level language bindings. The current focus is on enabling front-end compilers. Ocaml and Haskell bindings have been developed atop them, the former being in the LLVM source tree.

— Gordon

> Note that C bindings have been introduced since 2005, so there may be
> a different route available than was taken then. Look in
> include/llvm-c. The intent of the C bindings is to enable high-level
> language bindings. The current focus is on enabling front-end
> compilers. Ocaml and Haskell bindings have been developed atop them,
> the former being in the LLVM source tree.

1)
Are the C bindings complete? That is, is there some part of the C++ API
that is not exposed by the C API?

2)
Do the Ocaml/Haskell bindings follow each language's own naming conventions,
or LLVM's? For example, Python method names are usually written like_this. So
which of these is preferred:

  Builder.set_insert_point()

or

  Builder.SetInsertPoint()

?

Regards,
-MD.
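To make the naming question concrete, here is a small Python sketch of how a binding layer could derive like_this names from LLVM-style CamelCase names. The helper `to_snake_case` is purely illustrative and is not taken from any existing binding.

```python
import re

# Word boundaries: lower/digit followed by upper ("tI" in SetInsert),
# or the end of an acronym followed by a capitalized word ("MC" in LLVMContext).
_BOUNDARY = re.compile(r"(?<=[a-z0-9])(?=[A-Z])|(?<=[A-Z])(?=[A-Z][a-z])")


def to_snake_case(name: str) -> str:
    """Convert a CamelCase or camelCase identifier to snake_case."""
    return _BOUNDARY.sub("_", name).lower()


print(to_snake_case("SetInsertPoint"))
print(to_snake_case("createPromoteMemoryToRegisterPass"))
```

A mechanical mapping like this keeps the Python names predictable for anyone reading the C++ documentation, although a binding is free to pick different names where that reads better.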

> 1)
> Are the C bindings complete? That is, is there some part of the C++ API
> that is not exposed by the C API?

Nope, there's still a lot that's not done. Patches are always welcome
:slight_smile: We've got enough in subversion to implement the Kaleidoscope
tutorial though.

> 2)
> Do the Ocaml/Haskell bindings follow each language's own naming conventions,
> or LLVM's? For example, Python method names are usually written like_this. So
> which of these is preferred:
>
>   Builder.set_insert_point()
>
> or
>
>   Builder.SetInsertPoint()

I can't speak for the Haskell bindings, but the OCaml bindings do not
follow LLVM's naming. We use the lowercase/underscore format
traditionally used in OCaml projects. We don't need to bind all of the
helper functions and methods, so the API can be kept a little smaller.
Functions might also be named differently, and their semantics can be
changed. For instance, the C++ function
"createPromoteMemoryToRegisterPass" creates a Pass object that we can
add to a PassManager, but in OCaml we have
"add_memory_to_register_promotion", which takes a PassManager as an
argument and adds the pass inside the binding. This makes memory
management a bit simpler.
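The "create and add inside the binding" design described above could look roughly like this in Python. This is a pure-Python mock, not a real binding: the `PassManager` class and its string-based pass list are hypothetical stand-ins, and only the function name is borrowed from the OCaml binding.

```python
class PassManager:
    """Hypothetical stand-in for a pass manager that owns its passes."""

    def __init__(self):
        self._passes = []

    def _add(self, pass_name):
        # In a real binding this would hand a native pass to LLVM.
        self._passes.append(pass_name)

    def pass_names(self):
        """Return a copy of the names of the passes added so far."""
        return list(self._passes)


def add_memory_to_register_promotion(pass_manager):
    """Create the pass and register it in one step.

    No separate pass object is ever returned to the caller, so there is
    no handle whose ownership the caller could get wrong.
    """
    pass_manager._add("promote-memory-to-register")


pm = PassManager()
add_memory_to_register_promotion(pm)
print(pm.pass_names())
```

Because the pass never exists outside the manager from the caller's point of view, the binding has no free-standing Pass handle to track or free.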

>> 1)
>> Are the C bindings complete? That is, is there some part of the C++ API that is not exposed by the C API?
>
> Nope, there's still a lot that's not done. Patches are always welcome :slight_smile: We've got enough in subversion to implement the Kaleidoscope tutorial though.

I like to think we're binding with a goal. :slight_smile: So when a need is satisfied, it's “done” for that need. As Erick well knows, I'm happy to accept patches to extend the bindings into any areas of interest.

Much of LLVM's functionality is either truly or effectively private, so most definitions of “everything” are either pointless or unsatisfiable.

>> 2)
>> Do the Ocaml/Haskell bindings follow each language's own naming conventions,
>> or LLVM's? For example, Python method names are usually written like_this. So
>> which of these is preferred:
>>
>>   Builder.set_insert_point()
>>
>> or
>>
>>   Builder.SetInsertPoint()
>
> I can't speak for the Haskell bindings, but the OCaml bindings do not follow LLVM's naming. We use the lowercase/underscore format traditionally used in OCaml projects.

Tradition, schmadition; capitalization is part of the Objective Caml syntax. :stuck_out_tongue:

> We don't need to bind all of the helper functions and methods so the api can be kept a little smaller. They also might be named differently and the semantics can be changed. For instance, the function "createPromoteMemoryToRegisterPass" creates a Pass object that we can add to a PassManager, but in ocaml we have "add_memory_to_register_promotion" which takes a PassManager as an argument and adds it inside the binding. This makes memory management a bit simpler.

Something that the C bindings provide that I think is actually valuable is a considerably simplified memory ownership model, particularly for the IR. The C++ side doesn't participate in GC, so it is not knowable whether an object needs to be disposed of when the managed handle to it becomes garbage.

Since the C bindings don't expose the features that allow for the creation of detached IR objects, 'delete Module' is guaranteed to clean up any IR objects properly. Such detached objects are not especially useful to compiler front-ends. The bulk of memory management simply disappears from both bindings and client code using this model. In particular, bindings don't need to implement finalizers.

— Gordon
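The ownership model Gordon describes can be sketched in plain Python. This is only a mock under stated assumptions: `Module`, `Function`, `add_function`, `dispose` and `is_valid` are hypothetical names for this illustration, not the API of any real binding. The point is that IR objects are only ever created through their module, so disposing the module is the single cleanup step and no handle needs a finalizer.

```python
class Function:
    """Hypothetical handle to a function that lives inside a Module.

    Note that there is no __del__: the handle owns nothing.
    """

    def __init__(self, name, module):
        self.name = name
        self._module = module
        self._alive = True

    @property
    def is_valid(self):
        """True until the owning module has been disposed."""
        return self._alive


class Module:
    """Hypothetical owner of every IR object created through it."""

    def __init__(self, name):
        self.name = name
        self._functions = {}
        self._disposed = False

    def add_function(self, name):
        # The only way to obtain a Function, so none is ever detached.
        if self._disposed:
            raise RuntimeError("module has already been disposed")
        function = Function(name, self)
        self._functions[name] = function
        return function

    def dispose(self):
        """Release the module and everything it owns in one step."""
        for function in self._functions.values():
            function._alive = False
        self._functions.clear()
        self._disposed = True
```

In a real binding, `dispose` would presumably forward to the native module destructor, and handles such as `Function` would stay thin wrappers with no cleanup logic of their own.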