GSoC Proposal: Language bindings via SWIG

Hi,

I've been lurking around the LLVM project for a couple of months now.
The two recent threads about Python bindings for LLVM ([1] and [2]),
combined with the fact that I am looking for a GSoC project at the
moment, led to the idea of making the "public" parts of LLVM
SWIG[3]-friendly and basing a set of Python bindings on this.

My reasoning for doing it this way is that it allows reuse of the SWIG
annotations for bindings to other languages, such as Perl and Java.

What I would like to do for the GSoC is:
1) Annotate the necessary parts of LLVM for processing by SWIG.
2) Use SWIG to generate a Python wrapper around LLVM.
3) Add a (hopefully) thin layer of Python code to complete the bindings.

Before submitting an application for this project, I would like to know
if there is interest in this within the LLVM project, and whether
this is already being worked on.

Regards,
  Søren Bøg

[1] http://lists.cs.uiuc.edu/pipermail/llvmdev/2008-March/013318.html
[2] http://lists.cs.uiuc.edu/pipermail/llvmdev/2008-March/013171.html
[3] http://www.swig.org/

If SWIG can be made to do a good job with Python/Ruby/Perl etc. bindings around LLVM, I would be very interested in this. I’m personally interested in seeing both Python and Ruby bindings, and in having them be as easily maintained as possible. I think it would be interesting to see what the SWIG-style solution can do in this direction as opposed to the C-binding approach. If it results in better bindings, or bindings with lower maintenance and development cost, for specific target languages, I’m all for it.

-Chandler

To anyone on #llvm I'm sure I'm starting to sound like a broken record,
but I'd just like to point out that for Python bindings at least, you
can quite easily manipulate the LLVM infrastructure via ctypes as a
shared object / DLL -- no C required! Those of us interested in talking
to LLVM from Lisps, either Common Lisp (via CFFI) or a Scheme like
PLT/MzScheme, can also use the shared library interface. In fact, for
the Common Lisp case, this is the _only portable_ (cross-implementation)
way of talking to LLVM.
      
If anyone's interested in

Here is a one-off example of Python using the shared library approach
and Gordon's excellent C interface (llvm-c):

http://pastebin.com/m5197c5e7

(the verbosity at the beginning is because of some linkage issues with
the llvm SOs)
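
As a rough sketch of the ctypes mechanics described above (this is not the
pastebin example), the following loads a shared library and calls a C
function from Python with no C glue. It uses the C runtime as a stand-in,
because the name and location of the LLVM shared objects depend on the
build; the choice of library and the `strlen` call are assumptions made
purely for illustration, and a Unix-like system is assumed.

```python
import ctypes
import ctypes.util

# Stand-in library: the C runtime. For LLVM this would instead be the path
# to the llvm-c shared object, which depends on the build (assumption).
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# Declare the C signature so ctypes converts arguments and results correctly.
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t

# Call straight into the shared object from Python.
length = libc.strlen(b"llvm")
print(length)
```

The same pattern (load the library, declare `argtypes` and `restype`, call)
would apply to any C entry point exported from a shared object.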
    ...Eric

> To anyone on #llvm I’m sure I’m starting to sound like a broken record,
> but I’d just like to point out that for python bindings at least, you
> can quite easily manipulate the LLVM infrastructure via ctypes as a
> shared object / dll – no C required!

I think it is quite worthwhile to have a “native” binding in the language: that is, one which meshes to a greater extent with the language’s native object model (or provides a functional model as appropriate), naming, and style conventions. That is why I’m particularly curious about a SWIG approach and the possibility of lowering the effort involved. Basically, it’s often not worth building bindings by hand, but if SWIG can do it automatically and for multiple languages, that might be worthwhile. Just my 2 cents. This is in no way to say I don’t really, really like the C bindings for FFI-based interfaces. =]

-Chandler

> but I'd just like to point out that for python bindings at least, you
> can quite easily manipulate the LLVM infrastructure via ctypes as a
> shared object / dll -- no C required! Those of us interested in talking

ctypes brings with it its own troubles. To use it in any non-trivial way,
one must write a fair amount of non-trivial wrapper code in Python.
Performance is also affected (though I haven't measured it myself).

Boost.Python[1] is another option.

All said, it is not so difficult to write Python extensions in C (which
probably is one reason so many glue tools exist?).

Regards,
-MD.

[1] http://www.boost.org/libs/python/doc/index.html

> but I'd just like to point out that for python bindings at least, you
> can quite easily manipulate the LLVM infrastructure via ctypes as a
> shared object / dll -- no C required! Those of us interested in talking

> ctypes brings with it its own troubles. To use it in any non-trivial way,
> one must write a fair amount of non-trivial wrapper code in Python.
> Performance is also affected (though I haven't measured it myself).

In my experience, even when combining two languages with fairly similar
object-orientation semantics, like Python and C++, you always have to
write a decent chunk of glue code. When the semantics are wildly
different, the amount of glue code becomes non-trivial. The question
becomes: do you want to write your bindings in a language like C, or in
your native language (which you ostensibly love more than C anyway, or
you probably wouldn't be creating the bindings in the first place)?
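
To make the "glue code in your native language" option concrete, here is
a minimal sketch of a Python class that wraps raw ctypes calls and takes
over the lifetime of a C resource. It does not touch LLVM: the class name
`CBuffer` is hypothetical, and the C runtime's `malloc`, `memset`, and
`free` stand in for whatever create/use/dispose functions a real C
interface would export. A Unix-like system is assumed.

```python
import ctypes
import ctypes.util

# Stand-in for a C library with create/use/dispose functions (assumption).
libc = ctypes.CDLL(ctypes.util.find_library("c"))
libc.malloc.argtypes = [ctypes.c_size_t]
libc.malloc.restype = ctypes.c_void_p
libc.memset.argtypes = [ctypes.c_void_p, ctypes.c_int, ctypes.c_size_t]
libc.memset.restype = ctypes.c_void_p
libc.free.argtypes = [ctypes.c_void_p]
libc.free.restype = None


class CBuffer:
    """Hypothetical wrapper that owns a raw C allocation."""

    def __init__(self, size):
        self._size = size
        self._ptr = libc.malloc(size)
        if not self._ptr:
            raise MemoryError("malloc failed")

    def fill(self, byte):
        # Forward to the C function; the raw pointer never leaks out.
        libc.memset(self._ptr, byte, self._size)

    def to_bytes(self):
        # Copy the C memory into an ordinary Python bytes object.
        return ctypes.string_at(self._ptr, self._size)

    def close(self):
        # Release the C resource exactly once.
        if self._ptr:
            libc.free(self._ptr)
            self._ptr = None

    def __enter__(self):
        return self

    def __exit__(self, *exc):
        self.close()


with CBuffer(4) as buf:
    buf.fill(ord("x"))
    data = buf.to_bytes()
print(data)
```

Even for three C functions, most of the lines are glue: signature
declarations, ownership, and conversion to native types.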

Even SWIG isn't a panacea, and still forces you to write a layer on top
of the resulting SWIG objects if you want an interface that looks
appropriate for your native language.

> Boost.Python[1] is another option.

Boost.Python is an awesome tool for wrapping simple projects, and I've
used it in the past for four or five somewhat-complex object
hierarchies. We had many object lifecycle issues that ended up rendering
it unusable with our existing high-performance code (some numeric, some
network). Also, it does all manner of C++ template magic, and when it
breaks / fails to do what you want it to / changes in a subtle way
between Boost revisions, god help you.

> All said, it is not so difficult to write Python extensions in C (which
> probably is one reason so many glue tools exist?).

I'd argue that Python's dog-slow performance in some areas ("Help me,
PyPy project, you're my only hope!") coupled with Python coders'
reluctance to touch C with a ten-foot pole also contributes to the
proliferation of these tools :)

And of course, all I'm really arguing for here is for the LLVM build
process (which, after an hour, I still don't understand) to build the
shared libraries correctly so that I can use it from Lisp :)

      ...Eric