JVM bytecode generation vs. LLVM

Sorry if I'm repeating something that was already said.

I was just thinking "why the heck do I seem headed for JVM generation if what I want to use is LLVM", and this is the result:

I'm coming from a Java background. I'm using Eclipse, I'm used to the syntax highlighting, cross referencing and refactoring support that Eclipse offers.
I know I will want to have the same infrastructure for my language, and I want it written in my language. I WILL need a JVM backend, no matter what.

Now, I'd still love to use LLVM. It has a lot to offer for the phases "above" code generation. I don't need register allocation, but I'd like to make use of common constant elimination, loop unrolling, inlining, or the pass management infrastructure; that's a whole lot of code I don't need to write.
And when it comes to generating raw machine code, I can confidently say: develop in Eclipse and run the stuff as JVM code, but deploy using the machine-code backend provided by LLVM.

So my conclusion is:
To make LLVM attractive for us Java-based language designers, we need the means to write a JVM backend.
The actual backend is easy: libraries for class file and JAR generation exist.
I'd need help for:
* Determining where exactly the line is drawn between "this LLVM component is useful for JVM bytecode generation" and "this LLVM component isn't". (Constant folding would be, register allocation would not, but there are a lot of gray areas between these two.)
* Not being a JNI or C++ expert, building the JNI infrastructure that would allow calling LLVM components from Java.
* Not being a true Eclipse expert, wrapping LLVM binaries as Eclipse plugins. Eclipse expects plugins to be available for download via HTTP, with some XML that describes dependencies. Setting this up would be easy; getting the details right would be work.

That's just my specific skillset; other language designers might have different ones, but I guess it is not very likely that the exact right combination will come up easily. There aren't many people around who are experts in C++, Java, and Eclipse.
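
For what it's worth, the JAR half of "libraries exist" needs nothing beyond the JDK's `java.util.jar` package. A minimal sketch, assuming the backend hands over class-file bytes from some separate generator; `JarSketch`, `buildJar`, `firstEntryName`, and the entry name are made-up names for illustration, and the payload here is a placeholder, not a valid class file:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.jar.Attributes;
import java.util.jar.JarEntry;
import java.util.jar.JarInputStream;
import java.util.jar.JarOutputStream;
import java.util.jar.Manifest;

final class JarSketch {
    /** Packs a single entry into an in-memory JAR. entryName and payload are placeholders. */
    static byte[] buildJar(String entryName, byte[] payload) throws IOException {
        Manifest manifest = new Manifest();
        // Without Manifest-Version the manifest is written out empty, so set it explicitly.
        manifest.getMainAttributes().put(Attributes.Name.MANIFEST_VERSION, "1.0");
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (JarOutputStream jar = new JarOutputStream(buffer, manifest)) {
            jar.putNextEntry(new JarEntry(entryName));
            jar.write(payload);
            jar.closeEntry();
        }
        return buffer.toByteArray();
    }

    /** Reads the JAR back and returns the first entry name after the manifest. */
    static String firstEntryName(byte[] jarBytes) throws IOException {
        try (JarInputStream in = new JarInputStream(new ByteArrayInputStream(jarBytes))) {
            JarEntry entry = in.getNextJarEntry();
            return entry == null ? null : entry.getName();
        }
    }
}
```

A real backend would pass the bytes of each generated class under its `some/package/Name.class` path instead of the placeholder payload.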

Oh, and the question I'm having is: Is LLVM for me?

Regards,
Jo

I'd need help for:
* Determining where exactly the line is drawn between "this LLVM
component is useful for JVM bytecode generation" and "this LLVM
component isn't". (Constant folding would be, register allocation would
not, but there are a lot of gray areas between these two.)
* Not being a JNI or C++ expert, building the JNI infrastructure that
would allow calling LLVM components from Java.

I'm experimenting with this: I built LLVM and clang DLLs, then used JNAerator/BridJ
to make Java wrappers for them.

There are various issues: BridJ has an ANTLR-based parser that chokes a bit
on complicated headers like LLVM's, but it can be made to work. I've got a fairly
complete wrap of clang-c working, and am making progress with llvm-c.

Oh, and the question I'm having is: Is LLVM for me?

I don't know either, but it certainly seems like an interesting path to me!

Kevin Kelley

Hi Joachim,

I’d need help for:

  • Determining where exactly the line is drawn between “this LLVM
    component is useful for JVM bytecode generation” and “this LLVM
    component isn’t”. (Constant folding would be, register allocation would
    not, but there are a lot of gray areas between these two.)

I guess the existing line between opt and llc is similar to yours. You’re only interested in LLVM bitcode optimizations that take bitcode in and produce different, hopefully better optimized, bitcode. That’s what the “opt” binary does. Register allocation is target-specific and does not “produce” any bitcode.
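
To make that split concrete, here is a toy sketch in Java. It is not LLVM code: the `Expr` tree, `ToyPipeline`, and the stack mnemonics are all invented for illustration. `fold` plays the role of an opt-style pass (IR in, same IR out, with constants folded), while `emit` plays the role of a backend that lowers the IR to a different form, in this case a stack-machine listing like one a JVM target might want, with no register allocation involved:

```java
import java.util.ArrayList;
import java.util.List;

/** Toy expression IR (not LLVM IR), just enough to show an IR-to-IR rewrite. */
abstract class Expr {
    static final class Const extends Expr {
        final long value;
        Const(long value) { this.value = value; }
    }
    static final class Var extends Expr {
        final String name;
        Var(String name) { this.name = name; }
    }
    static final class Add extends Expr {
        final Expr left, right;
        Add(Expr left, Expr right) { this.left = left; this.right = right; }
    }
}

final class ToyPipeline {
    /** opt-like step: returns a tree in the same IR with constant sums folded. */
    static Expr fold(Expr e) {
        if (!(e instanceof Expr.Add)) return e;
        Expr.Add add = (Expr.Add) e;
        Expr l = fold(add.left);
        Expr r = fold(add.right);
        if (l instanceof Expr.Const && r instanceof Expr.Const) {
            return new Expr.Const(((Expr.Const) l).value + ((Expr.Const) r).value);
        }
        return new Expr.Add(l, r);
    }

    /** Backend-like step: lowers the IR to a stack-machine listing (made-up mnemonics). */
    static List<String> emit(Expr e) {
        List<String> out = new ArrayList<>();
        emitInto(e, out);
        return out;
    }

    private static void emitInto(Expr e, List<String> out) {
        if (e instanceof Expr.Const) {
            out.add("push " + ((Expr.Const) e).value);
        } else if (e instanceof Expr.Var) {
            out.add("load " + ((Expr.Var) e).name);
        } else {
            Expr.Add add = (Expr.Add) e;
            emitInto(add.left, out);   // operands first,
            emitInto(add.right, out);
            out.add("add");            // then the operator (postfix order)
        }
    }
}
```

For `x + (2 + 3)`, `fold` yields `x + 5` and `emit` then lists `load x`, `push 5`, `add`. Only the second step would need replacing to target something other than a stack machine.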