Hello! I'm interested in using LLVM as a target for a compiler I've
written in Common Lisp (SBCL). While I looked at perhaps wrapping the
LLVM C++ interface, wrapping C++ in, well, anything not C++ is a pain.
Someone on IRC mentioned that they didn't think I'd miss out on any
functionality by directly emitting IR, but suggested I query the list.
Do I miss out on any optimizations or other neat trickery by using the
IR directly? Does anyone know of any other platforms which directly
target LLVM via emitting IR?
Thanks,
...Eric Jonas
Hi Eric,
There are 3 major ways to tackle generating LLVM IR from a front-end:
• Embed the LLVM C++ code.
for: best tracks changes to the LLVM IR, .ll syntax, and .bc format
for: enables running LLVM optimization passes without an emit/parse cycle
for: adapts well to a JIT context
against: lots of ugly glue code to write
• Emit LLVM assembly from your compiler’s native language.
for: very straightforward to get started
against: the .ll parser is slower than the bitcode reader when interfacing to the middle end
against: you’ll have to re-engineer the LLVM IR object model and asm writer in your language
against: it may be harder to track changes to the IR
• Emit LLVM bitcode from your compiler’s native language.
for: can use the more-efficient bitcode reader when interfacing to the middle end
against: you’ll have to re-engineer the LLVM IR object model and bitcode writer in your language
against: it may be harder to track changes to the IR
If you go with the first option, the C bindings in include/llvm-c should help a lot, since most languages have C FFIs. The C interface was designed to require very little manual memory management, and so is fairly straightforward to talk to with most FFIs.
— Gordon
Hi,
> • Emit LLVM assembly from your compiler’s native language.
> for: very straightforward to get started
> against: the .ll parser is slower than the bitcode reader when interfacing to the middle end
> against: you’ll have to re-engineer the LLVM IR object model and asm writer in your language
> against: it may be harder to track changes to the IR
One more problem with this: in order to emit float constants you have to convert them to hexadecimal notation, which I found nontrivial. My pet language is still lacking proper float support because of this. I don’t know whether there are more pitfalls like this lurking around. (Time to finally move to the OCaml bindings… :))
HTH,
Jan
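The hexadecimal notation Jan mentions can be produced without much machinery. In the .ll syntax a floating-point constant may be written as 0x followed by the 16 hex digits of its IEEE 754 double bit pattern; a constant of type float uses the same double form but must hold a value that is exactly representable in single precision. The following is a minimal Python sketch under those assumptions. The helper names llvm_double_hex and llvm_float_hex are made up for illustration and are not part of LLVM.

```python
import struct


def llvm_double_hex(value: float) -> str:
    """Format a value as the 0x... literal for an LLVM `double` constant."""
    # Reinterpret the 8 bytes of the IEEE 754 double as an unsigned integer.
    bits = struct.unpack("<Q", struct.pack("<d", value))[0]
    return f"0x{bits:016X}"


def llvm_float_hex(value: float) -> str:
    """Format a value as the 0x... literal for an LLVM `float` constant.

    The value is first rounded to single precision, then widened back to
    double, so the literal is exactly representable as a float.
    struct.pack raises OverflowError if the value does not fit in a float.
    """
    single = struct.unpack("<f", struct.pack("<f", value))[0]
    return llvm_double_hex(single)


if __name__ == "__main__":
    print("double", llvm_double_hex(1.0))
    print("float ", llvm_float_hex(0.1))
```

A front end written in another language would need the equivalent bit-level reinterpretation, which most runtimes expose in some form.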
Jan Rehders wrote:
> Hi,
>> • Emit LLVM assembly from your compiler's native language.
>> for: very straightforward to get started
>> against: the .ll parser is slower than the bitcode reader when
>> interfacing to the middle end
>> against: you'll have to re-engineer the LLVM IR object model and asm
>> writer in your language
>> against: it may be harder to track changes to the IR
> One more problem with this: in order to emit float constants you have
> to convert them to hexadecimal notation which I found nontrivial. My
> pet language is still lacking proper float support because of this.
> I'm not aware whether there are more pitfalls like this lurking
> around. (Time to finally move to the OCaml bindings.. :))
APFloat can emit the hex representation of any float, to the desired
precision and with the desired rounding method.
Neil.
> • Emit LLVM assembly from your compiler's native language.
> for: very straightforward to get started
> against: the .ll parser is slower than the bitcode reader when
> interfacing to the middle end
> against: you'll have to re-engineer the LLVM IR object model
> and asm writer in your language
> against: it may be harder to track changes to the IR
I think I'm going to try this second one, and produce a nice set of
Lisp-like bindings for emitting the IR. Are there any optimizations or
other LLVM features that I'll miss out on by going this route?
...Eric
Nope, the assembly format is a first-class representation.
— Gordon
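To make the text-emission route discussed in this thread concrete, here is a rough Python sketch of what "emitting LLVM assembly from your compiler's native language" can look like at its simplest: building the .ll text for one function as a string. The function name emit_binary_function and its parameters are hypothetical, chosen only for this illustration; a real front end would need its own model of types, values, and basic blocks.

```python
def emit_binary_function(name: str, opcode: str, ty: str = "i32") -> str:
    """Return LLVM assembly for a function that applies one binary opcode
    (for example "add" or "mul") to its two arguments and returns the result."""
    lines = [
        f"define {ty} @{name}({ty} %a, {ty} %b) {{",
        "entry:",
        f"  %result = {opcode} {ty} %a, %b",
        f"  ret {ty} %result",
        "}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Print the assembly; it could equally be written to a .ll file.
    print(emit_binary_function("add_ints", "add"), end="")
```

The resulting text can be saved to a .ll file and handed to the standard LLVM tools such as llvm-as or opt, which is where the parse step mentioned earlier in the thread comes in.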