appropriate for run-time compilation of DSL?

I wish to know whether LLVM is an appropriate choice for a project I'm
working on (excuse me if this is the wrong list for this question).

I have a domain-specific-language (DSL) that is currently compiled to a
custom byte-code and run in a VM. This is written in C++. The DSL can
call high-level functions which are actually callbacks into the current
non-DSL environment. That is, these functions /leave/ the VM and work in
the normal code and return results back into the VM loop.

For performance reasons I would now like to compile this code to native
machine code. What I need is to be able to produce a binary block of
data which can be distributed to other machines and run directly on
them. That is, compilation would be on a central machine and worker
machines would receive the code and execute it. From the docs and
tutorial I'm unsure how I would go about doing this step.

Can somebody confirm that LLVM can be used for what I want and point me
somewhat in the right direction?

> I wish to know whether LLVM is an appropriate choice for a project I'm
> working on (excuse me if this is the wrong list for this question).
>
> I have a domain-specific-language (DSL) that is currently compiled to a
> custom byte-code and run in a VM. This is written in C++. The DSL can
> call high-level functions which are actually callbacks into the current
> non-DSL environment. That is, these functions /leave/ the VM and work in
> the normal code and return results back into the VM loop.

That should work perfectly fine. The LLVM JIT has support for
declaring and calling out to native functions in the application
through libffi.

One catch is that if you want to call into C++ you will need to deal
with a bit of the C++ ABI for any platforms you use. For example, at
the LLVM IR level calls will use names mangled according to the ABI
rules. You should be able to use clang as a library to help here. If
you have straight C wrappers for your runtime functions, this
complexity goes away.
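
As a minimal sketch of the C-wrapper approach, assuming a made-up
runtime class (Runtime, lookup and dsl_runtime_lookup are hypothetical
names for illustration, not anything from the poster's project or from
LLVM): the wrapper has C linkage, so its symbol name is not mangled and
the generated IR can declare and call it by that plain name.

```cpp
#include <cstdint>

// Hypothetical C++ runtime object that the DSL calls back into.
class Runtime {
public:
    std::int64_t lookup(std::int64_t key) const { return key * 2; }
};

// extern "C" turns off C++ name mangling for this function, so the
// symbol is simply "dsl_runtime_lookup". The runtime object is passed
// as an opaque pointer because C linkage has no notion of classes.
extern "C" std::int64_t dsl_runtime_lookup(void *runtime, std::int64_t key) {
    return static_cast<Runtime *>(runtime)->lookup(key);
}
```

The generated code then only needs to know one flat function signature
per callback, at the cost of writing one thin wrapper for each.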

> For performance reasons I would now like to compile this code to native
> machine code. What I need is to be able to produce a binary block of
> data which can be distributed to other machines and run directly on
> them. That is, compilation would be on a central machine and worker
> machines would receive the code and execute it. From the docs and
> tutorial I'm unsure how I would go about doing this step.

There isn't great support for this in the LLVM JIT, but IIRC some
clients have managed this by subclassing the JITMemoryManager to copy
the code when the function has finished being emitted. I don't know
if it's possible to ensure that the code will be position independent.

Reid

> That should work perfectly fine. The LLVM JIT has support for
> declaring and calling out to native functions in the application
> through libffi.

How about support for a custom memory model? I know this sounds odd, but
basically the variables need to map to a specific block of memory: the
global heap. The enclosing program uses memcpy to push/pull values
in/out of that memory.

Would this be easy to do? Keep in mind I already generate byte-code, so
if needed I could simply emit offsets into the IR for all the places I
think it should be using.
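
To make the layout concrete, here is a rough sketch of the idea. The
offsets, the helper names and the dsl_add function are all invented for
illustration; dsl_add merely stands in for what a compiled DSL function
would do. Each variable is a fixed byte offset into one heap block, the
host pushes and pulls values with memcpy, and the compiled function
receives the heap base pointer. In LLVM IR this would presumably come
out as a getelementptr from the base pointer followed by a load or store.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical layout: each DSL variable lives at a fixed byte offset.
constexpr std::size_t kOffsetA = 0;
constexpr std::size_t kOffsetB = 8;
constexpr std::size_t kOffsetSum = 16;

// Host side: copy a value into the heap block at the given offset.
inline void push_i64(unsigned char *heap, std::size_t off, std::int64_t v) {
    std::memcpy(heap + off, &v, sizeof v);
}

// Host side: copy a value out of the heap block at the given offset.
inline std::int64_t pull_i64(const unsigned char *heap, std::size_t off) {
    std::int64_t v;
    std::memcpy(&v, heap + off, sizeof v);
    return v;
}

// Stand-in for a compiled DSL function: it only sees the heap base
// pointer and reads and writes its variables at known offsets.
extern "C" void dsl_add(unsigned char *heap) {
    push_i64(heap, kOffsetSum,
             pull_i64(heap, kOffsetA) + pull_i64(heap, kOffsetB));
}
```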

> with a bit of the C++ ABI for any platforms you use. For example, at
> the LLVM IR level calls will use names mangled according to the ABI

I can use C-Wrappers if it is easier.

> There isn't great support for this in the LLVM JIT, but IIRC some
> clients have managed this by subclassing the JITMemoryManager to copy
> the code when the function has finished being emitted. I don't know
> if it's possible to ensure that the code will be position independent.

I'll start my investigation here then, as this is actually quite
critical to us -- we can't afford the final JIT/compilation overhead on
our target machines. Note that I do presume that LLVM is extremely fast,
but our requirements sit at the microsecond level, and we will have
thousands of these things to compile in a day.

Thank you for your assistance.

Then LLVM might not be the right tool for the job. It's not a very
fast JIT compiler, although this is an area we'd like to improve. To
see whether it's fast enough, you could modify the Kaleidoscope example
to time the compilation of several functions.
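
A timing harness for that experiment could look roughly like the sketch
below. It is generic C++ and not part of the Kaleidoscope code;
average_compile_micros and the compile_one callback are hypothetical
names, and compile_one is a placeholder for whatever builds and JITs one
function in your modified example.

```cpp
#include <chrono>
#include <functional>

// Returns the average wall-clock time, in microseconds, of one call to
// compile_one, measured over `count` calls. Assumes count > 0.
double average_compile_micros(const std::function<void(int)> &compile_one,
                              int count) {
    using clock = std::chrono::steady_clock;
    const auto start = clock::now();
    for (int i = 0; i < count; ++i) {
        compile_one(i);  // placeholder: compile the i-th test function
    }
    const std::chrono::duration<double, std::micro> elapsed =
        clock::now() - start;
    return elapsed.count() / count;
}
```

Averaging over many functions matters here because a single
microsecond-scale measurement is dominated by timer and cache noise.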

Reid

Is it fairly easy to compile LLVM itself to bytecode (perhaps with the
gold plugin)? If so, there might be a combo of whole-program
optimizations that might help JIT time.