Saving/restoring executable code from the JIT?

Hi.

IIRC some time ago there was some discussion about saving the executable
code produced by the JIT to a file, for loading it in the next
session. This would require streaming out the executable code before
externals are resolved, and resolving them when the code is loaded.
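
To make the idea concrete, here is a minimal sketch of "save before externals are resolved, resolve at load time". It is a toy model, not LLVM API: `save_code`, `load_code`, the JSON container and the fixed 8-byte little-endian address slots are all assumptions made for illustration.

```python
import json
import struct

def save_code(code, relocs):
    """Serialize machine code together with symbolic relocations.

    `relocs` maps a byte offset in `code` to the NAME of the external
    whose address belongs there; no absolute address is written out.
    """
    return json.dumps({
        "code": code.hex(),
        "relocs": {str(off): name for off, name in relocs.items()},
    })

def load_code(blob, resolve):
    """Rebuild the code, patching every slot with this session's address.

    `resolve` is a hypothetical callback: symbol name -> integer address.
    """
    data = json.loads(blob)
    code = bytearray.fromhex(data["code"])
    for off, name in data["relocs"].items():
        # Assumption: every slot is a 64-bit little-endian absolute address.
        struct.pack_into("<Q", code, int(off), resolve(name))
    return bytes(code)
```

The saved blob stays valid across sessions only because it names externals instead of embedding their addresses, so the loader ends up doing a small part of a linker's job.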

AFAIK, LLVM does not support this at the moment.

How difficult would it be to implement this feature, in terms of
existing infrastructure etc.?

It's not clear how valuable the feature is if users start to do things
like inlining pointers to heap allocated objects with guards as
optimizations.

I could imagine an IR class representing a pointer to an object on the
heap that gets serialized to bitcode as some kind of relocatable
symbol, and then re-resolved at runtime. If the corresponding symbol
doesn't exist at runtime, the implementation could choose to drop the
particular function on the floor and recompile from source.
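
A rough sketch of that fallback policy, assuming each cached function records the names of the symbols it needs. The function name `load_or_recompile`, the shape of `cached` and the `recompile` callback are hypothetical, chosen only to illustrate the idea.

```python
def load_or_recompile(cached, live_symbols, recompile):
    """Reuse a cached function only if all of its symbols still resolve.

    cached:       function name -> (payload, list of needed symbol names)
    live_symbols: symbol name -> address in the current session
    recompile:    hypothetical callback that rebuilds a function from source
    """
    result = {}
    for fn, (payload, needed) in cached.items():
        if all(sym in live_symbols for sym in needed):
            # Every relocatable symbol re-resolved: keep the cached code.
            bindings = {sym: live_symbols[sym] for sym in needed}
            result[fn] = ("cached", payload, bindings)
        else:
            # At least one symbol is gone: drop the function and rebuild it.
            result[fn] = ("recompiled", recompile(fn), {})
    return result
```

The decision is made per function, so one stale heap reference costs a single recompile rather than invalidating the whole cache.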

Reid

That reminds me of network agents... One could send the executable
through the network to execute on remote machines (collecting local
statistics, for instance). Despite all the security and locality
concerns, it's a pretty cool idea.

cheers,
--renato

Reclaim your digital rights, eliminate DRM, learn more at
http://www.defectivebydesign.org/what_is_drm

> It's not clear how valuable the feature is if users start to do things
> like inlining pointers to heap allocated objects with guards as
> optimizations.

Or what about register variables, or objects held half in registers and
half in memory, or optimizations that exist on one machine (presence of
an FPU, MMU, etc.) and not on others, byte order, word size, and many
other problems... All of this is the job of the serialization engine.
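
One way to handle part of this is to store a host fingerprint next to the saved code and refuse to load on a mismatch. The sketch below is only an illustration: `host_fingerprint` and `is_compatible` are made-up names, and the CPU feature list must be supplied by the caller, since the Python standard library has no portable way to query it.

```python
import platform
import struct
import sys

def host_fingerprint(cpu_features=()):
    """Describe the properties of this host that saved native code depends on."""
    return {
        "arch": platform.machine(),
        "byteorder": sys.byteorder,               # "little" or "big"
        "pointer_bits": struct.calcsize("P") * 8,  # word size of this process
        "features": sorted(cpu_features),          # caller-supplied assumption
    }

def is_compatible(saved, current):
    """Saved code is usable only on the same arch, byte order and word size,
    and only if every CPU feature it was compiled for is still present."""
    for key in ("arch", "byteorder", "pointer_bits"):
        if saved[key] != current[key]:
            return False
    return set(saved["features"]) <= set(current["features"])
```

Features are compared as a subset: code built for fewer features can run on a richer machine, but not the other way round.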

I suppose the JIT VM already has many solutions for cross-platform
compatibility, but there will still be some context lost from one
place/time to another.

> I could imagine an IR class representing a pointer to an object on the
> heap that gets serialized to bitcode as some kind of relocatable
> symbol, and then re-resolved at runtime. If the corresponding symbol
> doesn't exist at runtime, the implementation could choose to drop the
> particular function on the floor and recompile from source.

You could lose the program's state with that, and the state is, I
think, the whole point. Maybe have a class that saves all values marked
with specific metadata (all the rest could potentially be reinitialized)
under a globally accessible symbol, to be re-read (and deleted) at
restart.

It's the responsibility of the programmer to say what is to be saved and
what can be safely reinitialized. The IRBuilder could annotate values
during allocation, so the JIT would know which ones to save on suspend
and keep track of where they are.
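
A toy model of that annotation scheme, assuming a `persist` flag set at allocation time. `ValueTable` and its methods are hypothetical names and not part of IRBuilder or any LLVM API.

```python
import json

class ValueTable:
    """Tracks values and remembers which ones were marked to survive a restart."""

    def __init__(self):
        self._values = {}
        self._persistent = set()

    def allocate(self, name, value, persist=False):
        # The programmer decides at allocation time what must be saved.
        self._values[name] = value
        if persist:
            self._persistent.add(name)

    def suspend(self):
        # Only the marked values are written out; the rest is dropped.
        return json.dumps({n: self._values[n] for n in sorted(self._persistent)})

    def resume(self, snapshot, defaults):
        # Unmarked values come from `defaults`; marked ones are restored.
        saved = json.loads(snapshot)
        self._values = dict(defaults)
        self._values.update(saved)
        self._persistent = set(saved)

    def get(self, name):
        return self._values[name]
```

For example, a counter allocated with `persist=True` comes back with its old value after `resume`, while a scratch buffer allocated without the flag is reinitialized from `defaults`.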

cheers,
--renato


Renato Golin <rengolin@systemcall.org> writes:

> Or what about register variables, or objects held half in registers and
> half in memory, or optimizations that exist on one machine (presence of
> an FPU, MMU, etc.) and not on others, byte order, word size, and many
> other problems... All of this is the job of the serialization engine.

[snip]

My goal is to avoid waiting several minutes on application startup until
optimization and native codegen end. No suspending/restarting, just
compiling, saving the generated native code, and loading it in future
sessions, more akin to a conventional compiler.
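
In outline, that is a persistent compile cache. A minimal sketch, assuming the cache key is a hash of the source plus a compiler version string; `cache_path`, `get_native_code` and the `compile_fn` callback are hypothetical names, and external symbol resolution is ignored here.

```python
import hashlib
import os

def cache_path(cache_dir, source, compiler_version):
    """Derive the cache file name from the source bytes and the compiler version."""
    key = hashlib.sha256(compiler_version.encode() + b"\0" + source).hexdigest()
    return os.path.join(cache_dir, key + ".bin")

def get_native_code(cache_dir, source, compiler_version, compile_fn):
    """Return (code, was_cached); the slow compile runs only on a cache miss."""
    path = cache_path(cache_dir, source, compiler_version)
    if os.path.exists(path):
        with open(path, "rb") as f:
            return f.read(), True
    code = compile_fn(source)  # the expensive optimization + codegen step
    with open(path, "wb") as f:
        f.write(code)
    return code, False
```

Including the compiler version in the key means an upgraded code generator never picks up stale output from an older one.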

Generating a DLL is not an option because:

1. It will require distributing a linker.

2. It will require the presence of an import library satisfying the
external symbols referenced by the DLL, and some of those symbols are
unknown until the application starts (this may not apply to Unixes, but
it does to Windows). Essentially, LLVM would have to do the linker's
work when the previously saved native code is loaded back by the
application in the next session.

3. Some other reasons that I have forgotten since I studied the idea in
the past.

> My goal is to avoid waiting several minutes on application startup until
> optimization and native codegen end. No suspending/restarting, just
> compiling, saving the generated native code, and loading it in future
> sessions, more akin to a conventional compiler.

Saving IR shouldn't be a problem. I guess there is already a way of
doing this, and of having the JIT read it back again.

> 2. It will require the presence of an import library satisfying the
> external symbols referenced by the DLL, and some of those symbols are
> unknown until the application starts (this may not apply to Unixes, but
> it does to Windows). Essentially, LLVM would have to do the linker's
> work when the previously saved native code is loaded back by the
> application in the next session.

That makes no sense when you're just running on the VM. More hassle than
savings. The JVM is a good example of how things go bad when you're
linking native code.

cheers,
--renato
