Deleting function IR after codegen

YES. My use of LLVM involves an app that JITs program after program and will quickly swamp memory if everything is retained. It is crucial to aggressively throw everything away but the functions we still need to execute.

I've been faking it with the old JIT (LLVM 3.4/3.5) by using a custom subclass of JITMemoryManager that squirrels away the jitted binary code, so that when I free the Modules, ExecutionEngine, and MemoryManager, the real memory that was allocated for the binary code is hidden and does not get deallocated.
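To make the idea concrete, here is a toy model of that trick in plain C++. It does not use any LLVM API: `CodeBlock`, `CodeCache`, `RetainingMemoryManager`, and `jitOnce` are hypothetical names invented for this sketch, and the "code" is just a byte buffer. The only point it illustrates is the ownership split: the per-session manager hands every section to a long-lived cache, so tearing down the session does not free the bytes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <utility>
#include <vector>

// Toy stand-in for one section of jitted machine code (hypothetical type).
struct CodeBlock {
    std::vector<std::uint8_t> bytes;
};

// Long-lived owner of code blocks; it outlives every JIT session.
class CodeCache {
public:
    // Takes ownership of a block and returns the address of its bytes.
    // The address stays valid until the cache is destroyed, because each
    // block lives in its own heap allocation.
    std::uint8_t* adopt(std::unique_ptr<CodeBlock> block) {
        assert(block && "adopt() needs a non-null block");
        std::uint8_t* addr = block->bytes.data();
        blocks_.push_back(std::move(block));
        return addr;
    }
    std::size_t blockCount() const { return blocks_.size(); }

private:
    std::vector<std::unique_ptr<CodeBlock>> blocks_;
};

// Per-session manager (a toy, not an LLVM memory manager). It never owns
// the sections it allocates; they go straight into the cache.
class RetainingMemoryManager {
public:
    explicit RetainingMemoryManager(CodeCache& cache) : cache_(cache) {}

    std::uint8_t* allocateCodeSection(std::size_t size) {
        auto block = std::make_unique<CodeBlock>();
        block->bytes.resize(size);
        return cache_.adopt(std::move(block));
    }

private:
    CodeCache& cache_;
};

// Simulates one JIT session: "emit" some bytes, then tear the session down.
const std::uint8_t* jitOnce(CodeCache& cache, std::uint8_t fill, std::size_t size) {
    RetainingMemoryManager mm(cache);  // would be freed along with the engine
    std::uint8_t* code = mm.allocateCodeSection(size);
    for (std::size_t i = 0; i < size; ++i) code[i] = fill;  // pretend codegen
    return code;  // mm is destroyed here; the bytes stay alive in the cache
}
```

Calling `jitOnce` repeatedly with the same `CodeCache` leaves one block per call in the cache, and each returned pointer remains readable after its session's manager is gone. A real version would also have to deal with executable page permissions and with deciding when a cached block can finally be released, which this sketch ignores.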

Currently I'm struggling with bringing my code base up to MCJIT without losing this ability, because the memory consumption is killing me.

Sometimes I think that clang as the canonical user of LLVM does not reflect the diversity of JIT-oriented LLVM use cases. An "offline" compiler like clang gets to exit after compiling a module, but other apps using LLVM may JIT module after module after module indefinitely.

For that kind of use case, it would be great to have as a first-class feature the ability to free the IR of a compiled module, and even better, to throw away the Module and EE entirely but keep the ability to call into the JITed binary function. Many apps would benefit from a stable API for doing this.

  -- lg

FWIW, LLILC (https://github.com/dotnet/llilc) uses MCJIT with a custom memory manager to hold onto the binary bits and discard the rest.

As far as I know it doesn't leak, though we don't blow away the context, so that grows a bit over time.

From: llvm-dev [mailto:llvm-dev-bounces@lists.llvm.org]
On Behalf Of Larry Gritz via llvm-dev
Subject: Re: [llvm-dev] Deleting function IR after codegen

I've been faking it with old JIT (llvm 3.4/3.5) by using a custom subclass of
JITMemoryManager that squirrels away the jitted binary code so that when I free
the Modules, ExecutionEngine, and MemoryManager, the real memory that got allocated
for the binary code is hidden and does not get deallocated.

Currently I'm struggling with bringing my code base up to MCJIT without losing this
ability, because the memory consumption is killing me.

For that kind of use case, it would be great to have as a first-class feature
the ability to free the IR of a compiled module, and even better, to throw away
the Module and EE entirely but keep the ability to call into the JITed binary
function.

This is precisely what we do for our environment. Updating for MCJIT took a bit of work, but not too much, using SectionMemoryManager as the base class.

When OrcJIT arrived, we realized we didn't actually need the ExecutionEngine or an LLVM-provided JIT at all. This simplified our code and removed some undesirable dependencies. What we have now essentially just implements the logic of MCJIT's finalizeObject(), finalizeLoadedModules(), emitObject(), and generateCodeForModule() methods coupled with the aforementioned extension of SectionMemoryManager (effectively our own ExecutionEngine).

- Chuck

Thanks for the pointer. It's always helpful to be able to see how another project solved similar problems.

CCed Lang, although I think Andy already answered all the questions you had.

@Lang, would it be possible for one of the kaleidoscope tutorials to demonstrate how to hook up freeing the Module/Context but still run any binary bits? Or are some of them already doing so?

Cheers,
Pete

Hi Pete, Larry,

@Lang, would it be possible for one of the kaleidoscope tutorials to demonstrate how to hook up freeing the Module/Context but still run any binary bits?

The new ORC-based Kaleidoscope tutorials are actually already doing this. You get this behavior more-or-less for free if you use the ORC JIT APIs, because of an interplay between two features:

(1) All the ORC components try to destroy their module and object-file pointers as soon as they can, and

(2) The module and object pointer types are template types.

So, if you pass your Modules in as 'unique_ptr's, the JIT will delete them as soon as it's done with them, freeing the memory in the process.
The executable code for the JIT is owned by the user’s memory manager, which is managed using a similar scheme, except that the memory manager pointer lives until the JIT is torn down or the user explicitly asks the JIT to discard it.
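The interplay between (1) and (2) can be shown with a small toy in plain C++. This is not the ORC API: `ToyModule`, `ToyMemoryManager`, and `ToyCompileLayer` are hypothetical names made up for the sketch, and "compiling" just produces a string. It only models the ownership behavior described above: the layer drops its module pointer as soon as it has compiled, so whether the module is freed depends on the pointer type the layer was instantiated with, while the output stays with the memory manager.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for an IR module; counts live instances (hypothetical type).
struct ToyModule {
    static int live;
    std::string name;
    explicit ToyModule(std::string n) : name(std::move(n)) { ++live; }
    ~ToyModule() { --live; }
    ToyModule(const ToyModule&) = delete;
    ToyModule& operator=(const ToyModule&) = delete;
};
int ToyModule::live = 0;

// Toy memory manager that owns the "compiled" output.
struct ToyMemoryManager {
    std::vector<std::string> code;
};

// Toy compile layer, templated on the module pointer type.
template <typename ModulePtrT>
class ToyCompileLayer {
public:
    explicit ToyCompileLayer(std::shared_ptr<ToyMemoryManager> mm)
        : mm_(std::move(mm)) {}

    // Takes the pointer by value, "compiles" the module, and keeps nothing
    // but the output. Returns a handle for later lookup.
    std::size_t addModule(ModulePtrT m) {
        assert(m && "addModule() needs a non-null module");
        mm_->code.push_back("code for " + m->name);
        return mm_->code.size() - 1;
    }  // m goes out of scope here: a unique_ptr frees the module,
       // a raw pointer leaves it with the caller.

    const std::string& lookup(std::size_t handle) const {
        return mm_->code.at(handle);
    }

private:
    // Output lives as long as someone still holds the memory manager.
    std::shared_ptr<ToyMemoryManager> mm_;
};
```

With `ToyCompileLayer<std::unique_ptr<ToyModule>>`, the module is gone by the time `addModule` returns, yet `lookup` still finds its output. With `ToyCompileLayer<ToyModule*>`, the caller keeps the module and is responsible for freeing it.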

Cheers,
Lang.

Outstanding. Maybe, then, it’s best for me to skip straight to a fully modern LLVM that has ORC JIT, rather than trying to step through and support each LLVM release incrementally, which would require me (for 3.6 at least) to figure it out for straight MCJIT.

I appreciate all the pointers here. This is immensely helpful.