Garbage collection implementation

I am using LLVM as the last stage of a compiler in order to easily produce a
binary in native code. My compiler is implemented in Ocaml and has various
layers of languages. In the last layer prior to LLVM, I have a value which has
been converted to CPS, closure and hoisting (of functions).

I am now trying to write a garbage collector for this language. The shadow
stack is not suitable for me, because essentially there is no stack in my case!
All the function calls are tail calls and without tail recursive optimization
and stack re-use, the binary is useless.

My goal for now is to write a simple stop-the-world, semispace collector. To
cut a long story sort, I need to implement an allocation function like this:

if ( there is enough space ) {
  allocate space;
} else {
  dump all live physical registers on the stack;
  garbage collect;
  restore from stack, using the new locations;
}

The tricky part is the "dump all live registers". I spent quite some time
reading the documentation and have come up to two alternatives. Either
implement a new pass which reads the live variable analysis and spills all
registers to the stack at this point, or implement a new intrinsic which spills
all registers to the stack (after gc, I have to reload registers manually).

Neither of these ideas appears easy, and I am wondering if there is a simpler
way around this. To my understanding, there is neither a register map or any
sort of runtime-contract regarding the registers. Also, I need to work my way
among the LLVM optimizations which move values around registers (for instance
the fastcc convention which is essential for tail recursive optimization).

I am using LLVM as the last stage of a compiler in order to easily produce
a binary in native code. My compiler is implemented in Ocaml and has
various layers of languages. In the last layer prior to LLVM, I have a
value which has been converted to CPS, closure and hoisting (of functions).

I am now trying to write a garbage collector for this language. The shadow
stack is not suitable for me, because essentially there is no stack in my
case!

I don't follow. Shadow stack is overkill because you only ever have one stack
frame but it should work perfectly.

All the function calls are tail calls and without tail recursive
optimization and stack re-use, the binary is useless.

My goal for now is to write a simple stop-the-world, semispace collector.
To cut a long story sort, I need to implement an allocation function like
this:

if ( there is enough space ) {
  allocate space;
} else {
  dump all live physical registers on the stack;
  garbage collect;
  restore from stack, using the new locations;
}

The tricky part is the "dump all live registers".

The solution I used in HLVM is essentially to always keep all
registers "dumped". Specifically, they are dumped as soon as they go live and
reclaimed at the end of each function. That is grossly inefficient in theory
but it is easy and, in fact, performance can still be very good indeed (e.g.
several times faster than OCaml).

I spent quite some time
reading the documentation and have come up to two alternatives. Either
implement a new pass which reads the live variable analysis and spills all
registers to the stack at this point, or implement a new intrinsic which
spills all registers to the stack (after gc, I have to reload registers
manually).

Neither of these ideas appears easy, and I am wondering if there is a
simpler way around this.

The simplest way is surely to reuse HLVM because it provides everything you
need and is even written in the right language! :wink:

Hi Angelos,

I think in theory you should be able to use LLVM's shadow stack support with your runtime, but you'll need to save and restore the shadow stack's head pointer along with the stack register. Did you find that using the shadow stack inhibits TCO? If so, that might be fixable.

— Gordon

Angelos Manousaridis wrote:

I am using LLVM as the last stage of a compiler in order to easily produce a
binary in native code. My compiler is implemented in Ocaml and has various
layers of languages. In the last layer prior to LLVM, I have a value which has
been converted to CPS, closure and hoisting (of functions).

I am now trying to write a garbage collector for this language. The shadow
stack is not suitable for me, because essentially there is no stack in my case!
All the function calls are tail calls and without tail recursive optimization
and stack re-use, the binary is useless.

My goal for now is to write a simple stop-the-world, semispace collector. To
cut a long story sort, I need to implement an allocation function like this:

if ( there is enough space ) {
  allocate space;
} else {
  dump all live physical registers on the stack;
  garbage collect;
  restore from stack, using the new locations;
}

The tricky part is the "dump all live registers". I spent quite some time
reading the documentation and have come up to two alternatives. Either
implement a new pass which reads the live variable analysis and spills all
registers to the stack at this point, or implement a new intrinsic which spills
all registers to the stack (after gc, I have to reload registers manually).
  

The latter should be relatively easy to implement provided that you are
comfortable with the assembly language for your target architecture.
LLVM IR supports inline assembly. You can write an LLVM transform that
inserts one inline assembly instruction that saves all the general
purpose registers and another that restores all the general purpose
registers.

In this way, you don't have to write a code generator pass (which might
offer better optimization but may be more work).

If you need example code for register save/restore on i386, please let
me know. I wrote some code for the context switching instructions for
the Secure Virtual Architecture (SVA) project which must save and
restore registers; I can ask my advisor if we can provide those to you.

-- John T.