LLVM's GC support

Hi Paul,

I hope you don’t mind me sending this mail directly to you (instead of to llvm-dev), but you seem to be the expert on LLVM’s GC support :) If you’d rather have me send to llvm-dev, please say so.

You’ll reach a wider audience with the list, though I haven’t been able to keep up with it lately.

I’m trying to get a very simple copying collector to work with LLVM, basically your standard semi-space collector with Cheney scan, using the included shadow stack support to track roots. I’m writing everything in C, using llvm-gcc’s gcroot attribute in a test app to mark GC roots where appropriate, which seems to work fine when translating to LLVM IR (and finally an executable).

Be forewarned that this attribute should be considered ‘experimental’ at best.

Implementing a GC is actually quite new to me, so perhaps that’s where I’m getting confused. Anyways, suppose I have a function that has a pointer to a GC-tracked object as a parameter and a collection can be triggered from this function, e.g.

object*
f(object *some_obj)
{
   object __attribute__((gcroot)) *new_obj;
   new_obj = allocate_object();
   // A collection could have been executed at this point,
   // and so some_obj might have become invalid
   new_obj->child = some_obj;
   return new_obj;
}

To ensure that *some_obj isn’t collected prematurely, your code will need to copy the value of some_obj into a gcroot local at the top of the function body.
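
A minimal sketch of that fix, in C. Everything beyond the original f is an assumption made for illustration: the GCROOT macro and the USE_LLVM_GCROOT flag are hypothetical (they only let the snippet compile with compilers that lack llvm-gcc's attribute), the object layout is reduced to a single child field, and allocate_object is a stand-in that never actually collects.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical switch: expand to llvm-gcc's attribute only when asked to,
   so the sketch still compiles with other compilers. */
#ifdef USE_LLVM_GCROOT
#define GCROOT __attribute__((gcroot))
#else
#define GCROOT
#endif

/* Assumed object layout, reduced to one field. */
typedef struct object {
    struct object *child;
} object;

/* Stand-in allocator. A real one may run a collection before returning. */
static object *allocate_object(void)
{
    return calloc(1, sizeof(object));
}

object *f(object *some_obj)
{
    /* Copy the parameter into a root before anything can trigger a collection. */
    object GCROOT *rooted_obj = some_obj;
    object GCROOT *new_obj;

    new_obj = allocate_object();
    assert(new_obj != NULL);

    /* Read through the root, not the parameter: with a moving collector the
       root is the copy that gets updated. */
    new_obj->child = rooted_obj;
    return new_obj;
}
```

After the allocation, the function only ever reads rooted_obj; the raw some_obj parameter is treated as dead from that point on.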

You also need to beware of temporaries:

    object *f();
    object *g();
    object *h(object *x, object *y);

    void func() {
      // dangerous: the results of f() and g() are unrooted temporaries
      object __attribute__((gcroot)) *result1 = h(f(), g());

      // safer: root each intermediate value before the next call
      object __attribute__((gcroot)) *temp1 = f();
      object __attribute__((gcroot)) *temp2 = g();

      object __attribute__((gcroot)) *result2 = h(temp1, temp2);
    }

How can I ensure that the pointer parameter (or any other root) is updated during the collection phase? It seems that LLVM’s shadow stack data structures only provide the root objects pointed to and not the memory locations of the pointer variables holding the roots.

The shadow stack structure is itself the storage location for the variables; the LLVM IR is rearranged to read the stack roots directly from the shadow stack.
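
To illustrate why that answers the question: a collector walks the chain of shadow-stack frames and is handed the address of each root slot, so it can overwrite the slot after moving the object. The sketch below is modeled on the shadow-stack structures described in LLVM's GC documentation, but treat the exact layout as an assumption and check it against your LLVM version. The names visit_roots, struct Move, and relocate_root are hypothetical, and the chain head is passed in as a parameter here; in the real runtime it is the global llvm_gc_root_chain.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Static description of one function's frame (layout assumed, see above). */
struct FrameMap {
    int32_t NumRoots;    /* number of root slots in the frame */
    int32_t NumMeta;     /* how many of those slots carry metadata */
    const void *Meta[];  /* metadata for the first NumMeta slots */
};

/* One live frame on the shadow stack. */
struct StackEntry {
    struct StackEntry *Next;     /* the caller's entry */
    const struct FrameMap *Map;  /* describes this frame */
    void *Roots[];               /* the root variables themselves */
};

/* Call visit with the ADDRESS of every root slot on the chain. */
static void visit_roots(struct StackEntry *chain,
                        void (*visit)(void **slot, void *ctx),
                        void *ctx)
{
    for (struct StackEntry *e = chain; e != NULL; e = e->Next) {
        assert(e->Map != NULL);
        for (int32_t i = 0; i < e->Map->NumRoots; ++i)
            visit(&e->Roots[i], ctx);
    }
}

/* Hypothetical record of a single move: the object at from now lives at to. */
struct Move {
    void *from;
    void *to;
};

/* Visitor that rewrites any slot still pointing at the old location. */
static void relocate_root(void **slot, void *ctx)
{
    const struct Move *m = ctx;
    if (*slot == m->from)
        *slot = m->to;
}
```

A real copying collector would replace relocate_root with its own forwarding step (copy the object on first visit, then store the forwarding address into the slot), but the slot update itself works the same way.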

— Gordon

Hi Gordon,

Gordon Henriksen wrote:

Be forewarned that this attribute should be considered ‘experimental’
at best.

Ok

The shadow stack structure is itself the storage location for the
variables; the LLVM IR is rearranged to read the stack roots directly
from the shadow stack.

Ah, clever trick. Thanks for all the comments above, I now seem to have
my stuff working!

Paul