reload of pointers after GC

Hi

I'm using a GC that's pretty similar to the OCaml one. It records
stack locations using llvm.gcroot, and dumps out a frametable
describing the live stack offsets so that the GC runtime can walk them
as required. I'm on 2.3, not svn head.

I'm having some trouble with pointers being cached in registers across
function calls (at least x86 backend, haven't tried others yet). The
steps are:

1. Allocate an array (A) in the GC heap
2. Do operations on A
3. Allocate another object (B) in the GC heap
4. Do some more operations on A (actually, pass to the constructor of B)

In this case, the pointer is stored back to the stack before the call
to the second allocation. During the second allocation, memory is
exhausted and the collector must relocate some objects. It moves stuff
around, including A, and fixes up the stack slot that points to A. So, up to
here, all good. But, upon returning, the pointer value isn't reloaded
from the stack, and so points to (now) garbage.
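
The failure described above can be reproduced outside LLVM with a toy moving collector. This is only a sketch under assumptions: `gc_alloc`, `add_root`, `collect` and `demo` are hypothetical names invented for illustration, objects are a single `int`, and the collector moves everything on every allocation so that the effect is always visible. It is not the OCaml runtime or the LLVM GC interface.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SPACE_WORDS 16
#define MAX_ROOTS 8

/* Two semispaces; the toy collector copies live objects between them. */
static int space_a[SPACE_WORDS], space_b[SPACE_WORDS];
static int *from_space = space_a, *to_space = space_b;
static size_t used = 0;

/* Addresses of stack slots that hold heap pointers. This stands in for
   the frametable that would be built from llvm.gcroot (an assumption). */
static int **roots[MAX_ROOTS];
static size_t nroots = 0;

static void add_root(int **slot) {
    assert(nroots < MAX_ROOTS);
    roots[nroots++] = slot;
}

/* Copy each rooted one-word object to the other space and fix up its slot. */
static void collect(void) {
    size_t n = 0;
    for (size_t i = 0; i < nroots; i++) {
        if (*roots[i] == NULL) continue;
        to_space[n] = **roots[i];
        *roots[i] = &to_space[n];
        n++;
    }
    memset(from_space, 0xAB, sizeof space_a);  /* old copies become garbage */
    int *tmp = from_space; from_space = to_space; to_space = tmp;
    used = n;
}

/* Allocate one word. Collects first, every time, to force a move. */
static int *gc_alloc(int value) {
    collect();
    assert(used < SPACE_WORDS);
    int *p = &from_space[used++];
    *p = value;
    return p;
}

struct result { int stale_differs; int reloaded_value; int cached_value; };

static struct result demo(void) {
    int *a_slot = NULL;          /* the rooted stack slot */
    add_root(&a_slot);
    a_slot = gc_alloc(42);       /* step 1: allocate A */
    int *a_cached = a_slot;      /* copy of the pointer kept across the call */
    int *b = gc_alloc(7);        /* step 3: allocating B moves A */
    (void)b;
    int *a_reloaded = a_slot;    /* reload from the slot the collector fixed */
    struct result r = { a_cached != a_reloaded, *a_reloaded, *a_cached };
    nroots--;                    /* unregister the slot before the frame dies */
    return r;
}
```

In `demo`, the slot itself is reloaded after the second allocation because its address escaped through `add_root`, but `a_cached` is a separate copy that nothing updates. It still points into the old space and reads the poison pattern instead of 42.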

It seems that simply passing the address of a stack variable to any
function (llvm.gcroot or otherwise) would be enough to require
reloading registers after any future function call since the data
could have been changed, so I'm wondering whether I'm missing
something.

I haven't made a standalone repro .ll yet, I just wanted to understand
if this was expected or not first. Does this seem like it should be
enough? Or are there extra invalidations that I need to do somewhere?

thanks,
scott

Hi Scott,

> I'm using a GC that's pretty similar to the OCaml one. It records stack locations using llvm.gcroot, and dumps out a frametable describing the live stack offsets so that the GC runtime can walk them as required. I'm on 2.3, not svn head.

> I'm having some trouble with pointers being cached in registers across function calls (at least x86 backend, haven't tried others yet).

Are you reusing the same SSA variables, or reloading from the gcroot? Copying collectors require that you reload before each use (or, more specifically, between each call that could invoke the collector). e.g.,

     ; object obj;
     %obj.root = alloca
     call void llvm.gcroot(%obj.root)

     ; obj = new Object;
     %obj.1 = gcalloc()
     store %obj.1 -> %obj.root

     ; danger(); // Could invoke the collector!
     call void danger()

     ; use(obj);
     %obj.2 = load %obj.root
     call void use(%obj.2) ; Not %obj.1!

     ; danger(); // Could invoke the collector!
     call void danger()

     ; use(obj);
     %obj.3 = load %obj.root
     call void use(%obj.3) ; Not %obj.2!

Of particular danger is that naive code generation for something like f(obj, g()) would retain an SSA value for obj over the call to g().
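
To make the f(obj, g()) hazard concrete, here is a small C sketch of the two lowering orders. Everything in it is an assumption made for illustration: `g`, `f`, `root_slot`, `reset`, `call_naive` and `call_reloading` are hypothetical, the "heap" is two cells, and the root is a global rather than a real stack slot registered with llvm.gcroot.

```c
#include <assert.h>
#include <stddef.h>

/* Toy moving heap: one live object that hops between two cells. */
static int cells[2];
static int *root_slot;   /* stands in for the alloca given to llvm.gcroot */
static int current = 0;

/* Hypothetical g(): triggers a "collection" that moves the object. */
static int g(void) {
    int next = 1 - current;
    cells[next] = *root_slot;   /* copy the object to its new home */
    cells[current] = -1;        /* the old location now holds garbage */
    root_slot = &cells[next];   /* the collector fixes up the root */
    current = next;
    return 10;
}

/* Hypothetical f(): reads through obj. */
static int f(const int *obj, int x) { return *obj + x; }

/* Naive lowering: obj is loaded before g() runs and kept across the call. */
static int call_naive(void) {
    int *obj = root_slot;
    int x = g();
    return f(obj, x);           /* obj points at the old, dead location */
}

/* Safer lowering: reload obj after the last call that might collect. */
static int call_reloading(void) {
    int x = g();
    int *obj = root_slot;
    return f(obj, x);
}

/* Put the object (value v) back in cell 0 and point the root at it. */
static void reset(int v) {
    current = 0;
    cells[0] = v;
    cells[1] = -1;
    root_slot = &cells[0];
}
```

With the object holding 5, `call_naive()` yields 9 (it reads the -1 left behind in the old cell), while `call_reloading()` yields 15. If g() itself returned a heap pointer and a further call followed, that result would need its own root as well; the sketch leaves that case out.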

> It seems that simply passing the address of a stack variable to any function (llvm.gcroot or otherwise)

That's the basis for the llvm.gcroot model.

> I haven't made a standalone repro .ll yet

This would be helpful.

— Gordon

> Are you reusing the same SSA variables, or reloading from the gcroot?
> Copying collectors require that you reload before each use (or, more
> specifically, between each call that could invoke the collector). e.g.,

Ah, of course. I keep thinking of the SSA variables as C-type
variables, rather than values. Thanks (yet again!) Gordon.

> Of particular danger is that naive code generation for something like
> f(obj, g()) would retain an SSA value for obj over the call to g().

Yeah, that would be my problem. It would be quite fair to describe my
code generation as naive. :)

scott