Getting started with GC

I'm in a group tasked with improving the GC of LLVM for a 421 project.
We are having trouble getting started with the given SemiSpace
collector.

We found the string llvm_gc_initialize called from a single source file
./test/Regression/CodeGen/Generic/GC/alloc_loop.ll
which we tried with the following... (showing LLVM checked out from cvs a few days ago, similar
output with release 1.3)

$ llvm-as alloc_loop.ll
$ lli alloc_loop.bc
lli: Globals.cpp:81: llvm::GlobalVariable::GlobalVariable(const llvm::Type*, bool, llvm::GlobalValue::LinkageTypes, llvm::Constant*, const std::string&, llvm::Module*): Assertion `Initializer->getType() == Ty && "Initializer should be the same type as the GlobalVariable!"' failed.
lli((anonymous namespace)::PrintStackTrace()+0x1a)[0x857f21a]
lli((anonymous namespace)::SignalHandler(int)+0xcb)[0x857f48d]
[0xffffe420]
lli((anonymous namespace)::LowerGC::doInitialization(llvm::Module&)+0x56d)[0x83ed459]
lli(llvm::PassManagerTraits<llvm::Function>::doInitialization(llvm::Module&)+0x56)[0x84de43a]
lli(llvm::FunctionPass::run(llvm::Function&)+0x4c)[0x848e720]
lli(llvm::FunctionPassManager::run(llvm::Function&)+0xe7)[0x848dcdd]
lli(llvm::JIT::runJITOnFunction(llvm::Function*)+0x4f)[0x8312da7]
lli(llvm::JIT::getPointerToFunction(llvm::Function*)+0x15f)[0x8312f59]
lli(llvm::JIT::runFunction(llvm::Function*, std::vector<llvm::GenericValue, std::allocator<llvm::GenericValue> > const&)+0x4e)[0x8311fc0]
lli(llvm::ExecutionEngine::runFunctionAsMain(llvm::Function*, std::vector<std::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::basic_string<char, std::char_traits<char>, std::allocator<char> > > > const&, char const* const*)+0x2ef)[0x8369619]
lli(main+0x26c)[0x82f89b0]
/lib/tls/libc.so.6(__libc_start_main+0x108)[0x401307f8]
Aborted

How should we get this to run?

Also the code in runtime/GC/SemiSpace/semispace.c doesn't seem to do
anything other than call process_pointer for every gcroot pointer.

static void process_pointer(void **Root, void *Meta) {
  printf("process_root[0x%p] = 0x%p\n", (void*) Root, (void*) *Root);
}

Are there any examples of LLVM performing active garbage collection? In
the dev archives
(http://mail.cs.uiuc.edu/pipermail/llvmdev/2004-July/001527.html) Chris
said there are no frontends with GC support, but it might get into Java
first. Where is llvm-java? Google didn't find it in top 10, there is no
llvm-java module in cvs, find -iname java offered no hope.

Do you recommend we stick to llvm assembly for testing GC?

Tom

I'm in a group tasked with improving the GC of LLVM for a 421 project.
We are having trouble getting started with the given SemiSpace
collector.

We found the string llvm_gc_initialize called from a single source file
./test/Regression/CodeGen/Generic/GC/alloc_loop.ll
which we tried with the following... (showing LLVM checked out from cvs a few days ago, similar
output with release 1.3)

$ llvm-as alloc_loop.ll
$ lli alloc_loop.bc
lli: Globals.cpp:81: llvm::GlobalVariable::GlobalVariable(const llvm::Type*, bool, llvm::GlobalValue::LinkageTypes, llvm::Constant*, const std::string&, llvm::Module*): Assertion `Initializer->getType() == Ty && "Initializer should be the same type as the GlobalVariable!"' failed.
lli((anonymous namespace)::PrintStackTrace()+0x1a)[0x857f21a]
lli((anonymous namespace)::SignalHandler(int)+0xcb)[0x857f48d]
[0xffffe420]
How should we get this to run?

Sorry about that, patch for this problem here:
http://mail.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20041025/019891.html

Also the code in runtime/GC/SemiSpace/semispace.c doesn't seem to do
anything other than call process_pointer for every gcroot pointer.

static void process_pointer(void **Root, void *Meta) {
  printf("process_root[0x%p] = 0x%p\n", (void*) Root, (void*) *Root);
}

Correct. The semispace collector hasn't been implemented all of the way.
The code generator and runtime interfaces provide you with all of the
hooks that you need to implement a garbage collector though.

Are there any examples of LLVM performing active garbage collection? In
the dev archives
(http://mail.cs.uiuc.edu/pipermail/llvmdev/2004-July/001527.html) Chris
said there are no frontends with GC support, but it might get into Java
first. Where is llvm-java? Google didn't find it in top 10, there is no
llvm-java module in cvs, find -iname java offered no hope.

LLVM-Java is still in development unfortunately. I believe that array
support is currently being added by Alkis and that he hasn't hooked it up
with the GC yet.

Do you recommend we stick to llvm assembly for testing GC?

Unfortunately, yes that's really the only way to go for now.

-Chris

That patch does fix the problem. Thank you. I used the following
llvm-link -f -o linked.bc alloc_loop.bc ~/llvm/runtime/GC/SemiSpace/Debug/semispace.bc
because it seemed easier than working out if and how to get dynamic
loading and linking to work. Perhaps the process will be smoother when
I restart using projects/sample/ . Running linked.bc outputs

Garbage collecting!!
process_root[0x0xbffff3f0] = 0x0x41257008
process_root[0x0xbffff3ec] = 0x0x41257012
lli((anonymous namespace)::PrintStackTrace()+0x1a)[0x857f236]
lli((anonymous namespace)::SignalHandler(int)+0xcb)[0x857f4a9]
[0xffffe420]
lli(llvm::ExecutionEngine::runFunctionAsMain...

Which is too be expected since llvm_gc_collect in
llvm/runtime/GC/SemiSpace/semispace.c ends with abort();

I believe we have our work cut out for us. Hopefully you'll have a
working GC by Christmas time.

Tom

> > $ llvm-as alloc_loop.ll
> > $ lli alloc_loop.bc
> > lli: Globals.cpp:81: llvm::GlobalVariable::GlobalVariable(const llvm::Type*, bool, llvm::GlobalValue::LinkageTypes, llvm::Constant*, const std::string&, llvm::Module*): Assertion `Initializer->getType() == Ty && "Initializer should be the same type as the GlobalVariable!"' failed.
> > lli((anonymous namespace)::PrintStackTrace()+0x1a)[0x857f21a]
> > lli((anonymous namespace)::SignalHandler(int)+0xcb)[0x857f48d]
> > [0xffffe420]
> > How should we get this to run?
>
> Sorry about that, patch for this problem here:
> http://mail.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20041025/019891.html
That patch does fix the problem. Thank you. I used the following
llvm-link -f -o linked.bc alloc_loop.bc ~/llvm/runtime/GC/SemiSpace/Debug/semispace.bc
because it seemed easier than working out if and how to get dynamic
loading and linking to work. Perhaps the process will be smoother when
I restart using projects/sample/.

Okay, sounds good.

Running linked.bc outputs

Garbage collecting!!
process_root[0x0xbffff3f0] = 0x0x41257008
process_root[0x0xbffff3ec] = 0x0x41257012
lli(llvm::ExecutionEngine::runFunctionAsMain...

Which is too be expected since llvm_gc_collect in
llvm/runtime/GC/SemiSpace/semispace.c ends with abort();

I believe we have our work cut out for us. Hopefully you'll have a
working GC by Christmas time.

Yes, indeed :slight_smile: That would be great! Thanks! :slight_smile:

-Chris

We have a few questions about the current state of GC.

We decided to start (and finish?) our work by finishing SemiSpace.
process_pointer is meant to move objects from CurSpace to OtherSpace.
How can it find pointers to a moved object? How does it know the size
of each object? Assuming we are writing a GC that will only work from
llvm assembly our best option seems to be forcing the assembly code to
follow a convention that identifies pointers. The SemiSpace code could
keep a map from each allocated pointer to its size, similar to how I
think malloc works.

http://llvm.cs.uiuc.edu/docs/GarbageCollection.html (Which seems to be
the inverse of out-of-date, whatever that is)
Says that there is strong dependency between the frontend and GC
implementations. For example, it seems as though the frontend and GC
need to agree on a structure for the Meta data and they may call each
other during allocation and garbage collection. I imagine it would be
good to decouple the frontend and GC as much as possible or one may
need to be reimplemented when the other is changed.
Do you have any intentions for the use of Meta in SemiSpace?

We could trample around and make something work but it seems as
though you have built most of the support structure for GC. We are
trying to follow that plan.

Tom

We have a few questions about the current state of GC.

Ok. :slight_smile:

We decided to start (and finish?) our work by finishing SemiSpace.

Sounds good.

process_pointer is meant to move objects from CurSpace to OtherSpace.
How can it find pointers to a moved object?

This is entirely up to you, as you're basically implementing the semispace
collector in its entirety. See below.

How does it know the size of each object?

I'm a big fan of starting simple. As such, I'd suggest using this scheme:

When an object is allocated: add a size word on front of it. This size of
every object should be rounded up to a multiple of 4. When it is being
GC'd, replace the size with the forwarding pointer, but with the low bit
set. Because the low bit is set, you should be able to tell if an object
is forwarded or not. This allows you to do the simple, standard, in-place
depth-first traversal of the heap when you are GC'ing from one space to
the other.

Assuming we are writing a GC that will only work from llvm assembly our
best option seems to be forcing the assembly code to follow a convention
that identifies pointers.

There are three issues:

1. Object references from the stack
2. Object references from globals
3. Object references from the heap

#1 is the hardest from the code generation standpoint, and it is what LLVM
provides direct support for. #2 is really a matter of convention, and
having the GC export and interface that can be used by langugage front-end
in question to register global pointers. #3 requires object descriptors
of some sort, produces by the front-end and consumed by the GC. See
below.

The SemiSpace code could keep a map from each allocated pointer to its
size, similar to how I think malloc works.

For this, I would suggest just putting a header word on the object,
containing the size. This is not optimal, but will be a good place to
start. Later this can be improved.

http://llvm.cs.uiuc.edu/docs/GarbageCollection.html (Which seems to be
the inverse of out-of-date, whatever that is)
Says that there is strong dependency between the frontend and GC
implementations.

Yes. The GC needs information from the front-end, in particular how it
lays out objects and where to find object references.

For example, it seems as though the frontend and GC need to agree on a
structure for the Meta data and they may call each other during
allocation and garbage collection.

Yes. They don't actually need to 'agree', but there needs to be an API
provided by the language runtime and consumed by the GC, that is used to
encode the information. One way is to standardize maps, a more general
way is to expose API's to access the maps that the front-end produces
(allowing the front-end to produce them in any format it wants).

I imagine it would be good to decouple the frontend and GC as much as
possible or one may need to be reimplemented when the other is changed.
Do you have any intentions for the use of Meta in SemiSpace?

I'm not sure what you mean by 'meta', but if you mean metadata, yes.

We could trample around and make something work but it seems as
though you have built most of the support structure for GC. We are
trying to follow that plan.

Sounds good. This is what I envision. :slight_smile: The 'vision' encompases many
langauges and different implementation strategies, but lets start with one
that you are likely to be familiar with. Anything you do here can be
extended to the full vision, if and when you find yourself with tons of
spare time at the end of your project. :slight_smile:

Lets imagine that you have a language like Java. All GC'd objects in Java
have vtables, and from this vtable a wealth of information about the
object is accessible (e.g. it's size, whether it's an array, GC map
information, etc). The layout of objects in Java might look something
something like this:

obj ptr --\
          >
          v
    [ vtable ptr ] ----> [ rtti info ] ----> ...
    [ obj word 1 ] [ gc info ] ----> [ obj size ]
    [ obj word 2 ] [ vf ptr #1 ] [ Flags ] -> is array, etc
     ... ... [ gc map #1 ]
    [ obj word N ] [ vf ptr #N ] [ gc map ...]

In addition, there are some number of "globals" (really statics fields in
Java) that can contain references. Given this, it seems like your GC
needs the front-end to provide the following:

1. A way of identifying/iterating all of the global references. It seems
   reasonable for the GC to expose a function like "void gc_add_static_root(void**)"
   which the front-end would emit calls to. Java has 'class initializers'
   which are called to initialize static members of classes (and other
   stuff), so the language front-end could easily insert calls to this
   function in the initializer.
2. A way of getting the size of an object. Initially you can use a header
   word on the object, but eventually, it would be nice if the front-end
   runtime library would provide a function, say
   "unsigned gc_fe_get_object_size(void *object)", which could be called
   by the GC implementation to get this information. With the approach
   above, if the object was a non-array, it would just get the object size
   field from the GC info (using several pointer dereferences from
   'object'. If it's an array, the FE runtime would do something perhaps
   more complex to get this info, but you wouldn't really care what it
   was :). VM's like Java often specialize a number of other cases (like
   strings) as well, and obviously the GC doesn't want to know the details
   of this hackery.
3. A way of iterating over the (language specific) GC map information for
   a heap object. In particular, you need a way to identify the
   offsets of all of the outgoing pointers from the start of a heap object.
   I'm not sure what the best way to do this is (or the best way to
   encode this) but I'm sure you can come up with something. :slight_smile: The most
   naive and inefficient interface (always a good starting place) would be
   to have the front-end provide a callback:
     _Bool gc_fe_is_pointer(void *obj, unsigned offset)
   that the GC can use to probe all of the words in the object to see if
   they are pointers. This can obviously be improved. :slight_smile:

In a naive implementation, the result of these callbacks will be
extremely inefficient, as all of these functions are very small and will
be extremely frequently called when a GC happens. In practice, because
this is LLVM, we can do things a little bit smarter. :slight_smile:

In particular, when a source language's runtime library is built, it will
include two things: all of the GC hooks it is required to implement, and
an actual GC implementation, chosen from runtime/GC directory. When this
is linked together, the link-time optimizer will link all of the little
helper functions above into the GC, inlining and optimizing the result,
resulting in a no overhead solution.

As far as your class project, the following steps make sense to me:

The first step is to get the alloc_loop testcase fully working, as it
doesn't require any of the fancy extra interfaces. This can be done
directly by adding the object header (I believe), and finishing semispace
as it stands today. This should be simple and get you started with the GC
interfaces.

The next step in this is to flush out the interface between the GC and
the language runtime. The GC doesn't want to know anything about how the
FE lays out vtables and GC info, and the front-end doesn't want to know
anything about how the GC implements its algorithm. The
runtime/GC/GCInterface.h file is a start on this, but the callbacks
described above will also need to be added.

The next step is to extend alloc_loop to use these interfaces. It might
be useful to translate it to C code, to make it easier to deal with.

From there, you can add one feature at a time, iterating between adding

features (like global pointers or pointers in heap objects) to the
alloc_loop test, and adding features to the GC. Eventually you will have
the full set of features listed above.

Note that (by hacking on alloc_loop and other testcases you write), you
are basically writing a file that would be generated by a garbage
collected language front-end. Because of this, you get to write all of
the vtables/GC info by hand, and can include the "language runtime"
callbacks directly in the program.

This is a somewhat big task, but can be done incrementally. Let me
emphasize this point: to do this successfully, it is pretty important that
you try to add one little step to the testcase, and add the corresponding
feature to the GC, working a step at a time. As you find that you need to
change the interfaces on either side, go ahead and do so. If you work
incrementally, you will always have something that *works*, and is better
than the previous step. If you add something that is completely horrible
and you decide is a bad idea, you can always revert to something close by
that is known good. For your course project, you obviously want to get
the most implemented possible (ideally all of the above), but if technical
difficulties or other unforseen problems come up, at least you will have
something working and some progress to show.

If you have any questions, please feel free to ask here. I'm pretty busy
over these couple of weeks (so my answers may be a little slow), but I
really want you guys to be successful!

-Chris

> Do you have any intentions for the use of Meta in SemiSpace?
I'm not sure what you mean by 'meta', but if you mean metadata, yes.

I meant the argument in %llvm.gcroot(<ty>** %ptrloc, <ty2>* %metadata)
All examples I've seen set it to null and we have no plans for it.

Given this, it seems like your GC
needs the front-end to provide the following:

1. A way of identifying/iterating all of the global references. It seems
   reasonable for the GC to expose a function like "void gc_add_static_root(void**)"

For now calling llvm.gcroot in main() for every global pointer should suffice.

2. A way of getting the size of an object. Initially you can use a header
   word on the object, but eventually, it would be nice if the front-end
   runtime library would provide a function, say
   "unsigned gc_fe_get_object_size(void *object)", which could be called
   by the GC implementation to get this information.

I imagine a collector moves and deletes blocks of memory as returned by
malloc (AKA llvm_gc_allocate). The collector does not care what the
application does with the block as long as it can find all the pointers
going into and out of the block. Thus the collector should keep track of
the size of each block independently from the frontend. Brian created
a GC internal object that keeps metadata for each block. Given an
address anywhere in a block returned by llvm_gc_allocate the object
returns size, lowest_address_in block, offsets_to_pointers_in_block (see
below).

3. A way of iterating over the (language specific) GC map information for
   a heap object. In particular, you need a way to identify the
   offsets of all of the outgoing pointers from the start of a heap object.
   I'm not sure what the best way to do this is (or the best way to
   encode this) but I'm sure you can come up with something. :slight_smile: The most
   naive and inefficient interface (always a good starting place) would be
   to have the front-end provide a callback:
     _Bool gc_fe_is_pointer(void *obj, unsigned offset)
   that the GC can use to probe all of the words in the object to see if
   they are pointers. This can obviously be improved. :slight_smile:

The problem with gc_fe_is_pointer is that the GC should tell the FE
when it moves a block of memory and the FE will need to update its
structure. This would duplicate much of the block meta data logic in
the GC.
I noticed that llvm.gcroot provides a way for the FE to declare
pointers on the stack. We will implement a equivalent function for
pointers in the heap, gc_register_pointer. It will add entries to the
offsets_to_pointers_in_block for the block containing the pointer.

The weakness of this scheme is that after the pointers in a block have
been registered the FE must always use them as pointers. In other
words the type associated with a block should not change in the blocks
lifetime. I can't think of a sane example of why someone would want to
violate this restriction.

The next step in this is to flush out the interface between the GC and
the language runtime. The GC doesn't want to know anything about how the
FE lays out vtables and GC info

Are vtables created in memory the FE gets from the heap? If we use
gc_register_pointer does the GC need to indirectly access the vtables?

Note that (by hacking on alloc_loop and other testcases you write), you
are basically writing a file that would be generated by a garbage
collected language front-end.

A group working on an OCaml -> LLVM translator showed us their
preliminary work. They are using "malloc <type>" to allocate heap
memory. How difficult would it be to implement "llvm_gc_allocate <type>"
as a call to "llvm_gc_allocate(unsigned %Size)" followed by calls to
gc_register_pointer as needed by <type>? We were oblivious to this
version of malloc until today. Is there a reason why "llvm_gc_allocate
<type>" is not in GCInterface.h beyond the difficulty of representing it
in C?

Thank you for your previous useful reply,
Tom