Hi again Terence,
Sorry for the long questions…gotta figure this out.
Not a problem!
Ok, I might be getting this from the assembly code. … From that, it will push/pop in functions? If so, that’s easy enough.
Yup! Sounds like you’ve got it.
Yup, what i was missing and what somebody should add to the doc is that “shadow-stack” adds a preamble/postamble snippet to each function that must bind with
StackEntry *llvm_gc_root_chain;
wherever you choose to define it. I put into my GC.c file.
Further, that shadow-stack snippet generation assumes the following structures for tracking roots:
typedef struct FrameMap FrameMap;
struct FrameMap {
int32_t NumRoots; // Number of roots in stack frame.
int32_t NumMeta; // Number of metadata descriptors. May be <
NumRoots.
void *Meta; // May be absent for roots without metadata.
};
typedef struct StackEntry StackEntry;
struct StackEntry {
StackEntry *Next; // Caller’s stack entry.
const FrameMap *Map; // Pointer to constant FrameMap.
void *Roots; // Stack roots (in-place array).
};
The doc says compiler / runtime must agree, but not what the structs are…Seems like those few lines above would make everything clear. I don’t have write access to svn, but I plan on a big chapter full of ANTLR → LLVM examples in my DSL problem solving book.
If you’d like to propose clarified language once you’ve wrapped your head around the framework, I’d be happy to incorporate it. Most ideally, submit a patch against GarbageCollection.html in http://llvm.org/svn/llvm-project/llvm/trunk/docs/.
What I was/am missing is the explicit link between types and variables in a GC.c file and the generated machine code. If I can get that last explicit link, I’m off to the races.
You mean, how do you know what sort of object you’re tracing?
I assumed that I needed to generate my object maps or at least a list of pointers for each object type. Related to that, i have two important questions:
- How do I know the offset (due to alignment/padding by LLVM) of a pointer within an object using {…} struct type? GEP instruction gets an address, of course, but how does my C collector compute these. Do I need to make a metadata struct and fill it with GEP instructions? I guess that makes sense.
You can use a constant expression to compute this. Specifically:
i32 ptrtoint(gep %mytype* null, i32 0, i32 ??? to i32)
Where ??? is the number of the field within the struct. (Could be a list of indices.)
- How do I know how big a struct is? I have my gc_allocate() method
Note: There’s absolutely nothing special about the name @llvm_gc_allocate, it’s just a gentle suggestion. In fact, if you’re using the “model 1” heap tracing strategy, you probably want to use something like this:
%class.object* @alloc_object(%runtime.class* %vtable, i32 %nbytes)
This way, the vtable pointer is guaranteed to be initialized before the collector can possibly see it. (A null vtable could be defended against in the collector, but that’s an expensive spot to put a branch.)
but I have no idea how big the struct will be; i see now sizeof. Alignment makes it unclear how big something is; it’s >= size of elements like i32 but how much bigger than packed struct is it? I.e.,
%struct.A = type {i32 x, [10 x i32]*}
define void @foo() gc “shadow-stack” {
%s = alloca %struct.A ; this knows how big struct.A is
%a = call i32* @llvm_gc_allocate(i32 11); this does not know. is
it 11 or more?
ret void
}
You can again use a constant expression here. Specifically:
i32 ptrtoint(gep %mytype* null, i32 1 to i32)
ConstantExpr has a helper similar to sizeOf(Type*) which build this expression.
• If you have a type forest (as in C or C++) with optional vtables, then no such assumption is possible, and you can include type layout information in the %metadata parameter to @llvm.gcroot. The FrameMap type includes this data.
Ok, so I pass it an arbitrary struct pointer and it just gives it back later for me to peruse, right?
Yep! You can use any constant pointer (which means: any global variable, alias, or function). For example, something like this:
class IntList {
int count;
ListEntry head;
ListEntry tail;
};
class IntListEntry {
IntListEntry prev;
IntListEntry next;
int value;
};
int SumOfList(IntList list) {
return RecursiveSumOfList(list.head, 0);
}
int RecursiveSumOfList(IntList entry, int sumSoFar) {
if (entry == null)
return sumSoFar;
return RecursiveSumOfList(entry.next, sumSoFar + entry.value);
}
Could be hypothetically translated into the following LLVM IR. Note the following aspects:
- Class metadata is emitted as constants to allow the GC to trace the heap.
- Offsets to reference fields are calculated portably using the ‘gep null’ trick.
- You don’t want to write these constants by hand.
- It would also be possible to tag roots with a trace function with a prototype like “void(Object*, void ()(Class, Object*))” which invokes the callback for each reference in the parameter object.- This is the “model 2” tagless object model, where neither tag bits nor a vtable pointer are present in all objects.
- Each call @llvm.gcroot must specify the metadata pointer so the collector knows how to trace the object (or find the vtable, where applicable).- This is somewhat conservative with respect to certain GC behaviors. More elaboration is inline.
- Ephemeral temporaries get gcroots. This could be unnecessary for a single-threaded collector if the temp isn’t alive across a call.
- Each use of a reference variable or temp is reloaded from the gcroot. This is necessary only for copying collectors.
- However, @llvm.readbarrier and @llvm.writebarrier are not used. Using them would be more conservative still, but quite verbose.
; This is the runtime-defined type for class metadata.
; Here, it contains only GC tracing information.
;
; namespace runtime {
; class Class {
; unsigned NumReferences;
; struct {
; unsigned ReferenceOffset;
; Class *ReferenceType;
; } References[NumReferences]; // Flexible array member.
; };
; }
;
; The recursive nature of GC references should be clear.
; The type would need to be more complex to handle arrays and type hierarchies.
%runtime.Class = type {i32, [0 x {i32, %runtime.Class*}]}
declare void @llvm.gcroot(i8** %root, i8* %metadata)
; User-defined datatype:
;
; class IntList {
; int count;
; IntListEntry head;
; IntListEntry tail;
; }
%class.IntList = type {i32, %class.IntListEntry*, %class.IntListEntry*}
@class_IntList = constant {i32, [2 x {i32, %runtime.Class*}]} {
i32 0,
[2 x {i32, %runtime.Class*}] [
{i32, %runtime.Class*} {
i32 ptrtoint(%class.IntListEntry** getelementptr(%class.IntList* null, i32 0, i32 1) to i32),
%runtime.Class* bitcast({i32, [2 x {i32, %runtime.Class*}]}* @class_IntListEntry to %runtime.Class*)
},
{i32, %runtime.Class*} {
i32 ptrtoint(%class.IntListEntry** getelementptr(%class.IntList* null, i32 0, i32 2) to i32),
%runtime.Class* bitcast({i32, [2 x {i32, %runtime.Class*}]}* @class_IntListEntry to %runtime.Class*)
}
]
}
; User-defined datatype:
;
; class IntListEntry {
; IntListEntry prev;
; IntListEntry next;
; int value;
; }
%class.IntListEntry = type {%class.IntListEntry*, %class.IntListEntry*, i32}
@class_IntListEntry = constant {i32, [2 x {i32, %runtime.Class*}]} {
i32 0,
[2 x {i32, %runtime.Class*}] [
{i32, %runtime.Class*} {
i32 ptrtoint(%class.IntListEntry** getelementptr(%class.IntListEntry* null, i32 0, i32 0) to i32),
%runtime.Class* bitcast({i32, [2 x {i32, %runtime.Class*}]}* @class_IntListEntry to %runtime.Class*)
},
{i32, %runtime.Class*} {
i32 ptrtoint(%class.IntListEntry** getelementptr(%class.IntListEntry* null, i32 0, i32 1) to i32),
%runtime.Class* bitcast({i32, [2 x {i32, %runtime.Class*}]}* @class_IntListEntry to %runtime.Class*)
}
]
}
; User-defined function:
;
; int SumOfList(IntList list) {
; return RecursiveSumOfList(list.head, 0);
; }
define i32 @SumOfList(%class.IntList* %list) gc “shadow-stack” {
entry:
; Prologue. 2 reference variables (including temps): list and list.head.
; 1. Create an alloca for each variable, including parameters.
%list.var = alloca %class.IntList*
%head.var = alloca %class.IntListEntry*
; 2. Call @llvm.gcroot for it.
%list.root = bitcast %class.IntList** %list.var to i8**
%head.root = bitcast %class.IntListEntry** %head.var to i8**
call void @llvm.gcroot(i8** %list.root, i8* bitcast({i32, [2 x {i32, %runtime.Class*}]}* @class_IntList to i8*))
call void @llvm.gcroot(i8** %head.root, i8* bitcast({i32, [2 x {i32, %runtime.Class*}]}* @class_IntListEntry to i8*))
; 3. Initialize parameters. Note: No need to null-init roots; the GC does so if necessary.
store %class.IntList* %list, %class.IntList** %list.var
; End prologue.
; This load is possibly unnecessary, depending on the GC.
; If the GC is copying, then the pointer may change whenever the GC is invoked.
; But if the GC is non-moving, there’s no need to reload from head.var.
; Eliding the load will generate faster code.
%list.1 = load %class.IntList** %list.var
%head.ptr = getelementptr %class.IntList* %list.1, i32 0, i32 1
%head = load %class.IntListEntry** %head.ptr
; This store (and thus %head.var gcroot) is possibly unnecessary, depending on the GC.
; In a threaded collector, it’s possible that the function call site could’ve been patched
; to enlist the thread in a ‘stop the world’ event. (This is necessary to bound collection
; pause times.) Thus, the act of calling a function could invoke the collector, in which
; case temps like %head would need to be rooted.
store %class.IntListEntry* %head, %class.IntListEntry** %head.var
%head.1 = load %class.IntListEntry** %head.var
%sum = tail call i32 @RecursiveSumOfList(%class.IntListEntry* %head, i32 0)
ret i32 %sum
}
; User-defined function:
;
; int SumOfList(IntList entry, int sumSoFar) {
; if (entry == null)
; return sumSoFar;
; return RecursiveSumOfList(entry.next, sumSoFar + entry.value);
; }
define i32 @RecursiveSumOfList(%class.IntListEntry* %entry, i32 %sumSoFar) gc “shadow-stack” {
entry:
; Prologue. 2 reference variables (including temps): entry and entry.value
%entry.var = alloca %class.IntListEntry*
%next.var = alloca %class.IntListEntry*
%entry.root = bitcast %class.IntListEntry** %entry.var to i8**
%next.root = bitcast %class.IntListEntry** %next.var to i8**
call void @llvm.gcroot(i8** %entry.root, i8* bitcast({i32, [2 x {i32, %runtime.Class*}]}* @class_IntListEntry to i8*))
call void @llvm.gcroot(i8** %next.root, i8* bitcast({i32, [2 x {i32, %runtime.Class*}]}* @class_IntListEntry to i8*))
store %class.IntListEntry* %entry, %class.IntListEntry** %entry.var
%entry.0 = load %class.IntListEntry** %entry.var
%if = icmp eq %class.IntListEntry* %entry.0, null
br i1 %if, label %then, label %else
then:
ret i32 %sumSoFar
else:
%next.ptr = getelementptr %class.IntListEntry* %entry, i32 0, i32 0
%next = load %class.IntListEntry** %next.ptr
store %class.IntListEntry* %next, %class.IntListEntry** %next.var
; Note: entry.value is not a reference; no need to root it.
%value.ptr = getelementptr %class.IntListEntry* %entry, i32 0, i32 2
%value = load i32* %value.ptr
%temp = add i32 %sumSoFar, %value
%next.1 = load %class.IntListEntry** %next.var
%sum = tail call i32 @RecursiveSumOfList(%class.IntListEntry* %next.1, i32 %temp)
ret i32 %sum
}
As you can see, there are many dimensions of flexibility here.
— Gordon