Hi Talin,
Many languages support the notion of a “value type”. Value types are always passed by value, unlike reference types which are always passed by pointer. An example is the “struct” type in C#. Another example is a “tuple” type. A value type which is a local variable lives on the stack as an alloca, not on the heap. When a function is called with a value type as argument, the callee gets its own copy of the argument, rather than sharing a pointer with the caller.
Yes.
Value types are represented in LLVM using structs, and may contain pointer fields which need to be traced.
Yes.
The way that I handle non-pointer types is to generate an array of field offsets (containing the offset of each pointer field within the struct) as the metadata argument to llvm.gcroot. This metadata argument is then processed in my GCStrategy, where I add the stack root offset to each entry in the field offset array, which yields the stack offsets of the actual pointers in the call frame.
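The offset arithmetic described above can be sketched in C++ as follows. This is only an illustration under stated assumptions: the `Pair` struct, the `kPairPointerOffsets` table and the `pointerSlotsInFrame` helper are hypothetical names invented for this example, and none of them are part of LLVM or of the GCStrategy interface.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical value type with two pointer fields that need tracing.
struct Pair {
    void*        first;
    std::int64_t count;   // not a pointer, so it has no entry in the table
    void*        second;
};

// Field offset table: the offset of each pointer field within the struct.
// This plays the role of the metadata argument passed to llvm.gcroot.
static const std::size_t kPairPointerOffsets[] = {
    offsetof(Pair, first),
    offsetof(Pair, second),
};

// Adds the frame offset of the root (the base of the alloca) to each field
// offset, giving the frame-relative offset of every pointer in the struct.
std::vector<std::ptrdiff_t> pointerSlotsInFrame(std::ptrdiff_t rootOffset,
                                                const std::size_t* fieldOffsets,
                                                std::size_t count) {
    assert(fieldOffsets != nullptr || count == 0);
    std::vector<std::ptrdiff_t> slots;
    slots.reserve(count);
    for (std::size_t i = 0; i < count; ++i)
        slots.push_back(rootOffset + static_cast<std::ptrdiff_t>(fieldOffsets[i]));
    return slots;
}
```

In this sketch the root offset is whatever frame offset the code generator assigned to the alloca; the table itself is independent of where the value ends up in the frame.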
Did you consider the alternative of calling llvm.gcroot on the pointers inside this struct? This would require changing the verifier to accept non-alloca pointers in llvm.gcroot, but it makes the solution more general and cleaner: pointers given to llvm.gcroot would only point to objects in the heap.
I think that, originally, the purpose of the second argument of llvm.gcroot was to emit static type information.
Let me give you a more complicated example to see why this won’t work:
Imagine I have a discriminated union type, whose type declaration looks like this:
var x:int or String.
The variable ‘x’ can be either an integer or a reference to a string object. In LLVM assembly, this data structure is represented by the following struct:
{ i1, String * }
The ‘i1’ field (the ‘discriminator’) is used to determine what kind of value is currently stored in the union. If it’s 0, then it’s an int, and the structure will be cast to { i1, int } before extracting the value. If it’s 1, then it’s a String pointer. The compiler does not allow access to the wrong type - if the value is 0, the language does not allow you to extract the value as a String.
Now, suppose we declare this as a local variable, so the union struct is contained within an alloca. We want to declare the String pointer as a root, but only if the discriminator is not 0. We can’t determine this at compile time; instead, the collector has to be smart enough to examine the union and determine whether it contains a pointer or not.
In my compiler, what I do is to generate a callback function that can trace the object. This callback function is contained within a data structure that is passed as the metadata argument to llvm.gcroot.
So my code looks like this (bit casts omitted for simplicity):
%int_or_string = type { i1, String * }
%x = alloca %int_or_string
call void @llvm.gcroot(i8** %x, i8* @.tracetable.int_or_string)
Where ‘.tracetable.int_or_string’ is the static type information for the “int or string” type, containing both the field offsets and the callback function to test the value of the discriminator.
Note that if I only declared the pointer as a root, then this wouldn’t work - the collector needs access to the entire data structure in order to trace the object correctly.
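To make the callback idea concrete, here is a minimal C++ sketch of what such a trace table could look like on the collector side. Everything in it is an assumption made for illustration: the `TraceTable` layout, the `Visitor` and `TraceFn` signatures, and the choice to store the discriminator in a byte are hypothetical and are not taken from the actual compiler or from LLVM.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical in-memory layout of the "int or String" union.
// The discriminator is stored in a byte here for simplicity.
struct IntOrString {
    std::uint8_t tag;            // 0 = int, 1 = String pointer
    union {
        std::int32_t asInt;
        void*        asString;   // a String* in the source language
    } payload;
};

// Called by the collector for every slot that currently holds a pointer.
using Visitor = void (*)(void** slot, void* context);
// Type-specific callback that decides at run time which slots to report.
using TraceFn = void (*)(void* base, Visitor visit, void* context);

// Static type information, passed as the metadata argument to llvm.gcroot.
struct TraceTable {
    const std::size_t* fieldOffsets;  // pointer fields that are always live
    std::size_t        fieldCount;
    TraceFn            trace;         // may be null if the offsets suffice
};

// Reports the String slot only when the discriminator says it holds one.
void traceIntOrString(void* base, Visitor visit, void* context) {
    assert(base != nullptr && visit != nullptr);
    auto* value = static_cast<IntOrString*>(base);
    if (value->tag != 0)
        visit(&value->payload.asString, context);
}

// No unconditional pointer fields; the callback does all the work.
const TraceTable kIntOrStringTraceTable = { nullptr, 0, &traceIntOrString };
```

The callback receives the base address of the whole struct, which is why the root has to be the alloca itself and not just the pointer field.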
Also, I think this is the right solution - llvm.gcroot is only responsible for the offset of the base of the alloca, not for any of its internal structure, which is the responsibility of the compiler and the GCStrategy.