Ideas for annotating types?

I’d like to figure out a clean way to annotate LLVM types with additional information. This is related to the garbage collection discussion: once there is a good way to do this, I can add it to the proposal.

Basically what I want is to be able to take a Constant and associate it with a type, such that (1) any backend pass that has a reference to the type can quickly and efficiently get a reference to the associated Constant, and (2) this information can be serialized into a bitcode file along with the rest of the module contents. (By “quickly and efficiently” I mean that a backend pass should be able to test every value in a given module, checking whether there is associated data for that value’s type, without significantly affecting compilation time.)
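To make the cost requirement concrete, here is a minimal standalone sketch of the per-value check. Type, Constant and Value are hypothetical stand-in structs, not the LLVM classes, and storing the association in a std::unordered_map keyed on the type pointer is only an assumption about how it might be done. Since types are uniqued, the check comes down to one hash lookup per value.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_map>
#include <vector>

// Hypothetical stand-ins for the LLVM classes; only pointer identity matters here.
struct Type {};
struct Constant {};
struct Value { const Type *type; };

// Assumed storage: one hash map from uniqued type pointer to its annotation.
using AnnotationTable = std::unordered_map<const Type *, const Constant *>;

// Returns the annotation for v's type, or nullptr if the type has none.
inline const Constant *lookupAnnotation(const AnnotationTable &table,
                                        const Value &v) {
  auto it = table.find(v.type);
  return it == table.end() ? nullptr : it->second;
}

// Counts the values whose type carries an annotation: one lookup per value.
inline std::size_t countAnnotated(const AnnotationTable &table,
                                  const std::vector<Value> &values) {
  std::size_t n = 0;
  for (const Value &v : values)
    if (lookupAnnotation(table, v) != nullptr)
      ++n;
  return n;
}
```

With a table like this, a pass that visits N values does N average constant-time lookups, and modules that never populate the table pay only for lookups in an empty map.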

For garbage collection, I’d use this feature to annotate each type with its trace table - basically it’s the same information that I pass to llvm.gcroot() today, but associated with a type rather than with a value.

However, I would imagine other uses for this facility. This is effectively “metadata for types”, although unlike metadata nodes this data would be immutable.

Also, it’s OK if the solution requires changes to LLVM, as long as it doesn’t significantly impact performance for people who aren’t using this feature.

Hi Talin,

The new type system allows you to name types and, if I got it right,
that's unique and immutable.

You could use that fact, plus some metadata, to group your values and
use one annotation per group.

Should be easy to iterate through all values in a given function
testing for a given (unique) type.

cheers,
--renato

So essentially use a dummy named type as a container to hold other types that share a common annotation? That’s an interesting idea. And then I presume that on loading the module you’d use that to build a map of unique type pointer to annotation pointer.

So basically you’d have a pair of utility classes in LLVM that would read and write this data structure into a module. Then on top of that you’d build the GC primitives that annotate a given type and retrieve the annotation.

Hmmmm…

So based on your suggestion, I’ve sketched out the following interface:

class TypeAnnotationMap {
public:
  /** Create a new type annotation map with the specified name. */
  static TypeAnnotationMap * create(StringRef name);

  /** Load a type annotation map from a module. */
  static TypeAnnotationMap * load(Module * m, StringRef name);

  /** Serialize this annotation map as a global variable in module 'm'.
      This will replace any pre-existing variable of the same name. */
  void store(Module * m);

  /** Return the name of this type annotation map. */
  StringRef name() const;

  /** Set the value of the annotation for the input type 'ty'. This replaces any
      pre-existing value associated with 'ty'. */
  void put(const Type * ty, const Constant * val);

  /** Get the value of the annotation for type 'ty'. */
  const Constant * get(const Type * ty) const;

  /** Remove any value associated with type 'ty'. */
  void remove(const Type * ty);
};

An example use would be as follows:

// Create a new map
TypeAnnotationMap * aMap = TypeAnnotationMap::create("llvm.gc.root");

// Add an entry to the map.
aMap->put(ty, data);

// Store as a global variable in a module
aMap->store(module);

Internally the class has a map of Type pointers to Constant pointers. To serialize the map, it converts the internal map to a list of types (which are represented as members of a struct), and a list of constants. It then creates a tuple whose first member is a NULL pointer to the struct type, and whose second member is a pointer to the list of constants. It creates a global variable in the module whose name is the name of the map, and sets its initializer to be that tuple.
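The flattening step can be sketched without LLVM, assuming the list of types and the list of constants are kept in the same order, so that entry i in one corresponds to entry i in the other. Type, Constant, SerializedMap, serialize and deserialize are hypothetical names used only for illustration. In the scheme described above, the type list would become the members of the struct type and the constant list would live in the global's initializer.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <vector>

// Hypothetical stand-ins for the LLVM classes; only pointer identity matters here.
struct Type {};
struct Constant {};

// Hypothetical flattened form: two parallel lists, matched by index.
struct SerializedMap {
  std::vector<const Type *> types;         // would become the struct's member types
  std::vector<const Constant *> constants; // would become the list of constants
};

using AnnotationMap = std::map<const Type *, const Constant *>;

// Flatten the map into two lists with matching order.
inline SerializedMap serialize(const AnnotationMap &m) {
  SerializedMap out;
  for (const auto &entry : m) {
    out.types.push_back(entry.first);
    out.constants.push_back(entry.second);
  }
  return out;
}

// Rebuild the map by pairing the entries found at the same index.
inline AnnotationMap deserialize(const SerializedMap &s) {
  assert(s.types.size() == s.constants.size());
  AnnotationMap m;
  for (std::size_t i = 0; i < s.types.size(); ++i)
    m[s.types[i]] = s.constants[i];
  return m;
}
```

Pairing by index is what lets the two lists be stored separately: the loader only needs both lists to have the same length and order to rebuild the original map.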

A module can have an arbitrary number of such maps, each with a different map name.

Does that sound about right?