Tagged (Disjoint) Unions

I have been looking through the LLVM documentation, and I have decided that I would be interested in producing a functional front-end for it, if only to determine how applicable the optimizations that it already performs are to such a language. However, I have run into one obstacle that would make it difficult to write many of the things I would like to do. Is there a way to represent a union of types in LLVM? Specifically, how would you suggest representing something like:

   datatype foo = NUM of int
                | PAIR of int * int
                | STRING of string

I realize I *COULD* do it as a pair of an integer tag and the data, or any of the numerous ways that have been devised for representing such types in other functional languages. The problem that I see is that any optimizations for which LLVM needs types to perform would basically be prohibited by this technique. So, I guess what I'm asking is, is there a way of representing tagged union types in LLVM that doesn't prevent providing accurate type information?

-- Ben Chambers

> I realize I *COULD* do it as a pair of an integer tag and the data, or any of the numerous ways that have been devised for representing such types in other functional languages.

Yup, you should treat LLVM as a slightly higher-level machine language. As such, you should lower the datatype into an explicit "union" (using pointer casts etc).
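As a rough illustration of that lowering, here is a minimal sketch in C rather than LLVM assembly. It assumes each variant of foo is heap-allocated with a shared tag header as its first member, and that consumers switch on the tag and cast the pointer to the matching variant layout. All names here (foo_tag, foo_num, make_num, foo_measure, and so on) are hypothetical and chosen only for this example; this is one of several possible layouts, not a prescribed one.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* One tag per constructor of the hypothetical datatype foo. */
enum foo_tag { FOO_NUM, FOO_PAIR, FOO_STRING };

/* Shared header: every variant begins with it, so a pointer to any
   variant can be converted to a pointer to this struct and back. */
struct foo { enum foo_tag tag; };

struct foo_num    { struct foo header; int value; };
struct foo_pair   { struct foo header; int first; int second; };
struct foo_string { struct foo header; const char *chars; };

/* Constructors return the generic pointer; NULL on allocation failure. */
struct foo *make_num(int value) {
    struct foo_num *n = malloc(sizeof *n);
    if (!n) return NULL;
    n->header.tag = FOO_NUM;
    n->value = value;
    return &n->header;
}

struct foo *make_pair(int first, int second) {
    struct foo_pair *p = malloc(sizeof *p);
    if (!p) return NULL;
    p->header.tag = FOO_PAIR;
    p->first = first;
    p->second = second;
    return &p->header;
}

/* The string is borrowed, not copied; the caller keeps it alive. */
struct foo *make_string(const char *chars) {
    struct foo_string *s = malloc(sizeof *s);
    if (!s) return NULL;
    s->header.tag = FOO_STRING;
    s->chars = chars;
    return &s->header;
}

/* Pattern match by hand: inspect the tag, then cast to the variant.
   Returns the number, the sum of the pair, or the string length. */
int foo_measure(const struct foo *f) {
    assert(f != NULL);
    switch (f->tag) {
    case FOO_NUM:
        return ((const struct foo_num *)f)->value;
    case FOO_PAIR: {
        const struct foo_pair *p = (const struct foo_pair *)f;
        return p->first + p->second;
    }
    case FOO_STRING:
        return (int)strlen(((const struct foo_string *)f)->chars);
    }
    return -1; /* unknown tag */
}
```

The cost raised in the question is visible here: once a value is held as a generic struct foo pointer, nothing in the types says which variant it is, so that information is only recovered by the tag check at run time.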

> The problem that I see is that any optimizations for which LLVM needs types to perform would basically be prohibited by this technique. So, I guess what I'm asking is, is there a way of representing tagged union types in LLVM that doesn't prevent providing accurate type information?

No, there isn't.

-Chris