Parametric polymorphism

I think the problem is deeper than that, in that LLVM has no official
concept of a subtype, so I don't see how the idea of polymorphism
could be defined in it.

Parametric polymorphism is different from subtype polymorphism; you can
have one without the other. Parametric polymorphism just means that you
can use type variables (like T) in the IR, which are later instantiated
to actual types.

max <T extends Comparable>(T a, T b) {
if (a > b) return a; else return b;
}

Also, "Comparable" implies some kind of function associated with the
type in order to actually perform it...

True, but I'm not worried about that; method tables are easy to add once type
substitutions are allowed. Dropping down to a mythical low-level language
that I call template C, we would have:

  struct ComparableDict<T> {
    bool (*equals)(T a, T b);
    bool (*leq)(T a, T b);
    bool (*geq)(T a, T b);
  };

  T max<T>(ComparableDict<T>* dict, T a, T b) {
    if ((*dict->geq)(a,b)) return a; else return b;
  }

This mechanism mirrors the dictionary-passing used to implement Haskell
type classes. The correct dictionary for any given type is inferred by
the high-level language, but it is passed as an ordinary argument at the
LLVM level.
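
For concreteness, here is a minimal sketch in plain C of what a front
end might emit for T = int under this scheme (the names
ComparableDictInt, int_geq, max_int, etc. are hypothetical):

  #include <stdbool.h>

  /* Hand-instantiated dictionary for T = int. */
  static bool int_equals(int a, int b) { return a == b; }
  static bool int_leq(int a, int b)    { return a <= b; }
  static bool int_geq(int a, int b)    { return a >= b; }

  struct ComparableDictInt {
    bool (*equals)(int a, int b);
    bool (*leq)(int a, int b);
    bool (*geq)(int a, int b);
  };

  static struct ComparableDictInt int_dict = { int_equals, int_leq, int_geq };

  /* max<T> after substituting T = int; a call site passes &int_dict. */
  int max_int(struct ComparableDictInt* dict, int a, int b) {
    if ((*dict->geq)(a, b)) return a; else return b;
  }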

Notice that there is no subtyping. The only things I need from LLVM are:

(1) The ability to define a parameterized type, e.g. ComparableDict<T>.
(2) The ability to define a function that is parameterized by a type,
    e.g. max<T>.
(3) The ability to substitute a type for a type variable, e.g. to
    specialize max to max<int>.

In order to get good performance, you will need to instantiate max with
both a type T and the dictionary for T. In other words, max must be
partially evaluated with respect to a given type -and- a given
dictionary. While this is obviously a complication, partial evaluation
with respect to constant arguments is something that LLVM is already
quite capable of; if there is no pass that does it already, I could
easily write one. The problem is that I have no way of declaring a
parameterized type or a parameterized function within the LLVM IR.
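
To make the intended end state concrete: after partially evaluating
max<int> against the constant dictionary sketched above, constant
propagation and inlining should leave straight-line code with no
indirection at all. A hypothetical residual:

  /* Residual of max<int> after partial evaluation against int_dict:
   * the dictionary load and the indirect call have been folded away,
   * leaving only the inlined body of int_geq. */
  int max_int_specialized(int a, int b) {
    if (a >= b) return a; else return b;
  }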

Note also that when I talk about "types", I am referring to low-level
LLVM types -- scalars, records, pointers, etc. I'm not referring to
complex functional types with pattern matching as found in Haskell, or
to complex OOP classes as found in the JVM or .NET, with constructors,
methods, and all that jazz.

  -DeLesley

I think the problem is deeper than that, in that LLVM has no official
concept of a subtype, so I don't see how the idea of polymorphism
could be defined in it.

Parametric polymorphism is different from subtype polymorphism; you can
have one without the other. Parametric polymorphism just means that you
can use type variables (like T) in the IR, which are later instantiated
to actual types.

I was thinking of the "T extends Comparable" part, which does involve
subtype polymorphism. Apologies if I'm getting terms mixed up.

True, but I'm not worried about that; method tables are easy to add once type
substitutions are allowed. Dropping down to a mythical low-level language
that I call template C, we would have:

struct ComparableDict<T> {
   bool (*equals)(T a, T b);
   bool (*leq)(T a, T b);
   bool (*geq)(T a, T b);
};

T max<T>(ComparableDict<T>* dict, T a, T b) {
   if ((*dict->geq)(a,b)) return a; else return b;
}

What do the parameterized types give you that you don't get from using
opaque instead of T? Type checking has already passed by the time it
reaches this level, and you would simply add the appropriate bitcasts
when generating the functions to be added to the dictionaries. You
could even safely bitcast integral and floating-point values to pass
them through the opaque to avoid boxing.
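
In C terms, the erased scheme described here might look like the
following sketch (all names hypothetical; the opaque type is modeled as
a pointer-sized integer, which is exactly the limitation raised later
in the thread):

  #include <stdbool.h>
  #include <stdint.h>

  /* Erased representation: every T travels as one pointer-sized word. */
  typedef uintptr_t opaque_t;

  struct ComparableDictErased {
    bool (*geq)(opaque_t a, opaque_t b);
  };

  /* Dictionary function for T = int: cast at the boundary. */
  static bool int_geq_erased(opaque_t a, opaque_t b) {
    return (int)a >= (int)b;
  }

  static struct ComparableDictErased int_dict_erased = { int_geq_erased };

  /* One shared, unspecialized max works for every T:
   * e.g. max_erased(&int_dict_erased, (opaque_t)3, (opaque_t)7). */
  opaque_t max_erased(struct ComparableDictErased* dict,
                      opaque_t a, opaque_t b) {
    return (*dict->geq)(a, b) ? a : b;
  }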

In order to get good performance, you will need to instantiate max with
both a type T and the dictionary for T. In other words, max must be
partially evaluated with respect to a given type -and- a given
dictionary. While this is obviously a complication, partial evaluation
with respect to constant arguments is something that LLVM is already
quite capable of; if there is no pass that does it already, I could
easily write one. The problem is that I have no way of declaring a
parameterized type or a parameterized function within the LLVM IR.

I wonder if the "instantiation" could be done instead as a normal pass
taking advantage of a general call-site-context-sensitive
inter-procedural points-to analysis. Getting rid of the indirection
in the generic dictionary seems like the same problem as
devirtualizing method calls, so a unified solution would be nice.

Moreover, specialization should really be done at the codegen level
in order to do it properly. C++ templates are a great example of
why *NOT* to do specialization within the high-level language.

But specialization (in the C++ template sense) is also a great example
of why it's needed in the host language, as is overloading.

The generated code for getSecond() needs to know the size of T in
order to calculate the correct offset into the record. However,
we still don't need to generate a bazillion specialized copies;
we can simply pass the size of T as an argument:

void getSecond(int sizeT, void* pair, void* result) {
   memcpy(result, ((char*) pair) + sizeT, sizeT);
}

Of course, that only works for POD types, so you still need a
different instantiation for std::string, std::vector, ...
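
One hedged way to stretch the size-passing trick to non-POD types is to
pass a copy operation alongside the size -- in effect, a one-entry
dictionary (sketch; all names hypothetical):

  #include <string.h>

  /* For POD types the copy operation is just memcpy; for types with
   * copy constructors the front end supplies the real copy routine. */
  typedef void (*copy_fn)(void* dst, const void* src, int size);

  static void pod_copy(void* dst, const void* src, int size) {
    memcpy(dst, src, (size_t)size);
  }

  void getSecondWithCopy(int sizeT, copy_fn copyT,
                         void* pair, void* result) {
    copyT(result, (char*)pair + sizeT, sizeT);
  }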

There's no possibility of implementing C++'s complicated lookup and
resolution rules down at the LLVM level, and since you need the
instantiations just to parse C++, using it as an example for why LLVM
should do instantiation seems flawed.

Once you restrict yourself to generics, you're only ever passing
pointers around, which, as you said, is the easy case (relatively),
since you don't need the type information at all once past the
front-end.

Thanks for putting up with the newbie,
~ Scott

I was thinking of the "T extends Comparable" part, which does involve
subtype polymorphism. Apologies if I'm getting terms mixed up.

It was a bad example -- not close enough to actual LLVM. :-)

What do the parameterized types give you that you don't get from using
opaque instead of T?

Possibly nothing. I don't really understand the limitations of opaque.
Is it possible to declare a structure type { int, opaque, int }, and then
use getelementptr to retrieve the third element? I'm guessing not,
because there's no way for the code generator to calculate the
correct offset.

If T is a type variable, then the type { int, T, int } should be valid.
Eventually of course, the native code generator will need to know
the size of T, but that shouldn't worry other optimization passes
that are only dealing with the IR.
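
The offset problem is easy to demonstrate in C: the byte offset of the
third field of { int, T, int } depends on both the size and the
alignment of T, so no single getelementptr lowering works for all
instantiations (sketch):

  #include <stddef.h>
  #include <stdio.h>

  /* Two monomorphic instances of { int, T, int }. */
  struct WithChar   { int a; char   t; int b; };
  struct WithDouble { int a; double t; int b; };

  int main(void) {
    /* Typically prints 8 and 16: the third field lands at different
     * offsets, so the code generator must know T to compute it. */
    printf("offsetof(b), T=char:   %zu\n", offsetof(struct WithChar, b));
    printf("offsetof(b), T=double: %zu\n", offsetof(struct WithDouble, b));
    return 0;
  }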

Type checking has already passed by the time it
reaches this level, and you would simply add the appropriate bitcasts
when generating the functions to be added to the dictionaries.

Assuming that all values are 32-bit. Oops -- can't use doubles, long
doubles, or compile to 64-bit architectures. Doh! ;-)

I wonder if the "instantiation" could be done instead as a normal pass

Yes, and it should.

inter-procedural points-to analysis. Getting rid of the indirection
in the generic dictionary seems like the same problem as
devirtualizing method calls, so a unified solution would be nice.

Devirtualization is actually a lot trickier, since it relies on
whole-program analysis and type information from the high-level
language: we need to know that no subclass of C overrides a method
before we can devirtualize calls to it. Getting rid of the dictionary
indirection is a simple matter of constant propagation and inlining; no
type wizardry required. (This is one reason why I'm not a big fan of
OOP type systems.)
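
A self-contained before/after sketch of that claim (hypothetical
names): once the dictionary at a call site is a known constant,
ordinary constant propagation and inlining dissolve the indirection,
with no class-hierarchy analysis anywhere.

  #include <stdbool.h>

  struct Dict { bool (*geq)(int, int); };
  static bool int_geq(int a, int b) { return a >= b; }
  static struct Dict int_dict = { int_geq };

  static int max_generic(struct Dict* d, int a, int b) {
    return (*d->geq)(a, b) ? a : b;
  }

  /* Before: the dictionary argument is a compile-time constant. */
  int use_before(int x, int y) { return max_generic(&int_dict, x, y); }

  /* After constant propagation + inlining: the load of int_dict.geq
   * folds to int_geq, the indirect call becomes direct, and inlining
   * leaves straight-line code. */
  int use_after(int x, int y) { return (x >= y) ? x : y; }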

But specialization (in the C++ template sense) is also a great example
of why it's needed in the host language, as is overloading.

There's no reason why specialization -has- to be done by LLVM.
Idiotic languages like C++ can do it themselves, badly, if they want. ;-)
I want LLVM to do it for me because it can do a better job. ;-)

void getSecond(int sizeT, void* pair, void* result) {
   memcpy(result, ((char*) pair) + sizeT, sizeT);
}

Of course, that only works for POD types, so you still need a
different instantiation for std::string, std::vector, ...

Every LLVM type is a POD type. It's either a scalar, a vector,
an array, a struct, or a pointer. Every one of those types has a
fixed size. The C++ compiler is supposed to translate complicated
thingies like std::string into a POD type that LLVM can understand.
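
For instance, a front end might lower std::string to a flat record like
the following (one plausible layout, not the actual layout of any
particular standard library), with constructors, the destructor, and
methods becoming ordinary functions over it:

  #include <stddef.h>

  /* One plausible lowering of std::string to an LLVM-friendly POD
   * record: a heap pointer plus bookkeeping. */
  struct lowered_string {
    char*  data;      /* heap buffer, managed by lowered ctor/dtor calls */
    size_t length;
    size_t capacity;
  };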

There's no possibility of implementing C++'s complicated lookup and
resolution rules down at the LLVM level,

Why on earth would we want to do anything like C++ lookup? I
tried writing a C++ parser once, and I think the Obama administration
could easily use it as a Geneva-Convention-friendly alternative to
waterboarding suspected terrorists. :-)

Once you restrict yourself to generics, you're only ever passing
pointers around, which, as you said, is the easy case (relatively),
since you don't need the type information at all once past the
front-end.

Yes, and Java generics are dog-slow because they can only use
pointers. Try implementing a generic complex number class in Java,
and watch the two-order-of-magnitude drop in performance on scientific
code.
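
The cost gap is easy to see in C terms (sketch): a specialized complex
multiply works on two unboxed doubles in registers, while pointer-only
generics force an allocation and a pointer chase per value.

  #include <stdlib.h>

  struct Complex { double re, im; };

  /* Unboxed: what a properly specialized generic can compile to. */
  struct Complex cmul(struct Complex a, struct Complex b) {
    struct Complex r = { a.re * b.re - a.im * b.im,
                         a.re * b.im + a.im * b.re };
    return r;
  }

  /* Boxed: what erased, pointer-only generics force -- every operand
   * lives behind a pointer and the result must be heap-allocated. */
  struct Complex* cmul_boxed(const struct Complex* a,
                             const struct Complex* b) {
    struct Complex* r = malloc(sizeof *r);
    r->re = a->re * b->re - a->im * b->im;
    r->im = a->re * b->im + a->im * b->re;
    return r;
  }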

  -DeLesley

Amen. I haven't proven it with a working HLVM yet but I believe LLVM will make
it possible (even easy?) to generate extremely performant code from heavily
abstracted high-level source.

Complex numbers are a great example where the JVM is terrible and .NET is much
better but still many times slower than necessary.

Try implementing a generic complex number class in Java, and watch the
two-order-of-magnitude drop in performance on scientific code.

Amen. I haven't proven it with a working HLVM yet but I believe LLVM will make
it possible (even easy?) to generate extremely performant code from heavily
abstracted high-level source.

Partial evaluation is a fantastic way to completely eliminate the overhead
of using very abstract, high-level interfaces. Every computation on static
data can be eliminated by a partial evaluator, and in most cases, all of
the overhead of a high-level interface involves information that is statically
known.
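
The textbook example: partially evaluating a power function with
respect to a statically known exponent leaves a residual program with
no loop, no test, and no exponent at all (sketch):

  /* General version: both arguments are dynamic. */
  double power(double x, int n) {
    double r = 1.0;
    while (n-- > 0) r *= x;
    return r;
  }

  /* Residual after partially evaluating power with respect to n = 3:
   * the loop over the static data has been computed away. */
  double power_3(double x) {
    return x * x * x;
  }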

Partial evaluation is also old technology and very simple to implement;
I am continually amazed that it hasn't been more widely applied.
I think the biggest issue is that partial evaluation works best for programs
that are free of side-effects. Most imperative programs do a lot of their
computation by mutating the heap, and heap mutations are a lot harder
to track.

In any case, combining partial evaluation in an HLVM with the
optimization passes provided by LLVM should yield blazingly fast code.
The goal of making Java or C# "as fast as C" has always seemed somewhat
uninspiring to me. C is actually rather hard to optimize; a high-level
DSL should easily be able to outperform C on any task within its
particular domain.

  -DeLesley