Type strengthening and type weakening

Has anyone done any experiments with regards to type strengthening or weakening in the context of LLVM?

For example, the GWT compiler does type strengthening - that is, if you are calling a method on an interface or abstract type, and the compiler determines through live variable analysis what the concrete type is, then it goes ahead and re-writes the type information to be the stronger type. The advantage is that it may then be able to do additional optimizations, such as inlining the method or avoiding indirect dispatch.

Conversely, Java's "type erasure" is an extreme case of type weakening - that is, if you have a bunch of functions which operate on similar types (which might be produced through something like C++ template instantiation), you can throw away all of the information about those types that aren't relevant to those functions, and you may find as a result that many of them become identical and can be merged into a single function.

In the context of LLVM, it would be interesting to have a pass that could go through a function and transform the types of the input arguments to be opaque except where the function actually needs the type information - so for example if a pointer argument is never dereferenced it can be converted into a void *.

Similarly, a pass that would generate a hash value for a function, and then see if functions with identical hashes could be merged.

The hardest part would be doing this in a way that preserves the ability to debug the code.

-- Talin

Talin wrote:

For example, the GWT compiler does type strengthening - that is, if you
are calling a method on an interface or abstract type, and the compiler
determines through live variable analysis what the concrete type is,
then it goes ahead and re-writes the type information to be the stronger
type. The advantage is that it may then be able to do additional
optimizations, such as inlining the method or avoiding indirect dispatch.

Conversely, Java's "type erasure" is an extreme case of type weakening -
that is, if you have a bunch of functions which operate on similar types
(which might be produced through something like C++ template
instantiation), you can throw away all of the information about those
types that aren't relevant to those functions, and you may find as a
result that many of them become identical and can be merged into a
single function.

In the context of LLVM, it would be interesting to have a pass that
could go through a function and transform the types of the input
arguments to be opaque except where the function actually needs the type
information - so for example if a pointer argument is never dereferenced
it can be converted into a void *.
  

Given that LLVM doesn't have much information about the relationships
between types, would this be worth it? Such an optimization should IMO
be done by the front-end, which has higher-level information about
types, such as class hierarchies.

Sebastian

Has anyone done any experiments with regards to type strengthening or
weakening in the context of LLVM?

For example, the GWT compiler does type strengthening - that is, if you
are calling a method on an interface or abstract type, and the compiler
determines through live variable analysis what the concrete type is,
then it goes ahead and re-writes the type information to be the stronger
type. The advantage is that it may then be able to do additional
optimizations, such as inlining the method or avoiding indirect dispatch.

Data Structure Analysis (DSA), available through the llvm-poolalloc download, attempts to infer subsets of pointers that are always used in a type-consistent manner. More info is here:

  http://llvm.org/pubs/2007-06-10-PLDI-DSA.html

This can be used to perform the above kind of optimization and several others. DSA is flow-insensitive so it could be strengthened further by flow-sensitive analyses built on top of it.

And on Sep 16, 2009, at 2:12 AM, Sebastian Redl wrote:

Given that LLVM doesn't have much information about the relationships
between types, would this be worth it? Such an optimization should IMO
be done by the front-end, which has higher-level information about
types, such as class hierarchies.

Optimizations that require information in the front-end should be done there. But there are some that can be done effectively at the LLVM level, e.g., see:

  http://llvm.org/pubs/2005-05-21-PLDI-PoolAlloc.html

--Vikram
Associate Professor, Computer Science
University of Illinois at Urbana-Champaign
http://llvm.org/~vadve

Getting this information into the IR is exactly what the extensible metadata proposal is all about :slight_smile:
http://nondot.org/sabre/LLVMNotes/ExtensibleMetadata.txt

-Chris