When implementing dynamic languages, we often have to box values.
For instance, take the following expression:
IntObj c = sqrt((a*a)+(b*b));
Here, most likely, a bytecode interpreter would execute this as
"mul_ints", "add_ints", "sqrt", etc. Inside each of these primitive
functions we would have to unwrap our IntObj operands, compute on the
raw values, allocate a new object, and return that to the caller. In
the above example, we could expect around 4 allocations and 7 unboxing
operations (two per multiply or add, one for the sqrt).
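To make the counting concrete, here is a rough sketch in C of how those primitives might look. The IntObj layout, the box_int/unbox_int helpers, the two counters and the integer square root are all assumptions made up for illustration, not taken from any particular interpreter.

```c
#include <stdlib.h>

/* Hypothetical boxed integer; the layout is an assumption. */
typedef struct IntObj {
    long value;
} IntObj;

int alloc_count = 0;  /* number of heap allocations performed */
int unbox_count = 0;  /* number of unboxing operations performed */

/* Box a raw integer into a freshly allocated object. */
IntObj *box_int(long v) {
    IntObj *o = malloc(sizeof *o);
    if (o == NULL) abort();
    o->value = v;
    alloc_count++;
    return o;
}

/* Unbox: read the raw integer back out of the object. */
long unbox_int(const IntObj *o) {
    unbox_count++;
    return o->value;
}

/* Integer square root by linear search (kept simple on purpose). */
long isqrt(long n) {
    long r = 0;
    while ((r + 1) * (r + 1) <= n) r++;
    return r;
}

/* The primitives: unbox the operands, compute, box the result. */
IntObj *mul_ints(const IntObj *x, const IntObj *y) {
    return box_int(unbox_int(x) * unbox_int(y));
}

IntObj *add_ints(const IntObj *x, const IntObj *y) {
    return box_int(unbox_int(x) + unbox_int(y));
}

IntObj *sqrt_int(const IntObj *x) {
    return box_int(isqrt(unbox_int(x)));
}

/* c = sqrt((a*a)+(b*b)), executed one primitive at a time. */
IntObj *hypot_boxed(const IntObj *a, const IntObj *b) {
    IntObj *aa = mul_ints(a, a);     /* 2 unboxings, 1 allocation */
    IntObj *bb = mul_ints(b, b);     /* 2 unboxings, 1 allocation */
    IntObj *sum = add_ints(aa, bb);  /* 2 unboxings, 1 allocation */
    IntObj *c = sqrt_int(sum);       /* 1 unboxing,  1 allocation */
    free(aa);
    free(bb);
    free(sum);
    return c;
}
```

Calling hypot_boxed on boxed 3 and 4 bumps the counters by 4 allocations and 7 unboxings, and three of those four objects are temporaries that die before the function returns.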
Now, granted, if my language is running as a bytecode interpreter, I can
speed it up simply by having LLVM call my functions in order, and
perhaps even inlining all the bytecode operations into a single
function. But even then, I'm still left with the 4 allocations and 7
unboxings (is that even a word?).
I know other compiler projects, such as PyPy, have allocation removal,
where the optimization passes see that we only use the result of an
allocation a single time and eliminate it. Thinking that LLVM may do
this as well, I tried this simple test on the in-browser LLVM compiler:
LLVM cannot remove the malloc calls, as malloc() has a side effect and
removing them would change the behaviour of the program.
Apart from that, the problem with unboxing in dynamic languages is knowing
beforehand which function to dispatch to: mul_ints or mul_floats, for
example? What if a particular type has overridden the + operator?
So your code normally ends up bouncing through several functions, making
analysis difficult.
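As a rough illustration of the dispatch problem, here is a minimal sketch in C, assuming a made-up object model in which every object points at a type record holding its operator implementations. The Obj, Type, INT_TYPE, FLOAT_TYPE, new_int, new_float and mul names are hypothetical.

```c
#include <stdlib.h>

typedef struct Obj Obj;
typedef Obj *(*BinOp)(const Obj *, const Obj *);

/* Hypothetical per-type record; a type "overrides" an operator by
   installing its own function in the slot. */
typedef struct Type {
    const char *name;
    BinOp mul;
} Type;

struct Obj {
    const Type *type;
    union { long i; double f; } as;
};

Obj *mul_ints(const Obj *x, const Obj *y);
Obj *mul_floats(const Obj *x, const Obj *y);

const Type INT_TYPE = { "int", mul_ints };
const Type FLOAT_TYPE = { "float", mul_floats };

/* Allocate an object of the given type; the payload is set by the caller. */
Obj *new_obj(const Type *type) {
    Obj *o = malloc(sizeof *o);
    if (o == NULL) abort();
    o->type = type;
    return o;
}

Obj *new_int(long v) {
    Obj *o = new_obj(&INT_TYPE);
    o->as.i = v;
    return o;
}

Obj *new_float(double v) {
    Obj *o = new_obj(&FLOAT_TYPE);
    o->as.f = v;
    return o;
}

Obj *mul_ints(const Obj *x, const Obj *y) {
    return new_int(x->as.i * y->as.i);
}

Obj *mul_floats(const Obj *x, const Obj *y) {
    return new_float(x->as.f * y->as.f);
}

/* Generic multiply: the callee is only known once x->type has been
   loaded at run time. This sketch assumes both operands share a type. */
Obj *mul(const Obj *x, const Obj *y) {
    return x->type->mul(x, y);
}
```

From the optimizer's point of view, mul is an indirect call through a pointer loaded from memory, so unless it can prove which Type an object carries, it cannot see through to the malloc and the field reads inside mul_ints or mul_floats.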