Box removal

When implementing dynamic languages, we often have to box values.

For instance, take the following expression:

IntObj c = sqrt((a*a)+(b*b));

Here, most likely, a bytecode interpreter would execute this as
"mul_ints", "add_ints", "sqrt", etc. Inside these primitive functions
we would have to unwrap our IntObj types, operate on the raw values,
allocate a new object and return that to the caller. In the above
example, we could probably expect around 4 allocations and 7 unboxing
operations. Now granted, if my language is running as a bytecode
interpreter, I can speed it up simply by having LLVM call my functions
in order, and perhaps even inlining all the bytecode operations into a
single function. But even then, I'm still left with the 4 allocations
and 7 unboxings (is that even a word?).
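
To make the counts concrete, here is a minimal sketch of the expression
above in C. Only the name IntObj and the primitive names mul_ints and
add_ints come from the text; the struct layout, the box/unbox helpers,
sqrt_int, hypot_boxed and the two counters are assumptions made up for
illustration, not how any particular interpreter does it.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical boxed integer; the layout is an assumption. */
typedef struct IntObj { long value; } IntObj;

/* Counters that make the cost of boxing visible. */
int allocations = 0;
int unboxings = 0;

IntObj *box(long v) {
    IntObj *o = malloc(sizeof *o);
    assert(o != NULL);
    o->value = v;
    allocations++;
    return o;
}

long unbox(const IntObj *o) {
    unboxings++;
    return o->value;
}

/* Each primitive unboxes its operands and allocates a fresh result. */
IntObj *mul_ints(const IntObj *x, const IntObj *y) {
    return box(unbox(x) * unbox(y));
}

IntObj *add_ints(const IntObj *x, const IntObj *y) {
    return box(unbox(x) + unbox(y));
}

/* Integer square root by linear search; fine for a sketch. */
IntObj *sqrt_int(const IntObj *x) {
    long n = unbox(x);
    long r = 0;
    while ((r + 1) * (r + 1) <= n)
        r++;
    return box(r);
}

/* c = sqrt((a*a)+(b*b)), executed the way the interpreter would. */
long hypot_boxed(long a_val, long b_val) {
    IntObj *a = box(a_val);
    IntObj *b = box(b_val);
    allocations = 0;   /* count only the work done by the expression */
    unboxings = 0;

    IntObj *aa  = mul_ints(a, a);     /* 2 unboxings, 1 allocation */
    IntObj *bb  = mul_ints(b, b);     /* 2 unboxings, 1 allocation */
    IntObj *sum = add_ints(aa, bb);   /* 2 unboxings, 1 allocation */
    IntObj *c   = sqrt_int(sum);      /* 1 unboxing,  1 allocation */

    long result = c->value;  /* read directly so the counter is untouched */
    free(a); free(b); free(aa); free(bb); free(sum); free(c);
    return result;
}
```

After calling hypot_boxed(3, 4), the counters hold the 4 allocations
and 7 unboxings mentioned above. Three of the four boxes (aa, bb and
sum) are temporaries that are consumed once and never seen again, which
is exactly what an allocation-removal pass would want to get rid of.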

I know other compiler projects, such as PyPy, have allocation removal,
where the optimization passes see that the result of an allocation is
only used a single time. Thinking that LLVM may do this as well, I
tried this simple test on the in-browser LLVM compiler:

Hi Timothy,

LLVM cannot remove the malloc calls, as malloc() has a side effect and
removing them would change the behaviour of the program.

Apart from that, the problem with unboxing in dynamic languages is knowing
beforehand which function to dispatch to. mul_ints or mul_floats, for
example? What if a particular type has overridden the + operator, and so on?
So your code normally ends up bouncing through several functions making
analysis difficult.
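
As a rough illustration of the dispatch problem, here is a minimal
sketch in C. The Value type, its tags, and the make_int, make_float and
mul helpers are hypothetical; only the names mul_ints and mul_floats
come from the paragraph above. The point is that the callee of mul() is
only known once the tag has been inspected at run time.

```c
#include <assert.h>

/* Hypothetical tagged value for a dynamic language. */
typedef enum { TAG_INT, TAG_FLOAT } Tag;

typedef struct Value {
    Tag tag;
    union { long i; double f; } as;
} Value;

Value make_int(long i) {
    Value v;
    v.tag = TAG_INT;
    v.as.i = i;
    return v;
}

Value make_float(double f) {
    Value v;
    v.tag = TAG_FLOAT;
    v.as.f = f;
    return v;
}

Value mul_ints(Value x, Value y) {
    return make_int(x.as.i * y.as.i);
}

Value mul_floats(Value x, Value y) {
    return make_float(x.as.f * y.as.f);
}

/* Generic multiply: the target is chosen by a run-time tag check. */
Value mul(Value x, Value y) {
    assert(x.tag == y.tag);  /* sketch only: no mixed-type coercion */
    switch (x.tag) {
    case TAG_INT:
        return mul_ints(x, y);
    case TAG_FLOAT:
        return mul_floats(x, y);
    }
    assert(0 && "unknown tag");
    return x;
}
```

A real implementation would typically add more layers on top of this
(user-defined operator overrides, coercions between types), each one
another indirection between the call site and the arithmetic.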

James

Wrong; LLVM can and will remove calls to malloc(). There isn't any
way for a program to observe whether a particular malloc() call runs.

-Eli

There's no malloc remover, but there is a malloc/free remover. If you
fix the C code:

#include <stdio.h>
#include <stdlib.h>

typedef struct Foo
{
    int *x;
    int x2;
} Foo;

int main(int argc, char **argv) {
    Foo *f = (Foo *)malloc(sizeof(Foo));
    f->x = (int *)malloc(sizeof(int));
    *f->x = 10;
    int i = *f->x;
    free(f->x);
    free(f);
    return i;
}

Then it compiles exactly the way you want it to:

; ModuleID = '/tmp/webcompile/_26915_0.bc'
target datalayout = "e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v64:64:64-v128:128:128-a0:0:64-s0:64:64-f80:128:128-n8:16:32:64"
target triple = "x86_64-linux-gnu"

define i32 @main(i32 %argc, i8** nocapture %argv) nounwind readnone {
entry:
  ret i32 10
}

~ Scott