Accounting for code size

In my quest to account for memory, I've now come to the in-memory IR,
and the generated code. I want to book the generated code memory
against the agent that is generating the code.

I see that LLVM's Function class [1] has a size function; what does
this represent and can I use it to account for the space used by the
in-memory IR?

As for generated code, the JIT [2] class simply returns a "void *"
when generating code for that function, with no size information that
I can see.

MachineCodeEmitter [3] is obtainable from the JIT class, and it
contains "BufferBegin", "BufferEnd", and "CurBufferPtr" as protected
members; if they were available, determining the generated code size
might be possible via some pointer arithmetic. Is there another way I
might be missing? Or would I have to subclass JITEmitter and add a
method to perform this calculation?

Finally, can I free the memory used by the in-memory IR after the code
is generated?

Sandro

[1] http://llvm.org/doxygen/classllvm_1_1Function.html#a27
[2] http://llvm.org/doxygen/classllvm_1_1JIT.html
[3] http://llvm.org/doxygen/classllvm_1_1MachineCodeEmitter.html

> In my quest to account for memory, I've now come to the in-memory IR,
> and the generated code. I want to book the generated code memory
> against the agent that is generating the code.
>
> I see that LLVM's Function class [1] has a size function; what does
> this represent and can I use it to account for the space used by the
> in-memory IR?

It returns the number of basic blocks in the function. Likewise, size() on a basic block returns the number of instructions in it. Neither is a size in bytes.
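As a rough proxy for IR size, the two counts can be combined into a total instruction count. The following is only a minimal sketch: countInstructions, MockBlock and MockFunction are hypothetical names, and the mock types are plain vectors standing in for a function and its basic blocks. It assumes nothing more than a range of blocks where each block has size(), and it yields a count, not bytes.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins: a "function" is a sequence of "blocks", and each
// block is a sequence of instructions (plain ints here, not LLVM types).
using MockBlock = std::vector<int>;
using MockFunction = std::vector<MockBlock>;

// Sum the per-block instruction counts. Works for any range of blocks
// where each block exposes size(). The result is a count, not bytes.
template <typename FunctionLike>
std::size_t countInstructions(const FunctionLike &fn) {
    std::size_t total = 0;
    for (const auto &block : fn)
        total += block.size();
    return total;
}

// Small usage check with the mock types.
inline std::size_t demoCount() {
    MockFunction fn = {{1, 2, 3}, {4}, {}};
    assert(fn.size() == 3);        // three blocks, analogous to Function::size()
    return countInstructions(fn);  // 3 + 1 + 0 instructions
}
```

Turning such a count into bytes would need a per-instruction estimate, which this sketch does not attempt.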

> As for generated code, the JIT [2] class simply returns a "void *"
> when generating code for that function, with no size information that
> I can see.

Right. If you pass -debug-only=jit to lli (before the bc file) it will print out the size of each function as it is compiled.

> MachineCodeEmitter [3] is obtainable from the JIT class, and it
> contains "BufferBegin", "BufferEnd", and "CurBufferPtr" as protected
> members; if they were available, determining the generated code size
> might be possible via some pointer arithmetic. Is there another way I
> might be missing? Or would I have to subclass JITEmitter and add a
> method to perform this calculation?

You could do that; it will tell you the total size of all the machine code JIT'd so far, not the size of a single function.

> Finally, can I free the memory used by the in-memory IR after the code
> is generated?

Yes, you can call F->deleteBody() after JIT'ing F.

-Chris

> > MachineCodeEmitter [3] is obtainable from the JIT class, and it
> > contains "BufferBegin", "BufferEnd", and "CurBufferPtr" as protected
> > members; if they were available, determining the generated code size
> > might be possible via some pointer arithmetic. Is there another way I
> > might be missing? Or would I have to subclass JITEmitter and add a
> > method to perform this calculation?

> You could do that, it will tell you the size of all of the machine
> code jit'd.

Right, unless I do it before and after jitting the new function and
subtract them, in which case I'll get the size of the new function
which I can then book to the agent.
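A minimal sketch of that before/after subtraction, under stated assumptions: MockEmitter is a hypothetical stand-in that only borrows the three protected member names mentioned above, and SizedEmitter, bytesEmitted, CodeLedger and book are made-up names, not LLVM API. It shows a subclass exposing the pointer arithmetic and a ledger charging the delta to an agent.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>

// Hypothetical emitter with a protected output buffer (not the LLVM class).
class MockEmitter {
public:
    MockEmitter(std::uint8_t *begin, std::uint8_t *end)
        : BufferBegin(begin), BufferEnd(end), CurBufferPtr(begin) {
        assert(begin <= end);
    }
    virtual ~MockEmitter() = default;

    // Append one byte if there is room (stands in for real code emission).
    void emitByte(std::uint8_t b) {
        if (CurBufferPtr != BufferEnd)
            *CurBufferPtr++ = b;
    }

protected:
    std::uint8_t *BufferBegin;
    std::uint8_t *BufferEnd;
    std::uint8_t *CurBufferPtr;
};

// Subclass that turns the protected pointers into a public size query.
class SizedEmitter : public MockEmitter {
public:
    using MockEmitter::MockEmitter;
    std::size_t bytesEmitted() const {
        return static_cast<std::size_t>(CurBufferPtr - BufferBegin);
    }
};

// Per-agent ledger: measure before and after a compile step, book the delta.
class CodeLedger {
public:
    template <typename Compile>
    std::size_t book(const std::string &agent, const SizedEmitter &emitter,
                     Compile compile) {
        const std::size_t before = emitter.bytesEmitted();
        compile();  // whatever triggers emission of the new function
        const std::size_t delta = emitter.bytesEmitted() - before;
        charged_[agent] += delta;
        return delta;
    }

    std::size_t charged(const std::string &agent) const {
        auto it = charged_.find(agent);
        return it == charged_.end() ? 0 : it->second;
    }

private:
    std::map<std::string, std::size_t> charged_;
};
```

The delta is only attributable to one function if nothing else is emitted between the two measurements, so this assumes a single compile step runs at a time.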

> > Finally, can I free the memory used by the in-memory IR after the code
> > is generated?

> Yes, you can call F->deleteBody() after JIT'ing F.

Sorry, I meant to ask whether it's still necessary to keep F around,
i.e. in order to delete the generated code later. Is there a standard
approach to garbage collecting code in LLVM?

Sandro

Do you mean the machine code in the JIT buffer, or the LLVM IR itself?

-Chris

Assuming I don't need to keep around the IR version of a function,
then only the machine code.

Sandro

Use ExecutionEngine::freeMachineCodeForFunction to deallocate code from the JIT buffer.

-Chris

Right, so I need to identify the unused functions myself in some way,
and LLVM has no facility for querying which code is still referenced
or in use.
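One way to do that bookkeeping on the application side is plain reference counting. This is only a sketch under assumptions: CodeTracker, retain and drop are hypothetical names, functions are identified by string, and the release callback is where a call such as ExecutionEngine::freeMachineCodeForFunction would presumably go.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical tracker: the application records who still uses each
// compiled function and triggers a release once nobody does.
class CodeTracker {
public:
    using Release = std::function<void(const std::string &)>;

    explicit CodeTracker(Release release) : release_(std::move(release)) {}

    // Record one more user of the named function.
    void retain(const std::string &fn) { ++refs_[fn]; }

    // Drop one reference; returns true if the code was released.
    bool drop(const std::string &fn) {
        auto it = refs_.find(fn);
        if (it == refs_.end())
            return false;  // unknown or already released
        assert(it->second > 0);
        if (--it->second > 0)
            return false;  // still in use elsewhere
        refs_.erase(it);
        release_(fn);      // e.g. free the machine code here
        return true;
    }

    // Number of functions that still have at least one user.
    std::size_t liveCount() const { return refs_.size(); }

private:
    Release release_;
    std::unordered_map<std::string, unsigned> refs_;
};
```

Reference counting does not handle cycles between functions that call each other, so a tracing scheme over the application's own roots may fit better in that case.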

Sandro