ocaml+llvm

I’m hacking on an llvm backend for the ocaml language.

http://caml.inria.fr/ocaml/

I’d like to solicit some advice regarding the constant data
structures that ocaml’s runtime requires. Rewriting its runtime is
a non-goal.

The biggest problem is a data structure called the frame table, a
simple structure for which LLVM seems ill-prepared. For each call
site in the program, ocaml emits an entry into this table:

key : the return address of the call site
value : the stack offset of every variable live after return

The garbage collector uses this when walking the stack to find
live objects.

– frame table example –
This program will create 3 call sites. 2 are not interesting, but
the other will have 1 live root:

example.ml:

let keeplive x = “”;;

let heapobject = "hello " ^ “world” in
print_endline heapobject;
keeplive heapobject

The interesting call is print_endline heapobject. After the return
of this call, the root heapobject is still live, so the collector
must trace it. To instruct the runtime to do so, the ocaml
compiler emits the following assembly:

from the program text in example.s:

movl %eax, 0(%esp) ; save heapobject at 0(%esp)
call _camlPervasives__print_endline_298
L103: ; mark call’s return address

from the frame table in example.s:
.long L103 ; “entry is for return address L103”
.word 16 ; “entry is 16 bytes in length”
.word 1 ; “1 gcroot follows”
.word 0 ; “find a gcroot at 0(%esp)”

– end –

The major challenges posed by the frame table are:

  1. Making a constant reference to a return address.
  • Labels are not global values, so ocamlopt’s approach is a
    no-go.
  • Is llvm.pcmarker in any way useful? It couldn’t present the
    correct address for a caller-save calling convention.
  • A constant expression like OFFSET + (intptr_t) FUNCTION can
    get the job done, but this requires instruction size
    computations which I can only find for ARM (in support of
    the constant island pass).
  1. Computing the static stack offset of the GC roots.
  • I’m not yet sure whether this is trivial or not.

Additionally:

  1. Identifying the roots live after each call site.
  • Trivially correct: include all llvm.gcroot’d allocas in the
    function.
  • Harder: analyze load/store dominance information vs. the
    call site to eliminate those roots which are not live.
  1. Identifying all return addresses.
  • I think trivial on the MachineFunction representation, but
    is it possible to relate the machine instructions back to
    the LLVM IR, where the liveness analysis must be performed?
  1. Update the frame table constant with information only
    available during codegen.
  • I’m thinking I should populate an on-the-side data structure
    during compilation, and only convert it to a Constant* and
    emit it during an epilogue pass using
    AsmPrinter::EmitGlobalConstant.

That’s the doozy. Unfortunately, I think the runtime requirements
will preclude the use of the basic LLVM toolchain in conjunction
with LLVM ocamlopt, which is unfortunate.

The simpler problem is that each ocaml object exports symbols
bracketing its code and data. An ocaml object absent these symbols
cannot be linked into an executable.

prologue from example.s:
.data
.globl _camlExample__data_begin
_camlExample__data_begin:
.text
.globl _camlExample__code_begin
_camlExample__code_begin:
.data
; data follows

.text
; code follows

epilogue from example.s:
.text
.globl _camlExample__code_end
_camlExample__code_end:
.data
.globl _camlExample__data_end
_camlExample__data_end:
.long 0

I think I should simply write prologue- and epilogue emitter
passes for this boilerplate. That avoids attempting to model these
quirky symbols in the LLVM IR, but again breaks llc compatibility.
Is there a better way?

Thanks,

Gordon

example.ml (112 Bytes)

example.s (1.46 KB)