Byte code libraries and linking

As I have explained in another thread, I am in the process of
porting portions of newlib to LLVM. The target system has no
operating system and a custom processor which may be changed from
compilation to compilation.

We intend to use LLVM as the front end and generate target specific
code from LLVM byte code. For various reasons (whole program
optimization being one of them), it would seem to make sense to
do most of the linking at the byte code level.

I successfully built a byte code version of newlib, so I tried to
use it like this:

  $ llvm-link hello.bc $NEWLIB/libc.a -o linked.bc

but apparently llvm-link only understands byte code files.

Does it even make sense to try to use llvm-link as a ld replacement
this way? Or am I missing something else that could be used?

The other option is of course to link all of libc into one big
byte code file, and use llvm-link like this:

  $ llvm-link hello.bc $NEWLIB/libc.bc -o linked.bc

The only concern here is that this brings in all of libc, but I
suppose it should be easy enough to run dead code elimination on
linked.bc to shake out the unused bits?

To follow up on my own post:

> The only concern here is that this brings in all of libc, but I
> suppose it should be easy enough to run dead code elimination on
> linked.bc to shake out the unused bits?

Cursory browsing of the LLVM optimization passes did not turn up
anything directly applicable; at least "opt -adce" and "opt -globaldce"
did not reduce the size of the byte code file much.

What I am looking for is a transformation that would take a
starting point in a byte code file (e.g. %main()), determine
the transitive closure of the functions and globals that can be reached
from there, and drop everything else. In the Lisp world it would
be called a tree shaker; I don't know what you guys would call it.
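
To make the idea concrete, here is a minimal sketch of that reachability
computation in Python. It is a toy model, not an LLVM pass: the module is
represented as a plain dictionary from symbol name to the symbols it
references, and the symbol names and the helper names (reachable,
tree_shake) are hypothetical, chosen only for illustration.

```python
from collections import deque

def reachable(refs, root):
    """Return every symbol reachable from `root` by following references."""
    seen = {root}
    work = deque([root])
    while work:
        sym = work.popleft()
        for target in refs.get(sym, ()):
            if target not in seen:
                seen.add(target)
                work.append(target)
    return seen

def tree_shake(refs, root="main"):
    """Keep only the symbols in the transitive closure of `root`."""
    keep = reachable(refs, root)
    return {sym: list(targets) for sym, targets in refs.items() if sym in keep}

# Hypothetical module: symbol -> functions and globals it references.
refs = {
    "main": ["printf"],
    "printf": ["vfprintf", "stdout"],
    "vfprintf": ["write"],
    "write": [],
    "stdout": [],
    "qsort": ["memcpy"],   # never referenced from main
    "memcpy": [],
}

shaken = tree_shake(refs)
print(sorted(shaken))  # ['main', 'printf', 'stdout', 'vfprintf', 'write']
```

In this toy module, qsort and memcpy are dropped because nothing
reachable from main refers to them.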

If such a pass does not exist, it would seem like a perfect candidate
for playing around with pass writing, so I would be happy to
implement it if need be.

Try running opt -internalize -globaldce.

In particular, the internalize pass tells LLVM that you have the "whole program" (i.e. it can mark every function other than main as static). This allows the globaldce pass to be more aggressive, since functions that are no longer externally visible and are not referenced can be deleted.
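
As a rough illustration of why the order matters, here is a toy Python model
(not LLVM's implementation). It assumes that a global dead code eliminator
must treat every externally visible symbol as a root, and that internalizing
makes everything except main internal. The symbol names, the
"external"/"internal" strings, and the helper functions are hypothetical.

```python
def global_dce(refs, linkage):
    """Return the symbols that survive: external ones plus all they reach."""
    roots = [sym for sym in refs if linkage[sym] == "external"]
    keep = set(roots)
    stack = list(roots)
    while stack:
        sym = stack.pop()
        for target in refs[sym]:
            if target not in keep:
                keep.add(target)
                stack.append(target)
    return keep

def internalize(linkage, preserve=("main",)):
    """Mark every symbol internal except the ones listed in `preserve`."""
    return {sym: ("external" if sym in preserve else "internal")
            for sym in linkage}

# Hypothetical linked module: symbol -> symbols it references.
refs = {
    "main": ["puts"],
    "puts": ["write"],
    "write": [],
    "qsort": ["memcpy"],
    "memcpy": [],
}
# Straight out of the linker, every symbol is externally visible.
linkage = {sym: "external" for sym in refs}

before = global_dce(refs, linkage)              # nothing can be removed
after = global_dce(refs, internalize(linkage))  # only main is a root

print(sorted(before))  # ['main', 'memcpy', 'puts', 'qsort', 'write']
print(sorted(after))   # ['main', 'puts', 'write']
```

Without the internalize step every symbol is a root, which matches the
observation that globaldce on its own barely shrank the file.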

-Chris

> I successfully built a byte code version of newlib, so I tried to
> use it like this:
>
>   $ llvm-link hello.bc $NEWLIB/libc.a -o linked.bc
>
> but apparently llvm-link only understands byte code files.

Right, llvm-link is the low-level bytecode linking interface; it is not really useful for this sort of thing.

> Does it even make sense to try to use llvm-link as a ld replacement
> this way? Or am I missing something else that could be used?

The 'gccld' tool understands archives. Try:

  $ gccld hello.bc .../libc.a -o linked.bc -link-as-library

"-link-as-library" causes it to emit a .bc file instead of trying to link an app.

-Chris

Chris Lattner wrote:

> The 'gccld' tool understands archives.

I knew it would be something simple. Tried it and it does just
what I need. Thanks a bunch!