A few days ago we were discussing object files and linking on IRC. I
had been thinking about working on this for a while, and this
discussion finally got me to do it.
Attached are patches of a preliminary implementation of a generic
object file library, and a few changes to llvm-nm to make use of it.
The main goal in the design of the API is to allow a fast
implementation by avoiding memory allocation and repeated or unneeded
parsing of the object file. This is currently achieved in part by
allowing symbol iteration to be a simple pointer increment. Most
object file formats also support random access to the symbol and
section tables.
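To make the pointer-increment idea concrete, here is a minimal sketch. It
is not taken from the attached patches: RawSymbol, SymbolCursor,
SymbolTableView and sumValues are hypothetical names, and the fixed-size
record layout is an assumption chosen only for illustration. The view
borrows memory that is already loaded, so iterating allocates nothing and
indexing gives random access.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical fixed-size symbol record (assumed layout, illustration only).
struct RawSymbol {
  uint32_t NameOffset; // offset of the NUL-terminated name in the string table
  uint32_t Section;    // index of the section the symbol belongs to
  uint64_t Value;      // symbol value or address
};

// Hypothetical cursor: just a pointer into the already-loaded symbol table.
class SymbolCursor {
  const RawSymbol *Ptr;

public:
  explicit SymbolCursor(const RawSymbol *P) : Ptr(P) {}
  // Moving to the next symbol is a plain pointer increment.
  SymbolCursor &operator++() { ++Ptr; return *this; }
  bool operator!=(const SymbolCursor &Other) const { return Ptr != Other.Ptr; }
  const RawSymbol &operator*() const { return *Ptr; }
};

// Hypothetical non-owning view over a symbol table and its string table.
class SymbolTableView {
  const RawSymbol *First;
  size_t Count;
  const char *Strings;

public:
  SymbolTableView(const RawSymbol *F, size_t N, const char *S)
      : First(F), Count(N), Strings(S) {}
  SymbolCursor begin() const { return SymbolCursor(First); }
  SymbolCursor end() const { return SymbolCursor(First + Count); }
  // Random access: index directly into the table, no parsing pass needed.
  const RawSymbol &operator[](size_t I) const {
    assert(I < Count && "symbol index out of range");
    return First[I];
  }
  // Names are looked up lazily in the string table, never copied.
  const char *name(const RawSymbol &S) const { return Strings + S.NameOffset; }
};

// Example traversal: sums the Value field of every symbol in the view.
inline uint64_t sumValues(const SymbolTableView &Table) {
  uint64_t Sum = 0;
  for (const RawSymbol &S : Table)
    Sum += S.Value;
  return Sum;
}
```

A real format would also have to deal with endianness, alignment and
variable-size records, which this sketch ignores.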
The API needs lots of work. Some of the current problems include:
* Error handling.
* Symbols only.
* Can only access one symbol table efficiently.
* Read only.
* Weird interface between SymbolRef and ObjectFile.
My current plan to support modifying and creating new object files is
to have a generic internal representation that has the same external
API as everything else. When the API client calls any function that
modifies the object file, a "changes" object is created that stores
all of the changes required when outputting the file. This changes
object would be transparent to the client, and would keep the API calls
required to write an object file out in a different format simple. It
would also allow an optimal implementation if it is being written out
in the same format, as the specific object file format class knows
exactly what is already in the file and where.
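As a rough sketch of the "changes" object idea (again not code from the
patches; BaseObject, Changes and EditableObject are hypothetical names, and
only symbol names are modeled), the read-only data stays untouched and an
overlay is allocated the first time a modifying call is made:

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the read-only view of an object file on disk.
struct BaseObject {
  std::vector<std::string> SymbolNames;
};

// Hypothetical record of pending edits, consulted when reading or writing.
struct Changes {
  std::map<size_t, std::string> Renamed; // base symbol index -> new name
  std::vector<std::string> Added;        // symbols appended after the base ones
};

class EditableObject {
  const BaseObject &Base;
  std::unique_ptr<Changes> Pending; // stays null until the first modification

  // Creates the changes object on demand, so read-only clients pay nothing.
  Changes &changes() {
    if (!Pending)
      Pending.reset(new Changes());
    return *Pending;
  }

public:
  explicit EditableObject(const BaseObject &B) : Base(B) {}

  bool isModified() const { return Pending != nullptr; }

  size_t symbolCount() const {
    return Base.SymbolNames.size() + (Pending ? Pending->Added.size() : 0);
  }

  // Reads go through the overlay first and fall back to the base file.
  std::string symbolName(size_t I) const {
    assert(I < symbolCount() && "symbol index out of range");
    const size_t BaseCount = Base.SymbolNames.size();
    if (I >= BaseCount)
      return Pending->Added[I - BaseCount];
    if (Pending) {
      auto It = Pending->Renamed.find(I);
      if (It != Pending->Renamed.end())
        return It->second;
    }
    return Base.SymbolNames[I];
  }

  // Modifying calls only record the edit; the base data is never rewritten.
  void renameSymbol(size_t I, std::string NewName) {
    assert(I < symbolCount() && "symbol index out of range");
    const size_t BaseCount = Base.SymbolNames.size();
    if (I >= BaseCount)
      changes().Added[I - BaseCount] = std::move(NewName);
    else
      changes().Renamed[I] = std::move(NewName);
  }

  void addSymbol(std::string Name) { changes().Added.push_back(std::move(Name)); }
};
```

A writer for the same format could walk only the recorded changes and patch
the existing bytes, while a writer for a different format would read
everything back through the same accessors.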
An alternative to this is to fully parse every object file into an
intermediate representation on load. This would simplify the library,
but would come at a steep performance cost, and tools modify object
files far less often than they read them.
I currently envision this library being used in the following ways:
* An ld and link.exe compatible linker.
* A loader.
* Add support to lli for loading dynamic libraries referenced from .bc
files when JITing.
* Executable compression, encryption.
I decided to make a new library instead of adding this to MC because I
felt that MC is not really designed for generic object file handling.
It is and should be designed for working with machine code.
And a test of the largest object file generated by clang -g while
bigcheese@CHIBISERV /tmp> nm --version
GNU nm (GNU Binutils for Ubuntu) 2.20.1-system.20100303
Copyright 2009 Free Software Foundation, Inc.
This program is free software; you may redistribute it under the terms of
the GNU General Public License version 3 or (at your option) any later version.
This program has absolutely no warranty.
bigcheese@CHIBISERV /tmp> llvm-nm -version
Low Level Virtual Machine (http://llvm.org/):
llvm version 2.8git
Host CPU: k8-sse3
bigcheese@CHIBISERV /tmp> time nm -a
bigcheese@CHIBISERV /tmp> time llvm-nm -a
- Michael Spencer
object-file-library-patches.zip (25.9 KB)