[lld] Has anybody ever run into the Solaris linker before?

Recently LLD made it to the front page of HN (yay!): https://news.ycombinator.com/item?id=13670458

This comment about the Solaris linker surprised me: https://news.ycombinator.com/item?id=13672364

"""

To me, the biggest advantage is cross compiling

Not all system linkers have this problem. For example, Solaris ld(1) is perfectly capable of cross-linking any valid ELF file.
"""

This got me interested in looking at what the Solaris linker is like.
I knew that Solaris had done quite a few innovations in the linker space, but I had never actually looked at their linker (the time I spent interacting with Solaris/illumos was related to DTrace stuff and happened before I started working on linkers).

The basic layout seems to be:
http://src.illumos.org/source/xref/illumos-gate/usr/src/cmd/sgs/ld/common/ld.c

This is where main is.

The actual linker has a “main in a library” type interface that is called into from main() by:

    /* Call the libld entry point for the specified ELFCLASS */
    if (class == ELFCLASS64)
        return (ld64_main(argc, argv, mach));
    else
        return (ld32_main(argc, argv, mach));
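
To make the dispatch above concrete, here is a minimal sketch of the same pattern. It is not the illumos code: the sketch_* names and SK_* constants are made up for illustration, the two "main" functions are stubs, and I am assuming the class is read from the e_ident bytes of an input file (the excerpt does not show how ld.c actually picks it).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Values from the ELF specification, defined locally so the sketch
 * does not depend on a system <elf.h>. */
enum { SK_EI_CLASS = 4, SK_EI_NIDENT = 16 };
enum { SK_ELFCLASSNONE = 0, SK_ELFCLASS32 = 1, SK_ELFCLASS64 = 2 };

/* Hypothetical stand-ins for ld32_main()/ld64_main(). The real ones
 * run an entire link; these only report which one was chosen. */
int sketch_ld32_main(int argc, char **argv) { (void)argc; (void)argv; return 32; }
int sketch_ld64_main(int argc, char **argv) { (void)argc; (void)argv; return 64; }

/* Return the ELF class recorded in an e_ident buffer, or
 * SK_ELFCLASSNONE if the buffer is too short or not ELF. */
int sketch_elf_class(const unsigned char *ident, size_t len)
{
    static const unsigned char magic[4] = { 0x7f, 'E', 'L', 'F' };

    if (len < SK_EI_NIDENT || memcmp(ident, magic, sizeof magic) != 0)
        return SK_ELFCLASSNONE;
    if (ident[SK_EI_CLASS] == SK_ELFCLASS32 ||
        ident[SK_EI_CLASS] == SK_ELFCLASS64)
        return ident[SK_EI_CLASS];
    return SK_ELFCLASSNONE;
}

/* Same shape as the excerpt: one thin front end, two class-specific
 * library entry points. Anything that is not 64-bit goes to the
 * 32-bit entry point, as in the original if/else. */
int sketch_dispatch(int elf_class, int argc, char **argv)
{
    if (elf_class == SK_ELFCLASS64)
        return sketch_ld64_main(argc, argv);
    else
        return sketch_ld32_main(argc, argv);
}
```

A caller would read the first 16 bytes of an input object, pass them to sketch_elf_class(), and hand the result to sketch_dispatch().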

The rest of the code is in “libld”. The header is here:
http://src.illumos.org/source/xref/illumos-gate/usr/src/cmd/sgs/include/libld.h

There seems to be a huge “context” struct, struct ofl_desc: http://src.illumos.org/source/xref/illumos-gate/usr/src/cmd/sgs/include/libld.h#241
(It does have a couple of globals in http://src.illumos.org/source/xref/illumos-gate/usr/src/cmd/sgs/libld/common/globals.c
and the “Target” struct also appears to be a global: http://src.illumos.org/source/xref/illumos-gate/usr/src/cmd/sgs/libld/common/_libld.h#47)

The code itself is mostly here:
http://src.illumos.org/source/xref/illumos-gate/usr/src/cmd/sgs/libld/common/

The libld “main” function seems to be in:
http://src.illumos.org/source/xref/illumos-gate/usr/src/cmd/sgs/libld/common/ldmain.c#144

The function that handles individual sections in object files is process_elf in:
http://src.illumos.org/source/xref/illumos-gate/usr/src/cmd/sgs/libld/common/files.c#2525

http://src.illumos.org/source/xref/illumos-gate/usr/src/cmd/sgs/libld/common/README.XLINK

(one interesting thing is that besides being used for the regular ld command line program, libld.so is also used implicitly if you try to dlopen a relocatable object (which is a bit weird))

I haven’t traced through all of it yet. Has anybody stared at this code? How does it compare architecturally with LLD? If you’re looking through for the first time, please post any findings/insights in this thread so others can follow along.

(for the record, illumos-gate/usr/src/cmd/sgs/libld/common % wc -l *.c gives about 45k lines of code, so LLD/ELF is still quite a bit slimmer, though giving Solaris libld a factor of 2 handicap for using C rather than C++ brings them basically to parity)

There even seems to be a simple “libelf”: http://src.illumos.org/source/xref/illumos-gate/usr/src/cmd/sgs/libelf/common/
Main interface is the “struct Elf”: http://src.illumos.org/source/xref/illumos-gate/usr/src/cmd/sgs/libelf/common/decl.h#270
struct Elf also handles archives.

It seems to be roughly libobject-like, though it seems to support writing.
There are some “demos” here: http://src.illumos.org/source/xref/illumos-gate/usr/src/cmd/sgs/libelf/demo/ (note the README)

It seems like it tries to paper over the fact that writing can cause it to have to do O(the entire file) work if it needs to slide sections around / fixup offsets (haven’t looked super closely though). E.g. it seems to try to transparently handle the case where you append extra data to a section. It doesn’t seem to handle “symbols” explicitly (just exposes the raw section data), so it doesn’t need to worry about operations like “convert a local symbol to weak” which requires rewriting all sorts of stuff.
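
To illustrate why appending to a section can turn into whole-file work, here is a rough sketch. It is not libelf code: the sk_* names and the simplified section record are hypothetical, and it assumes sections are stored in file order with a non-zero alignment. Growing one section forces every later section's offset to be recomputed (and, in a real writer, its bytes to be moved).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified section record: just file layout data. */
struct sk_section {
    size_t offset; /* file offset of the section's data */
    size_t size;   /* size of the data in bytes */
    size_t align;  /* required alignment of offset, must be >= 1 */
};

/* Round v up to the next multiple of a (a must be non-zero). */
size_t sk_align_up(size_t v, size_t a)
{
    return (v + a - 1) / a * a;
}

/* Grow section idx by extra bytes, then lay out every following
 * section again directly after its predecessor. Returns the new end
 * offset of the last section. The loop is linear in the number of
 * sections after idx, which is the "slide everything" cost. */
size_t sk_grow_section(struct sk_section *secs, size_t n,
                       size_t idx, size_t extra)
{
    assert(idx < n);

    secs[idx].size += extra;
    size_t end = secs[idx].offset + secs[idx].size;

    for (size_t i = idx + 1; i < n; i++) {
        assert(secs[i].align > 0);
        secs[i].offset = sk_align_up(end, secs[i].align);
        end = secs[i].offset + secs[i].size;
    }
    return end;
}
```

Anything else that stores file offsets (the section header table location, program headers) would need the same kind of fixup, which this sketch leaves out.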

-- Sean Silva

I just skimmed through a few files. What I noticed first is that the code is pretty well-commented. I appreciate whoever wrote this. A few random facts I found from the source code:

  • it uses an AVL tree for the symbol table
  • --wrap is implemented using an additional AVL tree. So if --wrap is in use, a symbol name is translated through the additional AVL tree before the main symbol table is looked up
  • it exits at the end of main without freeing memory, for performance reasons
  • the default entry points are _start and main. (In LLD _start is the only default entry point name.)
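
The --wrap translation step described in the list above could look roughly like the sketch below. Assumptions: it follows the conventional --wrap semantics (a reference to foo becomes __wrap_foo, and a reference to __real_foo becomes foo), and a plain array stands in for the second AVL tree to keep it short. The wrap_set, is_wrapped and wrap_translate names are made up; this is not how libld spells it.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical: the names given via --wrap. A linear array replaces
 * the AVL tree the real linker uses. */
struct wrap_set {
    const char *const *names;
    size_t count;
};

/* Return 1 if name was listed with --wrap, 0 otherwise. */
int is_wrapped(const struct wrap_set *ws, const char *name)
{
    for (size_t i = 0; i < ws->count; i++)
        if (strcmp(ws->names[i], name) == 0)
            return 1;
    return 0;
}

/* Translate a referenced symbol name before the main symbol table
 * lookup. Returns the name to look up: either the input pointer, a
 * pointer into the input, or buf. Returns NULL if buf is too small. */
const char *wrap_translate(const struct wrap_set *ws, const char *name,
                           char *buf, size_t buflen)
{
    static const char real_prefix[] = "__real_";
    const size_t plen = sizeof real_prefix - 1;

    /* __real_foo refers to the original foo. */
    if (strncmp(name, real_prefix, plen) == 0 &&
        is_wrapped(ws, name + plen))
        return name + plen;

    /* foo is redirected to __wrap_foo. */
    if (is_wrapped(ws, name)) {
        int n = snprintf(buf, buflen, "__wrap_%s", name);
        if (n < 0 || (size_t)n >= buflen)
            return NULL;
        return buf;
    }

    /* Not wrapped: look up the name unchanged. */
    return name;
}
```

The main symbol table lookup would then be called with whatever wrap_translate() returns, so the main table itself never needs to know about --wrap.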

Looking at the history, I believe the people that wrote it are Rod Evans
(rie; Rod.Evans@Sun.COM) and Ali Bahrami (ab196087; Ali.Bahrami@Sun.COM),
though those emails are likely defunct now.

http://src.illumos.org/source/history/illumos-gate/usr/src/cmd/sgs/libld/common/

Rod seems to have had a "Surfing With a Linker Alien" blog series:
https://blogs.oracle.com/rie/

This one contains a good overview of the Solaris SGS (software generation
subsystem; i.e. linkers, loaders, etc.):
https://blogs.oracle.com/rie/entry/the_link_editors_a_source

Another good Solaris resource on linkers that I had forgotten about is the
"Solaris Linker and Libraries Guide":
http://docs.oracle.com/cd/E19253-01/817-1984/

-- Sean Silva