First, I want to know the symbol resolution semantics. I can't imagine that
that is set in stone yet; I assume you guys are still discussing what
would be the best semantics or file format for the linkable wasm object
file. I think by knowing more about the format and semantics, we can give
you guys valuable feedback, as we've been actively working on the linker
for a few years now. (And we know a lot of issues in existing object file
format, so I don't want you guys to copy these failures.)
As Sean pointed out, this looks very different from ELF or COFF in object
construction. Does this mean the linker has to reconstruct everything? The
ELF and COFF linkers are multi-threaded, as each thread can work on
different sections simultaneously when writing to an output file. I wonder
if it's still doable in wasm.
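
To make the parallel-write idea concrete, here is a minimal sketch, not how LLD or any wasm tool actually does it. It assumes the linker can fix every output section's size before writing, so the offsets come from a prefix sum and each worker copies into its own disjoint slice of the output. The names `layout` and `write_output` and the section contents are made up for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import accumulate


def layout(sections):
    """Return the start offset of each section (exclusive prefix sum of sizes)."""
    sizes = [len(data) for _, data in sections]
    return [0] + list(accumulate(sizes))[:-1]


def write_output(sections, workers=4):
    """Copy every section into its own disjoint slice of one output buffer."""
    offsets = layout(sections)
    total = sum(len(data) for _, data in sections)
    out = bytearray(total)
    view = memoryview(out)

    def copy(index):
        _, data = sections[index]
        start = offsets[index]
        # Slices never overlap, so workers need no locking.
        view[start:start + len(data)] = data

    with ThreadPoolExecutor(max_workers=workers) as pool:
        list(pool.map(copy, range(len(sections))))
    return bytes(out)


# Hypothetical section names and contents, for illustration only.
sections = [(".text", b"\x01" * 8), (".data", b"\x02" * 4), (".rodata", b"\x03" * 6)]
image = write_output(sections)
```

Python threads will not give a real speedup here; the point is only the structure. Once the layout is known up front, the writes are independent. Whether a wasm object format allows the layout to be fixed that early is exactly the open question.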
Also, I wonder if there's a way to parallelize symbol resolution. Since
there are no linkable wasm programs, we can take a radical approach.
Another question for Sam is how many "innovation tokens" the wasm folks are
prepared to burn on the object format. E.g. do they not really care as long
as it works, or are they willing to invest significant time in designing
and optimizing it.
There are a lot of interesting directions when creating a new object format
(this was actually one of the initial goals of LLD, way back at the
project's inception!). There are lots of ideas but very little has actually
been explored even to the point of knowing that making X change will give
Y% speedup. So most (all?) of these things are definitely "research" type
Also, looking at LLD's profile, there actually aren't really many things
that immediately stand out as major (order of magnitude) improvements that
are possible. The only obvious major thing that sticks out to me would be
that if the relocations don't affect "layout" (or the wasm equivalent; e.g.
don't require allocating bss or GOT entries), then we do only a single scan
over relocations, which is about 30% of the current
Ultimately we will be limited by disk IO (and if I remember from Rui's
presentation, we're only like 4x slower than `cp`) as long as we don't go
to a model that allows us to transcend writing the output file to disk in
the critical path.
Have you ever considered making the file format more efficient than ELF
or COFF so that object files can be linked really fast? For example, to avoid
handling a lot of symbol names (possibly very long due to name mangling), you
could store SHA hashes or something so that linkers are able to handle symbols
as an array of fixed-size elements.
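
A rough sketch of what the hashed-symbol idea could look like, purely as an illustration. The 16-byte truncation, the use of SHA-256, and the helper names `symbol_key`, `build_symbol_table`, and `lookup` are all assumptions of mine, not part of any proposed wasm format.

```python
import hashlib

HASH_BYTES = 16  # assumed truncation length, not taken from any real format


def symbol_key(name: str) -> bytes:
    """Fixed-size key standing in for an arbitrarily long mangled name."""
    return hashlib.sha256(name.encode("utf-8")).digest()[:HASH_BYTES]


def build_symbol_table(names):
    """Pack symbols as one flat array of fixed-size records."""
    return b"".join(symbol_key(n) for n in names)


def lookup(table: bytes, name: str) -> int:
    """Return the record index of `name`, or -1 if it is absent."""
    key = symbol_key(name)
    for i in range(0, len(table), HASH_BYTES):
        if table[i:i + HASH_BYTES] == key:
            return i // HASH_BYTES
    return -1


# Example names only; the long one imitates an Itanium-mangled C++ symbol.
names = ["main", "_ZNSt6vectorIiSaIiEE9push_backERKi", "_Z3foov"]
table = build_symbol_table(names)
```

Every record is the same width regardless of how long the mangled name is, so the linker can index and compare symbols without touching a string table. The cost is that the compiler has to hash every name for each TU, and the original names still have to live somewhere for diagnostics and debug info.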
For Sam's benefit, this is something that we've been thinking about for a
while, but we don't really know how much speedup it will really give. (and
IIRC Chandler said at one point that something like this had been tried at
Google but the extra per-TU time didn't pay off in the link time or
something like that).
Also, strings are only a big bottleneck (last I checked the profile) in
debug info builds and we already know that that's just a fundamental
problem due to split DWARF not being ubiquitous for ELF at this point in
That is just an example. There are a lot of possible improvements we can
make for a completely new file format.
I guess one thing that would be good to clarify is the design goal of this
wasm linker. (and how interested you are in changing the format at this
If I had to guess, I would guess that ideally the wasm linker would be a
drop-in replacement for a standard native linker, so that changes to user
build systems are minimal.
E.g. the linker invocation would want to stay `ld main.o libfoo.a libbar.a
...` just like in the corresponding native link. (although how does wasm
handle ar? for LLVM bitcode in LTO, that's always a stumbling block)
Sam, could you clarify?
If symbol resolution, "input section" selection, and archive semantics
aren't close to that of native linkers, then it would make it difficult to
port existing C/C++ apps (every app has e.g. an __attribute__((weak)) in
there somewhere, or a C++ inline function so comdat/linkonce is needed,
-- Sean Silva