(wasm-ld) Any fundamental problems with linking a shared wasm library statically?

wasm-ld is currently unable to link a shared wasm library (generated with
`wasm-ld --shared`) with .o files and produce a working executable.

I'm curious whether there's a fundamental reason for this, or whether this is
simply something that wasn't needed so far and could be implemented if needed.

I think this could be done by

1. Resolving "GOT.mem" and "GOT.func" imports and replacing the imports with
   globals holding the mem/table indices of the imported symbols.

2. Applying dynamic relocations ($__wasm_apply_relocs) statically, or
   dynamically in a linker-generated start function.

(I think the dylink section is not needed.)

For (1), for a GOT.func import, we can find the function's table index (adding
the function to the table if it's not already there) and replace the import
with a constant holding that index. GOT.mem imports are similar: we find the
location of the symbol relative to the module's memory base, then replace the
import with `memory_base + offset`.
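
As a rough illustration of step (1), here is a minimal Python sketch. It is not
wasm-ld code: imports are modelled as plain (module, field) tuples, and
`func_table`, `data_offsets` and `resolve_got_imports` are hypothetical names
made up for this example. Reserving table slot 0 is also an assumption of the
sketch.

```python
from typing import Dict, List, Tuple

# Hypothetical model of an import: (module, field), e.g. ("GOT.func", "puts").
Import = Tuple[str, str]


def resolve_got_imports(
    imports: List[Import],
    func_table: List[str],
    data_offsets: Dict[str, int],
    memory_base: int,
) -> Dict[Import, int]:
    """Map each GOT.func / GOT.mem import to the constant that would replace it.

    func_table is extended in place when a function has no table slot yet.
    Imports from other modules (e.g. "env") are left unresolved.
    """
    resolved: Dict[Import, int] = {}
    for module, name in imports:
        if module == "GOT.func":
            if name not in func_table:
                func_table.append(name)  # give the function a table slot
            resolved[(module, name)] = func_table.index(name)
        elif module == "GOT.mem":
            # Address = where the library's data was placed + symbol offset.
            resolved[(module, name)] = memory_base + data_offsets[name]
    return resolved


# Slot 0 is kept unused here so that no function pointer equals 0
# (an assumption of this sketch).
table = ["<reserved>", "main"]
consts = resolve_got_imports(
    [("GOT.func", "puts"), ("GOT.mem", "errno"), ("env", "memory")],
    table,
    {"errno": 16},
    memory_base=1024,
)
```

Each entry in `consts` stands for an import that could become a module-defined
global initialised with that constant.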

For (2), in general it's not possible to run an arbitrary wasm function at link
time, but I think relocation functions are basically just a list of stores of
the form (in a C-like language) `*(memory_base + <offset>) = <address of
symbol> + <constant>`. The RHS can also be `<table base> + <constant>`. So I
think they could be evaluated at link time.
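
To make the link-time idea concrete, here is a minimal Python sketch under the
assumption stated above, namely that the relocation function reduces to a flat
list of 32-bit stores. The `Reloc` record and `apply_relocs` are hypothetical
and do not correspond to wasm-ld's actual relocation format.

```python
import struct
from dataclasses import dataclass
from typing import List


@dataclass
class Reloc:
    offset: int        # where to store, relative to the start of the segment
    is_function: bool  # True: value is a table slot; False: a data address
    value: int         # symbol's offset from table_base or memory_base
    addend: int = 0    # the <constant> term


def apply_relocs(data: bytearray, relocs: List[Reloc],
                 memory_base: int, table_base: int) -> None:
    """Patch `data` (the library's data segment, placed at memory_base) in place."""
    for r in relocs:
        base = table_base if r.is_function else memory_base
        # wasm32 pointers are 4-byte little-endian integers.
        struct.pack_into("<I", data, r.offset, base + r.value + r.addend)


segment = bytearray(8)
apply_relocs(
    segment,
    [Reloc(offset=0, is_function=False, value=16, addend=4),
     Reloc(offset=4, is_function=True, value=2)],
    memory_base=1024,
    table_base=1,
)
```

If a relocation function contained anything beyond such stores (loops, calls,
loads), this kind of interpretation would no longer be enough.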

Alternatively, I think we could run $__wasm_call_ctors as the first thing in a
linker-generated main function, after updating memory_base and table_base of
the imported module, and it would apply the relocations.

Would this make sense? I'm new to wasm and not too experienced in linking (just
a happy user of ld.lld and gold), so it's possible that I'm missing something
and this is not going to work.

Thanks,

Ömer

Hi Omer,

Dynamic linking support in wasm-ld is still a work in progress. When you used the -shared flag to link your shared libraries you should have seen a warning like `wasm-ld: warning: creating shared libraries, with -shared, is not yet stable` (at least with ToT llvm).

The dynamic linking support that does exist today is mostly to support the emscripten compiler. There is some information on the current status here: https://github.com/WebAssembly/tool-conventions/blob/master/DynamicLinking.md#implementation-status.

I am curious what your use case is. From your description of your proposed solution it sounds like you want to be able to statically link a shared library with object files to produce what is essentially a statically linked executable. Is that right? In that case why not link with the .a version of the library? If you want to build a static executable you can't generally do so if you have .so files as input, right? (Not with lld or GNU ld anyway.) I could be misunderstanding what you are asking for here…

The current ABI imports GOT.func and GOT.mem from the environment and relies on a dynamic linker being present in the embedder (in the case of emscripten this is implemented in JavaScript). The actual table and memory offsets cannot be known statically by wasm-ld, which is why those imports are generated.

I am keen to move the dynamic linking story forward for wasm in llvm, and there are plans afoot for a new stable ABI based on a new WebAssembly proposal: https://github.com/WebAssembly/module-linking/.

cheers,
sam

Hi Sam,

Thanks for your response and sorry for my late response.

> From your description of your proposed solution it sounds like you want to be
> able to statically link a shared library with object files to produce what is
> essentially a statically linked executable. Is that right?

Correct.

> In that case why not link with the `.a` version of the library?

The problem is I have a Wasm generator that doesn't use LLVM, and generating
statically-linkable .o/.a that wasm-ld can link is a lot of work, with all the
extra sections and relocations etc. Similarly implementing a linker that
understands the wasm I generate and can link it with .o/.a files generated by
LLVM is also a lot of work.

So instead, I generate the code almost as if I'm generating a single-file
executable .wasm, and any C and Rust code that I want to link with it is
compiled to a shared library, which is much easier to link. So far it works
fine, but sometimes changing the C/Rust code causes changes in the generated
shared .wasm that break my linker (hence my original question regarding
GOT.func/GOT.mem imports).

One obvious idea here is to use LLVM in my code generator, which would generate
.o/.a files that wasm-ld can link. I think the main difficulty with that is it's
actually much easier to generate Wasm than to generate LLVM. It seems weird to
generate a lower-level language (LLVM IR), then compile that to a higher-level
language (Wasm).

(I should mention I don't have a lot of experience generating LLVM so I may be
wrong about it being lower level than Wasm)

> The current ABI imports GOT.func and GOT.mem from the environment and relies
> on a dynamic linker being present in the embedder (in the case of emscripten
> this is implemented in JavaScript). The actual table and memory offsets
> cannot be known statically by wasm-ld, which is why those imports are
> generated.

Is this still the case if I assume no dynamic linking at runtime (dlopen etc.,
or Wasm-specific host functions)? My impression was that I can run functions
like $__wasm_apply_relocs at runtime and it should work fine. Things would
break when loading new modules at runtime, but I never do that and can assume
that it won't be done. My implementation works fine currently (or at least I'm
not aware of any bugs).

Thanks,

Ömer

On Tue, 11 Aug 2020 at 01:08, Sam Clegg <sbc@google.com> wrote: