Wildcard patterns in `--undefined` linker option

Hi,

I got a feature request from an internal customer of lld, but I don’t know whether we should implement it or not, so I’d like to get opinions from people on this mailing list.

The feature request is to allow wildcard patterns in the `--undefined` option. `--undefined foo` (or `-u foo` for short) makes the linker pull an object file out of a static library if the file defines the symbol `foo`. So, by allowing wildcard patterns, you can pull out all object files defining some JNI symbols (which start with “Java_”) from static archives by specifying `-u "Java_*"`, for example.

This seems mildly useful to me, but it comes with a cost. Currently, `-u` is literally as fast as a single hash lookup. If you allow wildcard patterns in `-u`, you have to attempt a wildcard pattern match against all symbols in the symbol table, which can be expensive.
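
To make the cost difference concrete, here is a minimal Python sketch (not lld code). The `ARCHIVE_INDEX` dictionary and both function names are hypothetical stand-ins for an archive symbol index, and Python's `fnmatchcase` is used as the matcher purely for illustration. Note that `fnmatchcase` also treats `[` as a metacharacter, which the proposal does not necessarily do.

```python
from fnmatch import fnmatchcase

# Hypothetical archive symbol index: symbol name -> archive member defining it.
ARCHIVE_INDEX = {
    "Java_com_example_Foo_init": "foo.o",
    "Java_com_example_Bar_run": "bar.o",
    "helper": "util.o",
}


def members_for_exact(name, index):
    """Plain -u: one hash lookup, independent of the number of symbols."""
    member = index.get(name)
    return {member} if member is not None else set()


def members_for_pattern(pattern, index):
    """Wildcard -u: every symbol in the index is tested against the pattern."""
    return {member for sym, member in index.items() if fnmatchcase(sym, pattern)}
```

With this index, `members_for_pattern("Java_*", ARCHIVE_INDEX)` selects `foo.o` and `bar.o`, but only after visiting every entry, whereas `members_for_exact("helper", ARCHIVE_INDEX)` touches a single bucket.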

I’m also not sure how useful it will actually be. The above JNI case is somewhat convincing, but that’s just one use case, and if there’s only one use case, adding a new feature for that particular case is probably not a very good idea.

Does anyone have any opinion on whether we should support this or not?

Thanks,
Rui

Arm's proprietary linker supported a similar wildcard feature for the
various section and symbol commands. My opinion was that it was very
useful for a small number of cases, usually where a customer's set of
symbols to operate on was changing frequently during development. With
a simple naming convention and a wildcard they could save quite a bit
of effort. Their alternative was to either manually maintain the list
of symbols or write a tool to generate it.

We went with a fast path when there were no wildcards
and a slow path when there were. Aside from the check for a
wildcard, there wasn't much extra overhead on the fast path.

Peter
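
The fast-path/slow-path split described above can be sketched roughly as follows. This is an illustrative Python sketch, not the Arm linker's or lld's implementation: the names `has_wildcard` and `resolve_undefined` are hypothetical, the metacharacter set `*` and `?` is an assumption, and `fnmatchcase` (which additionally interprets `[`) stands in for whatever matcher a real linker would use.

```python
from fnmatch import fnmatchcase

# Assumed metacharacters; a real linker may define a different set.
WILDCARD_CHARS = frozenset("*?")


def has_wildcard(arg):
    """Cheap check done once per -u argument."""
    return any(c in WILDCARD_CHARS for c in arg)


def resolve_undefined(arg, symbols):
    """Return the symbols selected by one -u argument.

    `symbols` is a set of known symbol names.
    """
    if not has_wildcard(arg):
        # Fast path: a single set (hash) lookup, as before.
        return [arg] if arg in symbols else []
    # Slow path: test the pattern against every known symbol.
    return sorted(s for s in symbols if fnmatchcase(s, arg))
```

The only cost added to the common case is the scan of the argument string itself, which is proportional to the length of the option, not to the size of the symbol table.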

Thank you for the info. Does the ARM proprietary linker support `*`? Is there any way to escape `*`? I think `*` is not usually used in a symbol name, so treating `*` as a metacharacter should be fine, but I’m wondering how the ARM linker works.

> We went with a fast path when there were no wildcards
> and a slow path when there were. Aside from the check for a
> wildcard, there wasn’t much extra overhead on the fast path.

That’s true. The usual use case won’t be penalized by adding wildcard support.

No, it didn't have an escape character. The way '*' was defined
(match 0 or more characters) meant it would have matched a symbol name
with a '*' in it, but it would also have matched more than it should
have if the intention was to match only the symbol with the literal
'*' character. There was another wildcard, '?', that matched any one
character and could have been used to match the '*' character a bit
more precisely. As you say, '*' and '?' are not common in symbol names.
I don't think I've ever seen it happen in practice, so the lack of
escaping didn't turn out to be a problem.

I created a patch for review to allow wildcard patterns in -u.

https://reviews.llvm.org/D63244