Hi,
While working on alias support for the LLVM-ML project, I ran into a feature
implemented back in 2010: default-null weak externals in COFF, a GNU
extension.
rG17990d56907b
I'd like to disable this feature when targeting MSVC compatibility. Does
anyone have more context on this, and why it'd be a terrible idea?
For context: This seems to be designed to let LLVM implement a GNU extension
in COFF libraries. However, it leads to very different behavior than we see
for cl.exe (and ml.exe) on Windows; for already-defined aliasees, it injects
an alternate placeholder ".weak.<alias>.default.<uniquifier>" symbol which
resolves back to the current location. I admit, I'm not quite sure how this
helps. If anyone can explain the purpose, I'd really appreciate it!
So, for the GNU extension, from the user's point of view, there are two potential use cases.
First, a translation unit can reference a function declared with __attribute__((weak)), with no implementation in that translation unit. The reference then evaluates either to NULL or to an actual implementation, if another object file provides a non-weak definition at link time.
Second, multiple translation units may contain function definitions marked with the weak attribute. You can have weak definitions in 0-N object files, and a non-weak definition in 0-1 object files. If there's no non-weak definition, one of the weak definitions is picked; if there is one, the non-weak definition is used.
As all this is consumed via GNU-style attributes (in MinGW environments), it shouldn't really matter in an MSVC context.
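The two use cases above can be sketched in C roughly like this. This is only an illustration, assuming GCC or Clang with an ELF or MinGW target; the names optional_hook, call_optional_hook and get_value are made up for the example and don't come from any real project.

```c
#include <stddef.h>

/* Use case 1: a weak declaration with no definition in this translation
   unit. optional_hook is a hypothetical name. If no other object file
   defines it, its address evaluates to NULL. */
extern int optional_hook(void) __attribute__((weak));

/* Returns the hook's result if some object file provided a definition,
   or -1 if the weak reference resolved to NULL. */
int call_optional_hook(void) {
    if (optional_hook != NULL)
        return optional_hook();
    return -1;
}

/* Use case 2: a weak definition. get_value is a hypothetical name. This
   definition is used only if no object file in the link provides a
   non-weak get_value. */
__attribute__((weak)) int get_value(void) {
    return 1;
}
```

Built on its own, call_optional_hook() takes the NULL path and get_value() uses the weak definition. Linking in another object file with non-weak definitions of optional_hook and get_value would change both results without touching this file.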
I recently worked on hooking up the final details of this for COFF, so I'd be happy to have a look at any work touching this feature.
In Windows PE/COFF files, aliases typically just resolve to their target
symbol. For an example, see D87403, "[ms] [llvm-ml] Add support for "alias" directive".
Where a symbol with a unique name already exists, adding an alias that points directly at the target symbol sounds sensible. But where the symbol isn't set up as an alias, and the implementation itself is marked weak, the uniquifying symbol name is needed, to allow multiple objects to provide the same symbol.
Consider these two examples in GAS assembly form:
    .globl uniquename
uniquename:
    ret
    .globl func
func:
    ret
    .weak aliasname
    aliasname = func
This produces the following symbols, shown with llvm-objdump -t:
[ 6](sec 1)(fl 0x00)(ty 0)(scl 2) (nx 0) 0x00000000 uniquename
[ 7](sec 1)(fl 0x00)(ty 0)(scl 2) (nx 0) 0x00000001 func
[ 8](sec 0)(fl 0x00)(ty 0)(scl 69) (nx 1) 0x00000000 aliasname
AUX indx 10 srch 3 [pointing at .weak.aliasname.default.uniquename]
[10](sec 1)(fl 0x00)(ty 0)(scl 2) (nx 0) 0x00000001 .weak.aliasname.default.uniquename
So here .weak.aliasname.default.uniquename has the same address as func, and since func itself is non-weak, aliasname could just as well have pointed directly at func instead.
But for this case, the extra dance is necessary:
    .globl uniquename
uniquename:
    ret
    .weak func
    .globl func
func:
    ret
Producing:
[ 6](sec 1)(fl 0x00)(ty 0)(scl 2) (nx 0) 0x00000000 uniquename
[ 7](sec 0)(fl 0x00)(ty 0)(scl 69) (nx 1) 0x00000000 func
AUX indx 9 srch 3
[ 9](sec 1)(fl 0x00)(ty 0)(scl 2) (nx 0) 0x00000001 .weak.func.default.uniquename
Initially, the non-weak symbols were just named ".weak.func.default", but this caused clashes if multiple object files defined the same one. I tried fixing this in D71711, "[COFF] Make the autogenerated .weak.<name>.default symbols static", by making the non-weak symbols that the weak ones point at static, but MSVC tools error out if you have a weak symbol pointing at a non-external symbol (as "weak" in COFF actually is "weak external"). I therefore reverted that attempt, and later made D75989, "[COFF] Assign unique names to autogenerated .weak.<name>.default symbols", which tries to generate unique names for these symbols to avoid clashes.
// Martin