Is this a bug? "+multivalue" attribute contaminates non-multivalue functions

Hi everyone, I’m trying to chase down some test failures I got in the Rust compiler tests, and I was hoping to pick your brains about this: Would you consider this a bug in LLVM?

Try adding and removing the #1 attribute from the definition of @irrelevant() on line 9, and watch what happens to the .functype of thinger() on line 3 of the output. The signature changes despite not having the attribute applied to it! "target-features"="+multivalue" seems to apply to either the whole module, or none of it.

This is problematic for linking object files together, as the signature of a function in a module depends on the definitions of unrelated functions in that module. This is especially troublesome if your compiler splits your code into different modules in unpredictable ways, causing spurious linking problems. It also means that a non-multivalue function can’t be linked into a multivalue binary reliably.

It seems strange to me that an attribute affects functions that it isn’t applied to like this. It breaks the assumptions I had about how LLVM IR attributes work. Is the bug in my assumptions, or in LLVM?

For the wasm backend I would say that we generally intend target-features to apply at the module level; in most cases we use the same IR (and we test using LLC flags that apply to the whole module) . This is mainly because in wasm, features and validation are properties of the module rather than the function (in most cases necessarily so, as features may include non-function-local constructs such as types). So in that model, if any function uses a feature, it means that the module as a whole must use that feature, so at codegen time it’s safe to generate code with that feature for any function.
Multivalue is slightly special in that it also causes codegen to use a different ABI than it would otherwise (this causes the signature change that you see). I think I agree that especially in that case, having that nonlocal effect is probably a violation of how function-level LLVM IR attributes should work, and that it makes sense to want to link non-multivalue functions/modules into multivalue modules. I think that could maybe be solved by using a separate attribute for the ABI independent of the codegen. That way a declaration of an undefined function (to be linked in after codegen) could correctly describe how that function was compiled independently of how the current module is being generated.

It’s maybe a different question whether the current behavior (of being able to “upgrade” the feature set of a function when a module has functions with different attributes) is desirable. I’m not sure what the use case would be for preventing that? (or put another way, this would result in wasm module where the module and some of the functions would be generated to one feature set, and other functions would be generated to a different feature set, and I’m not sure what the use case for that would be).

Thanks for your input, that’s very useful.

Just so that you’re aware, this effects Rust’s experimental wasm ABI feature. I’ve left a comment over there explaining the mismatched interpretation:

The only reason I could contrive for this is to allow a statically linked library to have different functions (or different versions of the same function) targeting different feature sets. Although the Linking convention doesn’t support this— but it could probably be added if it was desired…?

Hm, yeah that does seem reasonable. In emscripten we handle that in the compiler/linker driver (i.e. the driver ensures that different versions of the static libraries are added to the underlying linker command line depending on flags that the user passes to the driver). There’s even a higher layer with more complex logic, e.g. emscripten allows the user to specify which version of which browsers they want to target and the driver figures out which features that implies (see e.g. here). But actually that could probably stay in emscripten even if we had some shared linker-level logic for determining which functions/objects to add to a link based on features. @tlively and/or @sbc100 might also have some thoughts on this.

We have thought about supporting function multiversioning (i.e. different versions of the same function for different feature sets) in the past, but we never followed through and added support for it. If we did add support for that in the future, multiversioned functions would certainly be exempt from the logic that takes the union of all the features enabled in a module and applies that union to all of the functions.