Hi All,
Thank you for the feedback so far.
I am replying to all your questions/concerns/suggestions in this single email. Please let me know if I have missed any.
I will update the RFC according to what we end up deciding here.
Kind regards,
Francesco
# TOPIC 1: concerns about name mangling
I understand that there are concerns about using the mangling scheme I proposed, and that a mangling scheme based on (and standardized by) OpenMP would be preferred. I hear the argument for having some common ground here. In fact, there is already common ground between the x86 and aarch64 backends, which have based their respective Vector Function ABI specifications on OpenMP.
In fact, the mangled name grammar can be summarized as follows:
_ZGV<isa><masking><VLEN><parameter type>_<scalar name>
Across vector extensions the only <token> that will differ is the <isa> token.
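As a rough illustration of the grammar above, the following sketch splits a mangled name into its tokens. The `VectorName` struct and the `parseVectorName` function are hypothetical names introduced only for this example; they are not part of clang or of any ABI document. The parser is deliberately simplified: it assumes a numeric <VLEN> and parameter tokens that contain no underscore.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper type: the tokens of a name that follows
// _ZGV<isa><masking><VLEN><parameter type>_<scalar name>.
struct VectorName {
  char isa = 0;            // <isa>, e.g. 'b' in _ZGVbM4vv_vec_sum
  bool masked = false;     // <masking>: 'M' = masked, 'N' = unmasked
  unsigned vlen = 0;       // <VLEN>: number of lanes
  std::string parameters;  // <parameter type> tokens, e.g. "vv"
  std::string scalarName;  // <scalar name>, e.g. "vec_sum"
};

// Simplified parser (an assumption of this sketch, not the ABI's full grammar):
// the first '_' after <VLEN> separates the parameter tokens from the scalar name.
// Returns false if the name does not match the expected shape.
bool parseVectorName(const std::string &mangled, VectorName &out) {
  const std::string prefix = "_ZGV";
  if (mangled.compare(0, prefix.size(), prefix) != 0)
    return false;
  std::size_t pos = prefix.size();
  if (mangled.size() < pos + 2)
    return false;

  out.isa = mangled[pos++];
  const char mask = mangled[pos++];
  if (mask != 'M' && mask != 'N')
    return false;
  out.masked = (mask == 'M');

  // Read the decimal <VLEN>.
  const std::size_t digitsStart = pos;
  unsigned vlen = 0;
  while (pos < mangled.size() &&
         std::isdigit(static_cast<unsigned char>(mangled[pos]))) {
    vlen = vlen * 10 + static_cast<unsigned>(mangled[pos] - '0');
    ++pos;
  }
  if (pos == digitsStart)
    return false;
  out.vlen = vlen;

  // Split "<parameter type>_<scalar name>" at the first underscore.
  const std::size_t sep = mangled.find('_', pos);
  if (sep == std::string::npos || sep == pos || sep + 1 >= mangled.size())
    return false;
  out.parameters = mangled.substr(pos, sep - pos);
  out.scalarName = mangled.substr(sep + 1);
  return true;
}
```

With this sketch, two names that differ only in the <isa> token parse to identical `masked`, `vlen`, `parameters` and `scalarName` fields, which is the property discussed here.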
This might lead people to think that we could drop the _ZGV<isa> prefix and consider the <masking><VLEN><parameter type>_<scalar name> part as a sort of unofficial OpenMP mangling scheme: after all, the signature of an “unmasked 2-lane vector version of `sin`” will always be `<2 x double>(<2 x double>)`.
The problem with this choice is that a single declaration does not map to a unique vector version on every target.
In particular, the following declaration generates multiple vector versions, depending on the target:
#pragma omp declare simd simdlen(2) notinbranch
double foo(double) {…};
On x86, this generates at least 4 symbols: one each for SSE, AVX, AVX2, and AVX512 (see Compiler Explorer).
On aarch64, the same declaration generates a unique symbol, as specified in the Vector Function ABI.
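To make the difference concrete, here is a minimal sketch that builds the names one would expect for the declaration above on each target. The `vectorNames` helper is hypothetical. The <isa> letters are assumptions based on my reading of the two Vector Function ABIs: `b`, `c`, `d`, `e` for SSE, AVX, AVX2 and AVX512 on x86, and `n` for Advanced SIMD on aarch64. The parameter token `v` is assumed to denote a plain vector parameter.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: builds one name per <isa> token, following
// _ZGV<isa><masking><VLEN><parameter type>_<scalar name>.
std::vector<std::string> vectorNames(const std::string &isaTokens, bool masked,
                                     unsigned vlen, const std::string &params,
                                     const std::string &scalarName) {
  std::vector<std::string> names;
  for (char isa : isaTokens) {
    std::string name = "_ZGV";
    name += isa;                  // the only token that differs across extensions
    name += masked ? 'M' : 'N';   // inbranch -> 'M', notinbranch -> 'N'
    name += std::to_string(vlen); // simdlen(2) -> "2"
    name += params;               // one token per parameter
    name += '_';
    name += scalarName;
    names.push_back(name);
  }
  return names;
}

// Assumed <isa> token sets for the two targets discussed in this thread.
const std::string kX86IsaTokens = "bcde"; // SSE, AVX, AVX2, AVX512
const std::string kAArch64IsaTokens = "n"; // Advanced SIMD
```

Under these assumptions, `vectorNames(kX86IsaTokens, false, 2, "v", "foo")` yields four names (`_ZGVbN2v_foo` through `_ZGVeN2v_foo`), while `vectorNames(kAArch64IsaTokens, false, 2, "v", "foo")` yields the single name `_ZGVnN2v_foo`.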
This means that the attribute (or metadata) that carries the information on the available vector versions must also encode things that are not usually visible at IR level, but that are still needed to decide which particular instruction set / vector extension to target.
I used an example based on `declare simd` instead of `declare variant` because the attribute/metadata needed for `declare variant` is a modification of the one needed for `declare simd`, which has already been agreed on in a previous RFC proposed by Intel [1], and for which Intel has already provided an implementation [2]. The changes proposed in this RFC are fully compatible with the work that is being done for the VecClone pass in [2].
[1] http://lists.llvm.org/pipermail/cfe-dev/2016-March/047732.html
[2] VecClone pass: https://reviews.llvm.org/D22792
The good news is that, as far as AArch64 and x86 are concerned, the only thing that will differ in the mangled name is the “<isa>” token. As far as I can tell, the mangling scheme for the rest of the vector name is the same, so a lot of the mangling and demangling infrastructure can be reused. In fact, the `mangleVectorParameters` function in https://clang.llvm.org/doxygen/CGOpenMPRuntime_8cpp_source.html#l09918 could already be shared between x86 and aarch64.
TOPIC 2: metadata vs attribute
From a functionality point of view, I don’t care whether we use metadata or attributes. The VecClone pass mentioned in TOPIC 1 uses the following:
attributes #0 = { nounwind uwtable "vector-variants"="_ZGVbM4vv_vec_sum,_ZGVbN4vv_vec_sum,_ZGVcM8vv_vec_sum,_ZGVcN8vv_vec_sum,_ZGVdM8vv_vec_sum,_ZGVdN8vv_vec_sum,_ZGVeM16vv_vec_sum,_ZGVeN16" }
This is an attribute (I thought it was metadata?). I am happy to reword the RFC using the right terminology (sorry for messing this up).
Also, @Renato expressed concern that metadata might be dropped by optimization passes - would using attributes prevent that?
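For reference, the value of the "vector-variants" attribute quoted above is a comma-separated list of mangled names, so a consumer can pick out the entries for one vector extension by looking at the <isa> token alone. The sketch below uses only the C++ standard library. `splitVariants` and `variantsForIsa` are hypothetical names for this example and do not correspond to any function in the VecClone patch.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: splits the comma-separated attribute value into names.
std::vector<std::string> splitVariants(const std::string &value) {
  std::vector<std::string> names;
  std::stringstream stream(value);
  std::string name;
  while (std::getline(stream, name, ','))
    if (!name.empty())
      names.push_back(name);
  return names;
}

// Hypothetical helper: keeps only the names whose <isa> token (the character
// right after the "_ZGV" prefix) matches the requested one.
std::vector<std::string> variantsForIsa(const std::string &value, char isa) {
  std::vector<std::string> selected;
  for (const std::string &name : splitVariants(value))
    if (name.size() > 4 && name.compare(0, 4, "_ZGV") == 0 && name[4] == isa)
      selected.push_back(name);
  return selected;
}
```

For example, calling `variantsForIsa` with the first three names of the attribute above and the token `'b'` keeps the two `_ZGVb...` entries and drops the `_ZGVc...` one.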
TOPIC 3: "There is no way to notify the backend how conformant the SIMD versions are."
@Shawn, I am afraid I don’t understand what you mean by “conformant” here. Can you elaborate with an example?
TOPIC 3: interaction of the `omp declare variant` with `clang declare variant`
I believe this is described in the section `Option behavior, and interaction with OpenMP`. The option `-fclang-declare-variant` is there to keep it orthogonal to the OpenMP-based one. Of course, we might decide to turn -fclang-declare-variant on or off by default, and define a default behavior for its interaction with -fopenmp-simd. For the sake of compatibility with other compilers, we might need to require -fno-clang-declare-variant when targeting -fopenmp-[simd].
TOPIC 3: "there are no special arguments / flags / status regs that are used / changed in the vector version that the compiler will have to 'just know'"
I believe this concern is about the problem of handling FP exceptions? If that’s the case, the compiler is not allowed to make any assumption about the vector function in that respect, and must treat it with the same knowledge it has of any other function, depending on the visibility it has in the compilation unit. @Renato, does this answer your question?
TOPIC 4: attribute in function declaration vs attribute function call site
We discussed this in the previous version of the proposal. Having it at the call sites guarantees that incompatible vector versions are not used when merging modules compiled for different targets. I don’t have a use case for this; if I remember correctly, this was asked for by @Hideki Saito. Hideki, any comment on this?
TOPIC 5: overriding system header (the discussion on #pragma omp/clang/system variants initiated by @Hal Finkel).
I thought that the split between #pragma clang declare variant and #pragma omp declare variant already provided the orthogonality between system headers and user headers. That is, a user should always prefer the omp version (for portability to other compilers) over the #pragma clang one, which would be relegated to system headers and headers provided by the compiler. Am I missing something? If so, I am happy to add a “system” version of the directive, as it would be quite easy to do given that most of the parsing infrastructure will be shared.