Hi,
While testing function arguments whose type is a union on the AArch64 backend, I noticed that an argument passed by value in C is turned into a pointer argument in the final assembly output, and I don't understand why. An example is the following type __m256 (Compiler Explorer):
typedef union {
    fvec32 vect_f32;
} __m256, __m256d;
The IR dump for @_mm256_mul_ps1 is even stranger: there is an extra argument, ptr noalias sret(%union.__m256) align 16 %agg.result, before the arguments %a and %b.
__m256 _mm256_mul_ps1(__m256 a, __m256 b)
{
    __m256 res;
    res.vect_f32 = svmul_f32_z(svptrue_b32(), a.vect_f32, b.vect_f32);
    return res;
}
I'm not particularly familiar with the AArch64 ABI, but I believe the relevant rule is AAPCS64 6.8.2 B.4:
If the argument type is a Composite Type that is larger than 16 bytes, then the argument is
copied to memory allocated by the caller and the argument is replaced by a pointer to the
copy.
A union counts as a “Composite Type” here. I believe that without the union this would be a “Pure Scalable Type” instead, whose definition explicitly says (5.10):
Pure Scalable Types are never unions and never contain unions.
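To make rule B.4 concrete, here is a minimal sketch in portable C of what "replaced by a pointer to the copy" amounts to. The type vec256 and all function names are hypothetical stand-ins: vec256 is a fixed-size 32-byte union so the example compiles on any target, whereas the real __m256 wraps an SVE vector. The "lowered" functions are written by hand to mimic the transformation; they are not compiler output.

```c
#include <assert.h>

/* Hypothetical stand-in for __m256: a union larger than 16 bytes. */
typedef union {
    float f32[8];
} vec256;

/* A Composite Type larger than 16 bytes is what rule B.4 talks about. */
static_assert(sizeof(vec256) > 16, "vec256 must exceed 16 bytes");

/* What the C source says: the argument is passed by value. */
float first_lane_by_value(vec256 v)
{
    return v.f32[0];
}

/* Roughly what the callee looks like after B.4 is applied:
   it receives a pointer to a copy made by the caller. */
float first_lane_lowered(const vec256 *v_copy)
{
    return v_copy->f32[0];
}

/* Caller side of the lowered form: copy the argument into memory
   owned by the caller, then pass the address of that copy. */
float call_lowered(vec256 v)
{
    vec256 copy = v;
    return first_lane_lowered(&copy);
}
```

Both forms behave the same from the C point of view; only the way the argument reaches the callee differs, which is why the pointer shows up in the assembly but not in the source.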
sret is for indirectly passed return values: the caller allocates the return value slot on the stack and passes an sret pointer to the callee. That pointer is the extra %agg.result argument you see in the IR.
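As a rough illustration of the sret convention, the sketch below writes out by hand a signature with the same shape as the one in the IR: a result pointer first, followed by pointers to the two arguments. Everything here is an assumption made for illustration: vec256 is a hypothetical fixed-size stand-in for __m256, a plain loop replaces svmul_f32_z, and mul_lowered is hand-written rather than generated by the compiler. On AArch64 the result address is passed in register x8 rather than as an ordinary argument, which plain C cannot express.

```c
#include <assert.h>

/* Hypothetical stand-in for __m256: a 32-byte union. */
typedef union {
    float f32[8];
} vec256;

static_assert(sizeof(vec256) > 16, "vec256 must exceed 16 bytes");

/* Source-level view: arguments and result are passed by value. */
vec256 mul_by_value(vec256 a, vec256 b)
{
    vec256 res;
    for (int i = 0; i < 8; ++i)
        res.f32[i] = a.f32[i] * b.f32[i];
    return res;
}

/* Hand-written equivalent of the lowered signature:
   agg_result plays the role of the sret pointer, and a and b
   point to copies that the caller placed in memory. */
void mul_lowered(vec256 *agg_result, const vec256 *a, const vec256 *b)
{
    for (int i = 0; i < 8; ++i)
        agg_result->f32[i] = a->f32[i] * b->f32[i];
}
```

A caller of the lowered form would declare a local vec256, pass its address as agg_result, and read the product from it afterwards, which matches the "caller allocates the slot" description above.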