I was recently updating the Android sources for Clang/LLVM and noticed that some ARM vector types were being altered by Clang (and passed differently than in prior releases of LLVM 3.x). I tracked this down to the following changelist (r166043):
Author: Manman Ren <firstname.lastname@example.org>
ARM ABI: passing illegal vector types as varargs.
We expand varargs in clang and the call site is handled in the back end, it is
hard to match exactly how illegal vectors are handled in the backend. Therefore,
we legalize the illegal vector types in clang:
if (Size <= 32), legalize to i32.
if (Size == 64), legalize to v2i32.
if (Size == 128), legalize to v4i32.
if (Size > 128), use indirect.
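For reference, the four rules quoted above can be written as a small C function. This is only a simplified model of the commit message, not Clang's actual implementation: the enum and function names are hypothetical, and it assumes the vector has already been judged "illegal" by some separate check. Sizes the commit message does not mention (for example 96 bits) are reported as unspecified rather than guessed.

```c
/* Simplified model of the rules in the r166043 commit message.
 * All names here are hypothetical; this is not Clang's code. */
enum arg_class {
    COERCE_I32,     /* Size <= 32  -> i32    */
    COERCE_V2I32,   /* Size == 64  -> v2i32  */
    COERCE_V4I32,   /* Size == 128 -> v4i32  */
    PASS_INDIRECT,  /* Size > 128  -> passed indirectly */
    UNSPECIFIED     /* size not covered by the quoted rules */
};

/* size_bits: total size of an already-illegal vector type, in bits. */
enum arg_class classify_illegal_vector(unsigned size_bits)
{
    if (size_bits <= 32)
        return COERCE_I32;
    if (size_bits == 64)
        return COERCE_V2I32;
    if (size_bits == 128)
        return COERCE_V4I32;
    if (size_bits > 128)
        return PASS_INDIRECT;
    return UNSPECIFIED; /* e.g. 96 bits: the message gives no rule */
}
```

Under this model, a <3 x i64> lands in the indirect case whether it is measured as 192 bits or padded to 256 bits, while a <2 x i8> (16 bits) is coerced to i32.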
The most peculiar part of this change (and it is definitely an ABI change from what shipped in Clang/LLVM 3.x) is that some vec3 types are now passed as indirect parameters, whereas they were previously passed directly. I attached an example file that demonstrates the new behavior. When compiled for ARMv7, the <3 x i64> parameter becomes a <3 x i64>* indirect parameter. Note that <4 x i64> is still passed as a plain <4 x i64> (the same holds for <2 x i64> and i64), and that other vec3 types (like <3 x i32>) keep their original shape and are not converted to indirect parameters.
Considering that vec3 types are fairly commonly treated as vec4 types, I don’t see why <3 x i64> and <3 x double> should be passed differently than they were before, or inconsistently with how vec4 is passed for the same element type. This change unfortunately breaks existing Android code that has been compiled with Clang, and I don’t think we are the only users of vector types on ARM that will have trouble with this.
The second major difference that I noticed (also in my sample file) is that we now lose semantic information about incoming arguments when a function takes a vector parameter that requires less than 32 bits of storage (like <2 x i8> or <3 x i8>). These parameters are explicitly coerced to a single i32, from which the function body then extracts the individual vector components. This is disruptive to optimizers that want to analyze the actual shape of the input data. I am also not sure that this actually simplifies or improves the ARM backend.
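To make the loss of shape concrete, here is a rough plain-C analogy of what the callee sees after coercion. It is a sketch, not the IR that Clang emits: the function names are hypothetical, and the lane order (lane 0 in the lowest byte) is an assumption chosen for illustration.

```c
#include <stdint.h>

/* Pack three 8-bit lanes into one 32-bit word.
 * Assumption: lane 0 occupies the lowest byte. */
uint32_t pack_char3(uint8_t x, uint8_t y, uint8_t z)
{
    return (uint32_t)x | ((uint32_t)y << 8) | ((uint32_t)z << 16);
}

/* Recover one lane (0..3) from the coerced word by shift and truncate. */
uint8_t extract_lane(uint32_t coerced, unsigned lane)
{
    return (uint8_t)(coerced >> (8u * lane));
}

/* The callee's signature mentions only a 32-bit integer; the fact that
 * it holds three separate 8-bit elements is visible only in the body. */
unsigned sum_char3_coerced(uint32_t coerced)
{
    return (unsigned)extract_lane(coerced, 0)
         + (unsigned)extract_lane(coerced, 1)
         + (unsigned)extract_lane(coerced, 2);
}
```

An analysis that looks only at the parameter list of sum_char3_coerced has no way to tell that the argument was originally a three-element vector, which is the concern raised above.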
I have experimented with removing both of these behaviors from the changelist and I can still generate working ARM code using this modified Clang. Is it possible to revert this ABI change and apply the illegal vector indirection possibly only on larger (non-power-of-2) vector types? I want to get some opinions from the rest of the community before I upload a potential patch, in case there is some other relevant information that I am missing here. I looked through cfe-dev and cfe-commits and didn’t see any comments related to the original change at all.
t.c (348 Bytes)