Over-aligned vst1.64 and vld1.64 for arm-linux-androideabi

I am seeing clang for arm-linux-androideabi emit 16-byte-aligned vst1.64 and vld1.64, but the ABI only guarantees 8-byte alignment. The result at runtime is SIGBUS. Since clang emits “#define __BIGGEST_ALIGNMENT__ 8”, it is aware of the ABI’s maximum alignment.

The output of clang -S shows no alignment clause on the address-register operand, i.e., a plain [rN] rather than [rN:128], but disassembly of the object file shows [rN:128]. What controls that default alignment boundary?


__ -Eli

Here is a testcase:

struct R {
  void *v = nullptr;
  R(R& rx) { v = rx.v; }
  R() {}
};

struct S {
  R r;
  long long ll[2];
  int i;

  S() {}
};

extern void bar(S&);

void foo(S& sin) {
  S s(sin);
  bar(s);
}

Compile like so:
/opt/android_ndk/r15c/toolchains/llvm/prebuilt/linux-x86_64/bin/clang++ -target armv7-none-linux-androideabi -std=c++11 test.cpp -S -o test.s

An excerpt from test.s:


vld1.64 {d16, d17}, [r2]!
vst1.64 {d16, d17}, [lr]!

The vxx1.64 insns assemble to forms with 128-bit alignment, so :128 is the default alignment clause.
The load/store insns would be OK in this form, with 64-bit alignment:

vld1.64 {d16, d17}, [r2:64]!
vst1.64 {d16, d17}, [lr:64]!

However, I don’t see how to alter that default.

Passing -fmax-type-align=8 to clang does nothing, though I expected it to drop :128 to :64 for the alignment clause on the address-register operand. Passing -fmax-type-align=4 downgrades the vxx1.64 to vxx1.32.
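
For readers following along, here is a minimal sketch of the distinction at stake: an address can satisfy a 64-bit (8-byte) alignment requirement while failing a 128-bit (16-byte) one. The helper name isAligned and the sample address 0x1008 are hypothetical and chosen only for illustration; they are not part of the testcase.

```cpp
#include <cstdint>

// Hypothetical helper: true when addr is a multiple of `bytes`.
// `bytes` is assumed to be a power of two (8 for :64, 16 for :128).
constexpr bool isAligned(std::uintptr_t addr, std::uintptr_t bytes) {
    return (addr & (bytes - 1)) == 0;
}

// Convenience overload for real pointers (not usable in constant expressions).
inline bool isAligned(const void *p, std::uintptr_t bytes) {
    return isAligned(reinterpret_cast<std::uintptr_t>(p), bytes);
}

// An address such as 0x1008 is fine for a :64 access but not for a :128 one.
static_assert(isAligned(std::uintptr_t(0x1008), 8), "8-byte aligned");
static_assert(!isAligned(std::uintptr_t(0x1008), 16), "not 16-byte aligned");
```

An object that the ABI aligns to only 8 bytes may land on such an address, which is why an instruction that demands 128-bit alignment could fault on it while the :64 form would not.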


That isn't how vld1.64 works.

There are three forms of vld1.64 which load two registers (encoding comments generated using "llvm-mc --arch=arm -mattr=+neon -show-encoding"):

     vld1.64 {d16, d17}, [r2]!       @ encoding: [0xcd,0x0a,0x62,0xf4]
     vld1.64 {d16, d17}, [r2:64]!    @ encoding: [0xdd,0x0a,0x62,0xf4]
     vld1.64 {d16, d17}, [r2:128]!   @ encoding: [0xed,0x0a,0x62,0xf4]

The first requires no alignment, the second requires 64-bit alignment, the third requires 128-bit alignment.
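
The three encodings above differ only in the first byte (0xcd, 0xdd, 0xed). As a rough sketch, assuming the A32 encoding of VLD1 (multiple single elements), where bits [5:4] of the instruction word hold the align field, the required alignment can be read back from the bytes that llvm-mc prints. The function names word and alignBits are hypothetical helpers written for this illustration.

```cpp
#include <cstdint>

// Assemble the four bytes printed by llvm-mc (little-endian order)
// into one 32-bit A32 instruction word.
constexpr std::uint32_t word(std::uint8_t b0, std::uint8_t b1,
                             std::uint8_t b2, std::uint8_t b3) {
    return std::uint32_t(b0) | (std::uint32_t(b1) << 8) |
           (std::uint32_t(b2) << 16) | (std::uint32_t(b3) << 24);
}

// Required address alignment in bits, or 0 when the instruction carries
// no alignment requirement. Assumes the align field sits in bits [5:4]:
// 00 -> none, 01 -> 64, 10 -> 128, 11 -> 256.
constexpr unsigned alignBits(std::uint32_t insn) {
    return ((insn >> 4) & 0x3u) == 0 ? 0u : 32u << ((insn >> 4) & 0x3u);
}

static_assert(alignBits(word(0xcd, 0x0a, 0x62, 0xf4)) == 0,   "[r2]!");
static_assert(alignBits(word(0xdd, 0x0a, 0x62, 0xf4)) == 64,  "[r2:64]!");
static_assert(alignBits(word(0xed, 0x0a, 0x62, 0xf4)) == 128, "[r2:128]!");
```

Running the same check on the bytes of an instruction taken from an object file is one way to confirm which of the three forms was actually assembled, independent of how a particular disassembler prints it.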


I see the first form (no alignment) from "clang -S ..."
When I compile to object via "clang -c ...", subsequent disassembly shows
the third form (128-bit alignment).
What's up with that?


I can’t reproduce that; compiling and disassembling your testcase with clang and arm-linux-androideabi-objdump from NDK version r15c, I get the expected disassembly without the alignment specifier. -Eli