AArch64: unaligned access despite -mstrict-align

Hi,

I experienced a crash in code compiled with Clang 10.0.0 due to a
misaligned 64-bit data access. The (ARMv8) CPU is configured with SCTLR.A
== 1 (alignment check enable). With SCTLR.A == 0 the code runs as expected.

After some investigation I came up with the following reproducer:

---8<-------8<-------8<-------8<-------8<-------8<-------8<-------
$ cat test.c
extern char *g;
int memcmp(const void *s1, const void *s2, unsigned long n);

int f(void *c)
{
  return memcmp(g, c, 16);
}
$ clang --target=aarch64-linux-gnu -Os -mstrict-align -S test.c
$ cat test.s
  .text
  .file "test.c"
  .globl f // -- Begin function f
  .p2align 2
  .type f,@function
f: // @f
// %bb.0:
  adrp x8, g
  ldr x10, [x8, :lo12:g]
  ldr x9, [x0]
  ldr x8, [x10]
  rev x9, x9
  rev x8, x8
  cmp x8, x9
  b.ne .LBB0_3
// %bb.1:
  ldr x8, [x10, #8]
  ldr x9, [x0, #8]
  rev x8, x8
  rev x9, x9
  cmp x8, x9
  b.ne .LBB0_3
// %bb.2:
  mov w0, wzr
  ret
.LBB0_3:
  cmp x8, x9
  mov w8, #-1
  cneg w0, w8, hs
  ret
.Lfunc_end0:
  .size f, .Lfunc_end0-f
                                        // -- End function
  .ident "clang version 10.0.0-4ubuntu1 "
  .section ".note.GNU-stack","",@progbits
  .addrsig
---8<-------8<-------8<-------8<-------8<-------8<-------8<-------

Note the 'ldr x9, [x0]'. At this point there is no guarantee that x0 is
a multiple of 8, so why is Clang generating this code?
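
For reference, here is my reading of the inline expansion in the listing above, written as a rough C sketch. The names load_be64 and memcmp16_sketch are made up for illustration and are not anything Clang emits or provides. The sketch loads each 8-byte chunk one byte at a time, which is what I would expect to be required when nothing is known about the alignment of the pointers; the generated code does the same comparison but uses a single 64-bit 'ldr' plus 'rev' per chunk.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: assemble 8 bytes into a big-endian 64-bit value.
 * Only byte loads are used, so no alignment of p is assumed. On a
 * little-endian target the listing gets the same value with ldr + rev. */
static uint64_t load_be64(const unsigned char *p)
{
    uint64_t v = 0;
    for (size_t i = 0; i < 8; i++)
        v = (v << 8) | p[i];
    return v;
}

/* Hypothetical equivalent of memcmp(s1, s2, 16) as expanded in the listing:
 * compare two 8-byte chunks as unsigned big-endian integers and return
 * -1, 0 or 1 (the cmp / mov #-1 / cneg sequence at .LBB0_3). */
int memcmp16_sketch(const void *s1, const void *s2)
{
    const unsigned char *a = s1;
    const unsigned char *b = s2;

    for (size_t off = 0; off < 16; off += 8) {
        uint64_t x = load_be64(a + off);
        uint64_t y = load_be64(b + off);
        if (x != y)
            return x < y ? -1 : 1;
    }
    return 0;
}
```

Calling memcmp16_sketch(buf + 1, other + 3) is fine here, because every access is a byte access. The listing performs the same comparison with 64-bit loads, which fault when SCTLR.A == 1 and the address is not a multiple of 8.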

Thanks,

Hi Jerome,

> Note the 'ldr x9, [x0]'. At this point there is no guarantee that x0 is a multiple of 8, so why is Clang generating this code?

I think the point is that it can assume it is 8-byte aligned, so the question is why it isn't. I guess that requires looking into how the memory that c points to is allocated, or whether some type punning caused an address that is not properly aligned to be passed.
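
One way to check what is actually being passed would be to test the pointer value at the call site. A minimal sketch, assuming a helper named is_aligned (hypothetical, not part of any library):

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns 1 if p is a multiple of alignment, else 0.
 * alignment is assumed to be non-zero. */
static int is_aligned(const void *p, size_t alignment)
{
    return ((uintptr_t)p % alignment) == 0;
}
```

For example, is_aligned(c, 8) just before the call to f would show whether the argument happens to be 8-byte aligned. Note that a 'void *' or 'char *' carries no alignment requirement in C beyond 1, so a result of 0 here would not by itself mean that the caller did anything wrong.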

Cheers,
Sjoerd.

Is this of any relevance?

https://bugs.llvm.org/show_bug.cgi?id=44246

  - Chuck

Sorry, a quick message: please ignore what I wrote before; I got myself confused (and probably you too).
With a recent trunk build I get this:

f:
        adrp x8, g
        ldr x8, [x8, :lo12:g]
        mov w2, #16
        mov x1, x0
        mov x0, x8
        b memcmp

This looks more correct, and I need to look a bit more into this (and how clang 10.0.0 behaves).

> Sorry, quick message to ignore what I wrote before, I got myself confused (probably you too),

:-)

> With a recent trunk build I get this:
>
> f:
>         adrp x8, g
>         ldr x8, [x8, :lo12:g]
>         mov w2, #16
>         mov x1, x0
>         mov x0, x8
>         b memcmp
>
> This looks more correct, and I need to look a bit more into this (and how clang 10.0.0 behaves).

Indeed I would be quite happy with that ;-) Good to know that master
generates this. If you would like me to bisect and identify the
commit that changed the behavior, please let me know.

Thanks,

See https://reviews.llvm.org/D76113 and my followup https://reviews.llvm.org/D77599 . (I didn't really think about it at the time, but maybe worth nominating for 10.0.1.)

-Eli

OK, thanks for clarifying.