Default alignment for 'malloc'

I am trying to implement some new alignment-based optimisations in our target backend, and I am wondering if there is a way a target can specify that ‘malloc’, ‘realloc’ and ‘calloc’ always return a pointer to memory that is aligned to a particular boundary?

Relatedly, is it possible to specify that the stack pointer always points to memory which is aligned to a particular boundary?

Thanks,

MartinO

I am trying to implement some new alignment-based optimisations in our
target backend, and I am wondering if there is a way a target can specify
that ‘malloc’, ‘realloc’ and ‘calloc’ always return a pointer to memory
that is aligned to a particular boundary?

malloc is guaranteed to return memory suitably aligned for any C type.
This would be 8 bytes on most systems, for double. However, I think in
practice most modern implementations return 16-byte aligned pointers. I
don't think there is a way for the backend to annotate calls to malloc
with a specific alignment that has an effect on passes that run before
the backend.
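To make the guarantee concrete, here is a minimal sketch in C11 that checks a malloc result against `_Alignof(max_align_t)`, the strictest alignment of the standard scalar types. The helper names `is_aligned` and `malloc_meets_fundamental_alignment` are made up for illustration, and the actual value of `_Alignof(max_align_t)` depends on the platform and libc.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* max_align_t is at least as strictly aligned as any standard scalar type. */
static_assert(_Alignof(max_align_t) >= _Alignof(double),
              "max_align_t must cover double");

/* Returns 1 if p is a multiple of `alignment` (which must be a power of two). */
static int is_aligned(const void *p, size_t alignment) {
    return ((uintptr_t)p & (alignment - 1)) == 0;
}

/* Hypothetical helper: allocates `size` bytes with malloc and reports whether
   the result satisfies the fundamental alignment of the platform. */
int malloc_meets_fundamental_alignment(size_t size) {
    void *p = malloc(size);
    if (p == NULL)
        return 0;                       /* allocation failed, nothing to check */
    int ok = is_aligned(p, _Alignof(max_align_t));
    free(p);
    return ok;
}
```

A check like this only observes what one libc happens to do at run time; it does not tell the optimiser anything, which is the gap being discussed here.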

Relatedly, is it possible to specify that the stack pointer always points
to memory which is aligned to a particular boundary?

For example, for 16-byte alignment, declare with:
__attribute__((aligned(16)))
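As a minimal sketch of how that attribute is used (GCC/Clang extension syntax), the following applies it to a type and to a local variable. The names `struct vec4` and `local_buffer_is_16_byte_aligned` are hypothetical. Note that this aligns individual declarations; it says nothing by itself about the stack pointer.

```c
#include <assert.h>
#include <stdint.h>

/* Every object of this type is placed on a 16-byte boundary. */
struct vec4 {
    float v[4];
} __attribute__((aligned(16)));

static_assert(_Alignof(struct vec4) == 16, "attribute raises the type alignment");

/* Hypothetical helper: returns 1 when a stack buffer declared with the
   attribute really starts on a 16-byte boundary. */
int local_buffer_is_16_byte_aligned(void) {
    float buf[8] __attribute__((aligned(16)));
    buf[0] = 1.0f;                      /* touch the buffer so it is used */
    return ((uintptr_t)buf & 15u) == 0;
}
```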

Michael

Thanks Michael,

When vectorising loads and stores, it is very useful to know the actual alignment, especially when the memory architecture is a bit exotic. Since we can issue two simultaneous load/store instructions, I have a stronger requirement for this information than would usually be the case. While our 'malloc' does ensure optimal alignment for all data types, that information is not attached to the IR, so the compiler sees no difference between 'void *malloc(size_t)' and 'void *foo(size_t)'.

I have started experimenting with '__attribute__((alloc_align(N)))', but this requires altering the Standard headers. Of course, I could also edit the target-independent source code of the compiler to explicitly add this information to 'malloc' et al., but I was hoping that there was an existing mechanism such as a 'TargetTransformInfo' callback.
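For reference, a minimal sketch of the source-level annotations available in GCC and Clang. `alloc_align(N)` says that the N-th parameter carries the alignment (as in `aligned_alloc`), whereas `assume_aligned(N)` promises a fixed alignment of N bytes for the returned pointer, which is closer to the 'malloc' case. The wrapper `target_malloc`, the macro `TARGET_MALLOC_ALIGN` and the value 16 are assumptions for illustration, not anything the target or LLVM defines.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define TARGET_MALLOC_ALIGN 16          /* assumed allocator guarantee */

/* Hypothetical wrapper: the attribute tells the compiler that the returned
   pointer is always TARGET_MALLOC_ALIGN-byte aligned. */
__attribute__((malloc, assume_aligned(TARGET_MALLOC_ALIGN)))
void *target_malloc(size_t size);

void *target_malloc(size_t size) {
    /* C11 aligned_alloc expects the size to be a multiple of the alignment. */
    size_t rounded = (size + TARGET_MALLOC_ALIGN - 1)
                     & ~(size_t)(TARGET_MALLOC_ALIGN - 1);
    if (rounded < size)
        return NULL;                    /* rounding overflowed */
    void *p = aligned_alloc(TARGET_MALLOC_ALIGN, rounded);
    /* Debug check that the promise made by the attribute actually holds. */
    assert(((uintptr_t)p & (TARGET_MALLOC_ALIGN - 1)) == 0);
    return p;
}

/* For a pointer that comes from an unannotated function, the same promise
   can be made at the use site instead. */
float sum4(const void *raw) {
    const float *p = __builtin_assume_aligned(raw, TARGET_MALLOC_ALIGN);
    return p[0] + p[1] + p[2] + p[3];
}
```

Both forms need source or header changes, so they do not replace a target-level hook; they only show what alignment information looks like once it does reach the compiler.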

Regarding the stack pointer - I am referring to the state of the SP as it is maintained by the compiler. While I can actually ensure that it is always aligned (part of the ABI) in the prologue/epilogue, the IR doesn't possess this information, so the alignment optimisations have to specially check whether the pointer being processed is the SP or not.

All the best,

  MartinO

Note that this only applies to base types. Vector types certainly can
require larger alignment in practice and that's why posix_memalign
exists.
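A minimal sketch of requesting a larger alignment explicitly with posix_memalign, here 32 bytes as one might want for a 256-bit vector type. The helper name `alloc_floats_32` is hypothetical; the alignment passed must be a power of two and a multiple of `sizeof(void *)`.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L         /* posix_memalign is POSIX, not ISO C */
#endif
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: allocates `count` floats on a 32-byte boundary.
   Returns NULL on failure; release the memory with free(). */
float *alloc_floats_32(size_t count) {
    void *p = NULL;
    if (count > SIZE_MAX / sizeof(float))
        return NULL;                    /* byte count would overflow */
    if (posix_memalign(&p, 32, count * sizeof(float)) != 0)
        return NULL;                    /* EINVAL or ENOMEM */
    assert(((uintptr_t)p & 31u) == 0);  /* holds whenever the call succeeds */
    return p;
}
```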

Joerg

2016-10-03 13:55 GMT+02:00 Martin J. O’Riordan via llvm-dev
<llvm-dev@lists.llvm.org>:

I am trying to implement some new alignment-based optimisations in our
target backend, and I am wondering if there is a way a target can specify
that ‘malloc’, ‘realloc’ and ‘calloc’ always return a pointer to memory
that is aligned to a particular boundary?

malloc is guaranteed to return memory suitably aligned for any C type.
This would be 8 bytes on most systems, for double. However, I think in
practice most modern implementations return 16-byte aligned pointers. I
don’t think there is a way for the backend to annotate calls to malloc
with a specific alignment that has an effect on passes that run before
the backend.

Note that this only applies to base types. Vector types certainly can
require larger alignment in practice and that’s why posix_memalign
exists.

Is memalign still needed for vector types? The man page (http://www.manpagez.com/man/3/calloc/) says:

The malloc(), calloc(), valloc(), realloc(), and reallocf() functions allocate memory.

The allocated memory is aligned such that it can be used for any data type, including AltiVec- and SSE-related types.

Given this, a while ago I hacked up an LLVM pass to add align(16) to the return attributes of malloc. It saved a few KB from the clang executable size due to more efficient memcpy and related functions being generated in the backend.

So yes please to some kind of solution to this. I’d be fine with adding something to TTI if we can’t get it in the headers.

Cheers,
Pete

If you want to be portable to more than one libc implementation, yes.
Base alignment on i386 is still only 32-bit in the SysV ABI, for example.
I'm not even sure what the requirements for AVX are, e.g. whether they
finally bumped it to 256-bit for some things.

Joerg

You are right and I should have remembered: BlueGene/Q has 32 byte
vector types, and I was using posix_memalign for those. IBM certainly
would not change the system base alignment of PowerPC because of this.

posix_memalign would still have another use case: alignment to
cache-line boundaries.

Michael