fields in structure re-arranged for alignment?

Hi Folks,

Bear with me, I’m a newbie to LLVM.

I’ve read the language reference and the mailing list archive. One area of the semantics of the Structure type that hasn’t been discussed is whether fields in the structure get re-arranged to better suit the target machine’s natural alignment, à la what happens in C.

For example would this structure on a 32-bit machine:

{ i16, i32, i16 }

get re-arranged to

{ i16, i16, i32 } or { i32, i16, i16 }

Does LLVM guarantee consistent code generation in this case for all platforms, or is this considered target-specific and a decision left up to the back-end of the target platform?

Furthermore if re-arrangement does occur, how does this affect the semantics of GetElementPtr when it is used for pointer calculation of a Structure? For example if I had:

%BLAH = type { i16, i32, i16 }

...
%firstField = getelementptr %BLAH* %instance, i32 1, i32 0

In this example I want to access the first 16-bit integer field of Structure BLAH, so the last argument to GetElementPtr is index 0, but if the fields get re-arranged, how do I know which index to use?

Thanks,
Kean

For example would this structure on a 32-bit machine:

{ i16, i32, i16 }

get re-arranged to

{ i16, i16, i32 } or { i32, i16, i16 }

If a re-arrangement mechanism is ever added in the future,
then LLVM needs some way to encode the equivalent of the
packed/align attributes of GCC.

Sometimes it is OK to re-arrange, e.g. for things that are "just"
structs. Sometimes it's very bad to re-arrange, e.g. for
structures that have 1:1 representations in hardware, e.g. when
you communicate a USB URB or a WLAN packet.

Hi Kean,

One area of the semantics of the Structure type that hasn't been
discussed is whether fields in the structure get re-arranged to better
suit the target machine's natural alignment, à la what happens in C.

For example would this structure on a 32-bit machine:

{ i16, i32, i16 }

get re-arranged to

{ i16, i16, i32 } or { i32, i16, i16 }

(C doesn't re-order structure members; it guarantees not to.
Otherwise, various uses of structs wouldn't work.)
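
The ordering guarantee can be checked with `offsetof`. The following is a minimal sketch, assuming a hypothetical C counterpart of the `{ i16, i32, i16 }` example; the struct and function names are illustrative and do not come from the thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical C counterpart of the { i16, i32, i16 } example. */
struct blah {
    int16_t a;
    int32_t b;
    int16_t c;
};

/* Returns 1 if the members sit in memory in declaration order,
 * starting at offset 0, as the C standard requires. */
int members_in_declaration_order(void)
{
    return offsetof(struct blah, a) == 0
        && offsetof(struct blah, a) < offsetof(struct blah, b)
        && offsetof(struct blah, b) < offsetof(struct blah, c);
}
```

The compiler may insert padding between members, so the exact offsets of `b` and `c` depend on the target ABI, but their relative order does not.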

Cheers,

Ralph.

Unless the alignment or packing is overridden at the LLVM level,
structure fields are laid out at the natural alignment of the target machine.
If you have exacting layout requirements, you can use a packed
structure with alignment attributes.
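
For comparison, here is a rough sketch of the same idea at the C level. It assumes the GCC/Clang `__attribute__((packed))` extension, and the struct and function names are hypothetical. In LLVM IR the packed form of the example type is written `<{ i16, i32, i16 }>`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Natural layout: the compiler may insert padding for alignment. */
struct blah_natural {
    uint16_t a;
    uint32_t b;
    uint16_t c;
};

/* Packed layout (GCC/Clang extension): no padding between fields. */
struct __attribute__((packed)) blah_packed {
    uint16_t a;
    uint32_t b;
    uint16_t c;
};

/* Bytes of padding removed by packing. The value is ABI-dependent;
 * it is typically 4 where uint32_t is 4-byte aligned. */
size_t padding_removed(void)
{
    return sizeof(struct blah_natural) - sizeof(struct blah_packed);
}
```

With packing, the fields sit at offsets 0, 2 and 6 and the struct occupies 8 bytes, which is what a fixed wire or hardware format would need.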

It would be possible to rearrange LLVM structs as an optimization, but
many things must be checked to do that safely (e.g. that the rearranged
memory is never used by external code). I have written some struct
reordering passes before: the field fiddling is easy enough to do;
the hard part is making sure it is safe.

Andrew

Bear with me, I'm a newbie to LLVM.

Welcome,

I've read the language reference and the mailing list archive. One area of
the semantics of the Structure type that hasn't been discussed is whether
fields in the structure get re-arranged to better suit the target machine's
natural alignment, à la what happens in C.

For example would this structure on a 32-bit machine:
{ i16, i32, i16 }
get re-arranged to
{ i16, i16, i32 } or { i32, i16, i16 }

Ok.

Does LLVM guarantee consistent code generation in this case for all
platforms, or is this considered target-specific and a decision left up to
the back-end of the target platform?

There are two issues: codegen of a specific GEP/type, and optimization. The code generator currently does very trivial structure layout. If a struct is not "packed", it assigns fields sequential, consecutive addresses, inserting internal and trailing padding as needed to satisfy alignment. If the struct is packed, it does the same thing, but inserts no padding.
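
The sequential layout described above can be modelled in a few lines of C. This is a simplified sketch, not LLVM's actual implementation; `align_up` and `layout_struct` are hypothetical names, and the per-field sizes and alignments are supplied by the caller as assumptions about the target.

```c
#include <assert.h>
#include <stddef.h>

/* Round offset up to the next multiple of align (align must be >= 1). */
size_t align_up(size_t offset, size_t align)
{
    return (offset + align - 1) / align * align;
}

/*
 * Assign sequential offsets to n fields. size[i] and align[i] describe
 * field i; offsets[i] receives its byte offset. When packed is nonzero,
 * no padding is inserted. Returns the total struct size.
 */
size_t layout_struct(const size_t *size, const size_t *align, size_t n,
                     int packed, size_t *offsets)
{
    size_t offset = 0;
    size_t max_align = 1;

    for (size_t i = 0; i < n; i++) {
        size_t a = packed ? 1 : align[i];
        offset = align_up(offset, a);   /* internal padding */
        offsets[i] = offset;
        offset += size[i];
        if (a > max_align)
            max_align = a;
    }
    return align_up(offset, max_align); /* trailing padding */
}
```

For `{ i16, i32, i16 }` with sizes 2, 4, 2 and matching alignments, this model yields offsets 0, 4, 8 and a total size of 12; the packed variant yields offsets 0, 2, 6 and a size of 8. In both cases the field order is unchanged.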

The LLVM optimizer guarantees that it preserves the semantics of the input code. It would be valid for the optimizer to reorder fields, but only when it can tell that "no one will notice" (for example, the optimizer is not allowed to change the layout of memory-mapped I/O structs).

Furthermore if re-arrangement does occur, how does this affect the semantics
of GetElementPtr when it is used for pointer calculation of a Structure? For
example if I had:

%BLAH = type { i16, i32, i16 }
...
%firstField = getelementptr %BLAH* %instance, i32 1, i32 0

In this example I want to access the first 16-bit integer field of Structure
BLAH, so the last argument to GetElementPtr is index 0, but if the fields
get re-arranged, how do I know which index to use?

If the optimizer rearranges a struct, it will rewrite all the GEPs to be consistent.

-Chris