How exactly is datatype alignment determined?

Hi,

I'm seeing a bug in the AVR backend that seems to be caused by LLVM thinking things will be aligned to 8 bytes whereas they are unaligned. Specifically, MF->getDataLayout().getPrefTypeAlignment(Ty) returns 8 for the following two types:

%opt = type { i8, [0 x i8], [3 x i8] }
%Machine = type { i16, [0 x i8], i16, [0 x i8], [16 x i8], [0 x i8] }

The target datalayout specifies that pointers are aligned to 8 bits (i.e. unaligned), so I would expect getPrefTypeAlignment to return 1:

target datalayout = "e-S8:p:16:8-i8:8-i16:8-i32:8-i64:8-f32:8-f64:8-n8-a:8"

So where does that datatype alignment result of 8 come from?

Thanks,
   Gergo

Probably from LargeArrayMinWidth/LargeArrayAlign settings in Targets.cpp (in clang).

-Krzysztof

Wait what? In clang? But my input is already LLVM IR. MF->getDataLayout().getPrefTypeAlignment(Ty) must be basing its answer on either something in the IR file, or in the target implementation, but clang is not really in the picture.

Actually, tracking down the sequence of function calls, it turns out that '8' is ultimately coming from the following call in DataLayout::getAlignment:

getAlignmentInfo(AGGREGATE_ALIGN, 0, abi_or_pref, Ty);

This seems to return 8 with the following datalayout string:

e-S8:p:16:8-i8:8-i16:8-i32:8-i64:8-f32:8-f64:8-n8-a:8

I think my problem is that 'a:8' probably doesn't mean what I think it should mean. What is the difference between 'a:8' and 'a:0'?

OK, I just tried with 'a:0' and got an identical result. I am now quite sure I am thoroughly confused about what these alignment settings mean.

Does 'a:8' mean single-byte alignment (i.e. *no* alignment)? If yes, does that mean getAlignment for a struct type should return 1? In fact, when getAlignment returns 8, does that mean 8-bit alignment (no alignment) or 8-byte (64-bit) alignment? Or does it mean the lowest 8 bits of the address need to be 0, i.e. 256-byte alignment?

Would http://llvm.org/docs/LangRef.html#langref-datalayout be of help?

  • chenwj

The 8 in the data layout string should have been converted to a byte value by this code before it was passed to setAlignment. As far as I can tell, getAlignment should return the byte alignment that was passed to setAlignment, not the bit alignment from the string.

// ABI alignment.
if (Rest.empty())
  report_fatal_error(
      "Missing alignment specification in datalayout string");
Split = split(Rest, ':');
unsigned ABIAlign = inBytes(getInt(Tok));
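
For anyone following along, the bits-to-bytes step that the snippet above relies on can be sketched in isolation. This is only a rough illustration, not the LLVM source: bitsToBytes, abiAlignInBytes and checkThreadExamples are hypothetical names made up here, and the parsing is deliberately simplified (it just takes the first field after the first ':', so pointer specs such as "p:16:8", which carry an extra size field, are not handled).

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for the bits-to-bytes conversion; not copied from LLVM.
unsigned bitsToBytes(unsigned bits) {
    if (bits % 8 != 0)
        throw std::invalid_argument("alignment in bits must be a multiple of 8");
    return bits / 8;
}

// Hypothetical helper: ABI alignment in bytes for a spec such as "a:8" or
// "i64:64:64". Pointer specs ("p:16:8") are intentionally not supported.
unsigned abiAlignInBytes(const std::string &spec) {
    const std::size_t colon = spec.find(':');
    if (colon == std::string::npos || colon + 1 == spec.size())
        throw std::invalid_argument("missing alignment specification");
    // std::stoul stops at the next ':' so "8:8" parses as 8.
    return bitsToBytes(static_cast<unsigned>(std::stoul(spec.substr(colon + 1))));
}

// Usage: the aggregate spec from the thread, written in bits, maps to 1 byte.
void checkThreadExamples() {
    assert(abiAlignInBytes("a:8") == 1);
    assert(abiAlignInBytes("a:8:8") == 1);
}
```

Under these assumptions, a value of 8 coming back from an alignment query would correspond to "a:64" in the string, not "a:8".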

Thanks, yes, I can see that the '8' is converted to the byte alignment value '1', which is correct.
Now I've added lots of debug printing to IR/DataLayout.cpp, and it seems that 'a:8' or 'a:8:8' is correctly processed from the IR file to set the aggregate alignment to the value '1'. However, I see that there's also someone else calling setAlignment, *after* the .ll file's datalayout specification is applied. And that someone else is explicitly setting the aggregate alignment to 8 bytes.

So now I guess I'll have to track down who else is overriding the alignment settings.

aaaargh so apparently the TargetMachine subclass passes a datalayout string to the LLVMTargetMachine base class's constructor, and that datalayout string takes precedence... and in the AVR case, that datalayout string is wrong.

Thanks, I think I'll be able to handle it from here.
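
The override described above can be pictured with a toy model. This is a sketch under stated assumptions, not LLVM's DataLayout class: ToyLayout and its members are hypothetical, only the 'a:<bits>' entry is interpreted, and the second layout string used in the usage note is invented for illustration because the actual string in the AVR backend is not quoted in this thread.

```cpp
#include <sstream>
#include <string>

// Toy model (hypothetical, not LLVM's DataLayout): remembers only the
// aggregate ABI alignment, in bytes, from the most recently applied string.
struct ToyLayout {
    unsigned aggregateAbiAlignBytes = 0;

    // Walk the '-'-separated specs and apply any "a:<bits>" entry.
    // Every other spec is ignored in this sketch.
    void apply(const std::string &layout) {
        std::istringstream in(layout);
        std::string spec;
        while (std::getline(in, spec, '-')) {
            if (spec.size() >= 3 && spec[0] == 'a' && spec[1] == ':') {
                const unsigned bits =
                    static_cast<unsigned>(std::stoul(spec.substr(2)));
                aggregateAbiAlignBytes = bits / 8;  // string is in bits
            }
        }
    }
};
```

Applying the module's string first ("...-n8-a:8") leaves aggregateAbiAlignBytes at 1; applying a second, made-up string containing "a:64" afterwards changes it to 8. That matches the symptom: whichever string is applied last determines the answer.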