When lowering to x86-64, MCJIT generates a MOVAPS (Move Aligned Packed Single-Precision Floating-Point Values) on an unaligned memory address:
movaps 88(%rdx), %xmm0
where %rdx comes in as a function argument with only natural alignment (float*). This x86 instruction requires the memory address to be 16-byte aligned, which 88 plus a base address that is only 4-byte aligned is not guaranteed to be.
Here is the corresponding IR, which was produced by the SLP vectorizer:
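The arithmetic behind that claim can be sketched in a few lines of C++. This is only an illustration: the helper name isMovapsSafe and the constant kOffset are made up for this example, and the base addresses are plain integers rather than real pointers.

```cpp
#include <cstdint>

// Byte offset of element 22 in a float array: 22 * 4 = 88.
constexpr std::uint64_t kOffset = 22 * sizeof(float);

// Hypothetical helper: true if base + offset meets the 16-byte
// alignment that MOVAPS requires for its memory operand.
constexpr bool isMovapsSafe(std::uint64_t base, std::uint64_t offset) {
  return (base + offset) % 16 == 0;
}

// A 4-byte-aligned base can have four residues modulo 16.
// Only one of them makes base + 88 a multiple of 16.
static_assert(!isMovapsSafe(0, kOffset), "0 + 88 = 88, 88 % 16 == 8");
static_assert(!isMovapsSafe(4, kOffset), "4 + 88 = 92, 92 % 16 == 12");
static_assert(isMovapsSafe(8, kOffset), "8 + 88 = 96, 96 % 16 == 0");
static_assert(!isMovapsSafe(12, kOffset), "12 + 88 = 100, 100 % 16 == 4");
```

So even a 16-byte-aligned base would fault here, and a float* gives no guarantee beyond 4 bytes.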
define void @func(float* noalias %arg0, float* noalias %arg1, float* noalias %arg2) {
entrypoint:
  ...
  %104 = getelementptr float* %arg0, i32 22
  ...
  %204 = bitcast float* %104 to <4 x float>*
  store <4 x float> %198, <4 x float>* %204
This in itself is not wrong. However, shouldn't the lowering pass recognize the wrong alignment?
I am using LLVM 3.4.2, built from the source code available at llvm.org.
The LLVM IR is wrong. Omitting the align directive on the store means the ABI alignment of the stored type for the target. The backend is “right” with respect to LLVM IR semantics to produce the movaps.
The error is in the producer of said vector store (it looks like the SLP vectorizer). Could you provide a full test case where running the SLP vectorizer (opt -slp-vectorizer < t.ll) produces such an output?
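To make the rule concrete, here is a toy C++ model of how the effective alignment of such a store is derived. It is not LLVM's API: the names kAbiAlignV4F32, effectiveAlignment and mayUseMovaps are invented for this sketch, and the value 16 assumes the default x86-64 layout, which aligns 128-bit vectors to 16 bytes.

```cpp
// Assumed ABI alignment (bytes) of <4 x float> under the default
// x86-64 data layout.
constexpr unsigned kAbiAlignV4F32 = 16;

// An explicit alignment of 0 stands for "no align directive",
// in which case the ABI alignment of the stored type applies.
constexpr unsigned effectiveAlignment(unsigned explicitAlign) {
  return explicitAlign != 0 ? explicitAlign : kAbiAlignV4F32;
}

// The backend may pick the aligned move only if the IR promises
// at least 16 bytes; otherwise it has to use an unaligned move.
constexpr bool mayUseMovaps(unsigned explicitAlign) {
  return effectiveAlignment(explicitAlign) >= 16;
}

static_assert(mayUseMovaps(0), "no align directive: ABI alignment, movaps allowed");
static_assert(!mayUseMovaps(4), "align 4: unaligned move required");
```

Under this model the store above, which carries no align directive, licenses the movaps, while the same store with align 4 would not.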
The following code in the SLP vectorizer should have made sure that we created an alignment of “4 bytes” given a data layout (http://llvm.org/docs/LangRef.html#data-layout) that specifies f32:32:32.
case Instruction::Store: {
  StoreInst *SI = cast<StoreInst>(VL0);
  unsigned Alignment = SI->getAlignment();
  ...
  StoreInst *S = Builder.CreateStore(VecValue, VecPtr);
  if (!Alignment)
    Alignment = DL->getABITypeAlignment(SI->getPointerOperand()->getType()); // << Get the 4-byte alignment for the scalar float store from the data layout string.
  S->setAlignment(Alignment);
Your .ll file does not have a data layout. Without one, opt will not initialize the DataLayoutPass, and the SLP vectorizer will not vectorize because there is no DataLayoutPass. With a default data layout passed on the command line:
debug-cmake/bin/opt -default-data-layout="e-m:e-i64:64-f80:128-n8:16:32:64-S128" -basicaa -slp-vectorizer -S </Users/arnold/Downloads/module_H7ktW0.ll | grep "<4 x" | grep store
store <4 x float> %198, <4 x float>* %204, align 8
There is a bug in the SLPVectorizer, however: it should be “align 4”. We get the alignment of the pointer type, which is not what we want. We want the alignment of the stored/loaded value. It should be
if (!Alignment)
  Alignment = DL->getABITypeAlignment(SI->getValueOperand()->getType());
I am not sure that would fix your issue though, because it would mean we return the wrong alignment rather than none.
If the getABITypeAlignment call returns 0, then something has gone wrong in setting up the data layout in your compilation pipeline.
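The difference between the two fallbacks can be sketched with a small standalone C++ model. It does not use LLVM: the enum Ty and the functions abiAlign and pickAlign are hypothetical, and the numbers assume f32:32:32 plus 64-bit pointers with 8-byte ABI alignment.

```cpp
// Stand-ins for the two types involved in the scalar store.
enum class Ty { Float, FloatPtr };

// Assumed ABI alignments in bytes: 4 for float, 8 for a pointer.
constexpr unsigned abiAlign(Ty T) { return T == Ty::Float ? 4u : 8u; }

// Mirrors the pattern "if (!Alignment) Alignment = <ABI alignment>":
// keep an explicit alignment, otherwise fall back to the ABI
// alignment of the given type.
constexpr unsigned pickAlign(unsigned Explicit, Ty Fallback) {
  return Explicit != 0 ? Explicit : abiAlign(Fallback);
}

// Falling back to the pointer type reproduces the observed "align 8".
static_assert(pickAlign(0, Ty::FloatPtr) == 8, "pointer operand type");
// Falling back to the stored value type gives the intended "align 4".
static_assert(pickAlign(0, Ty::Float) == 4, "value operand type");
```

Neither fallback yields 0 in this model, which matches the point above that a missing alignment would indicate a data layout setup problem instead.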