Hi peeps,
I’ve got a function that returns a float array, and this seems to cause a problem on arm64 when llc is run on it with -O0, although it works with -O1. Array sizes up to and including 8 are fine; anything larger fails. Vectors larger than 8 are fine too.
The code works correctly at both optimisation levels on x64. I’ve tried osx/arm64 and linux/arm64 and get the same failure on both, while osx/x64 and linux/x64 work correctly. The simplest example I could reduce it to that still fails is this:
target datalayout = "e-m:o-i64:64-i128:128-n32:64-S128"
target triple = "arm64-apple-macosx11.0.0"
; Function Attrs: noinline norecurse nounwind optnone ssp uwtable
define i32 @main() #0 {
  %1 = alloca i32, align 4
  %2 = call [9 x float] @returnArray()
  store i32 0, i32* %1, align 4
  ret i32 4
}
; Function Attrs: argmemonly nounwind
define private [9 x float] @returnArray() #0 {
  ret [9 x float] zeroinitializer
}
Here’s a comparison of the different optimisation levels:
jenkins@server failing % llc test.failing.ll -O1
jenkins@server failing % clang test.failing.s
jenkins@server failing % ./a.out
jenkins@server failing % echo $?
4
jenkins@server failing % llc test.failing.ll -O0
jenkins@server failing % clang test.failing.s
jenkins@server failing % ./a.out
zsh: segmentation fault ./a.out
jenkins@server failing %
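In case it helps anyone probe the size boundary, here is a rough sketch of how the same test could be generated for different array sizes and run at -O0. The function names gen_ir and probe_sizes and the array$n.* file names are made up for illustration, and the sketch drops the #0 attribute references from the reproducer, so treat it as an approximation of the test above rather than the exact file.

```shell
#!/bin/sh
# Hypothetical helper: print the reproducer with an N-element float array.
# Assumption: the "#0" attribute references are dropped, so this is not
# byte-for-byte the same module as the one in the post.
gen_ir() {
  n="$1"
  cat <<EOF
target datalayout = "e-m:o-i64:64-i128:128-n32:64-S128"
target triple = "arm64-apple-macosx11.0.0"

define i32 @main() {
  %1 = alloca i32, align 4
  %2 = call [$n x float] @returnArray()
  store i32 0, i32* %1, align 4
  ret i32 4
}

define private [$n x float] @returnArray() {
  ret [$n x float] zeroinitializer
}
EOF
}

# Hypothetical helper: compile each requested size at -O0, link it with
# clang, run it, and report the status of the last command that ran.
# A working build should report 4, the value main returns.
probe_sizes() {
  for n in "$@"; do
    gen_ir "$n" > "array$n.ll"
    llc -O0 "array$n.ll" -o "array$n.s" &&
      clang "array$n.s" -o "array$n.out" &&
      "./array$n.out"
    echo "size $n: exit status $?"
  done
}
```

With those two functions defined, running `probe_sizes 7 8 9 10` on an arm64 machine should show where the behaviour changes.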
The llc I'm using is a recent LLVM 13 build:
jenkins@server failing % llc --version
LLVM (http://llvm.org/):
LLVM version 13.0.1
Optimized build with assertions.
Default target: arm64-apple-darwin20.3.0
Host CPU: cyclone
Registered Targets:
aarch64 - AArch64 (little endian)
aarch64_32 - AArch64 (little endian ILP32)
aarch64_be - AArch64 (big endian)
arm - ARM
arm64 - ARM64 (little endian)
arm64_32 - ARM64 (little endian ILP32)
armeb - ARM (big endian)
thumb - Thumb
thumbeb - Thumb (big endian)
wasm32 - WebAssembly 32-bit
wasm64 - WebAssembly 64-bit
x86 - 32-bit X86: Pentium-Pro and above
x86-64 - 64-bit X86: EM64T and AMD64
jenkins@server failing %
So the question is, what am I doing wrong?
Any help appreciated!
Cesare