crash JIT with AVX intrinsics

I have some old code that uses the JIT via the LLVM-3.0 C API, and I want to upgrade it to newer versions of LLVM. As a simple example, I wrote a C program that creates the following function and calls it:

; ModuleID = 'round-avx.bc'
target triple = "x86_64-pc-linux-gnu"

define void @round(<8 x float>*) {
   %1 = load <8 x float>* %0
   %2 = call <8 x float> @llvm.x86.avx.round.ps.256(<8 x float> %1, i32 1)
   store <8 x float> %2, <8 x float>* %0
   ret void
}

; Function Attrs: nounwind readnone
declare <8 x float> @llvm.x86.avx.round.ps.256(<8 x float>, i32) #0

attributes #0 = { nounwind readnone }

I attach the main C file and have collected all required files in this archive:

My problem is that the simple C program crashes on newer LLVM versions. The preprocessor conditionals show my attempts to replace the AVX 8-float vector with an SSE 4-float vector or a scalar, and to replace the rounding with a simple addition.

$ make avx-instruction-selection-3.4 avx-instruction-selection-3.5 avx-instruction-selection-3.6 avx-instruction-selection-3.7 avx-instruction-selection-3.8
g++ -Wall -o create-execution-engine-3.4.o -c create-execution-engine.cpp `llvm-config-3.4 --cxxflags` -I.
gcc -Wall -o avx-instruction-selection-3.4 avx-instruction-selection.c `llvm-config-3.4 --cflags --ldflags` -I. create-execution-engine-3.4.o -lLLVM-3.4
g++ -Wall -o create-execution-engine-3.5.o -c create-execution-engine.cpp `llvm-config-3.5 --cxxflags` -I.
gcc -Wall -o avx-instruction-selection-3.5 avx-instruction-selection.c `llvm-config-3.5 --cflags --ldflags` -I. create-execution-engine-3.5.o -lLLVM-3.5
g++ -Wall -o create-execution-engine-3.6.o -c create-execution-engine.cpp `llvm-config-3.6 --cxxflags` -I.
gcc -Wall -o avx-instruction-selection-3.6 avx-instruction-selection.c `llvm-config-3.6 --cflags --ldflags` -I. create-execution-engine-3.6.o -lLLVM-3.6
g++ -Wall -o create-execution-engine-3.7.o -c create-execution-engine.cpp `llvm-config-3.7 --cxxflags` -I.
gcc -Wall -o avx-instruction-selection-3.7 avx-instruction-selection.c `llvm-config-3.7 --cflags --ldflags` -I. create-execution-engine-3.7.o -lLLVM-3.7
g++ -Wall -o create-execution-engine-3.8.o -c create-execution-engine.cpp `llvm-config-3.8 --cxxflags` -I.
gcc -Wall -o avx-instruction-selection-3.8 avx-instruction-selection.c `llvm-config-3.8 --cflags --ldflags` -I. create-execution-engine-3.8.o -lLLVM-3.8
rm create-execution-engine-3.5.o create-execution-engine-3.8.o create-execution-engine-3.7.o create-execution-engine-3.6.o

$ avx-instruction-selection-3.4
7ffe413ec6e0, size 20, align 20

$ avx-instruction-selection-3.5
7fffbf385980, size 20, align 20

$ avx-instruction-selection-3.6
7ffc7fadc220, size 20, align 20
segmentation fault

$ avx-instruction-selection-3.7
7ffcb60c9960, size 20, align 20
segmentation fault

$ avx-instruction-selection-3.8
7ffef88c3500, size 20, align 20
segmentation fault

That is, it works only with LLVM-3.4 and LLVM-3.5. However, I can make those two versions crash as well by adding setUseMCJIT(true) to the engineBuilder in LLVMCreateExecutionEngineForModuleCPU.
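
The first output line of each run appears to show the address of the vector buffer followed by its size and alignment in hexadecimal (0x20 = 32 bytes). All five addresses are multiples of 0x20, so a misaligned buffer does not look like the cause. For reference, here is a minimal sketch of how such a buffer could be allocated and checked in plain C. It assumes C11 aligned_alloc, and the names VEC_BYTES, VEC_ALIGN, is_aligned, alloc_vec8 and print_vec_info are hypothetical, not taken from the attached program.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Assumption: an <8 x float> occupies 32 bytes and is accessed with
   aligned AVX loads/stores, so the buffer should be 32-byte aligned. */
enum { VEC_BYTES = 8 * sizeof(float), VEC_ALIGN = 32 };

/* Returns 1 if p is a multiple of align; align must be a power of two. */
int is_aligned(const void *p, size_t align) {
    assert(align != 0 && (align & (align - 1)) == 0);
    return ((uintptr_t)p & (align - 1)) == 0;
}

/* Allocates one zero-filled 8-float vector on a 32-byte boundary.
   Returns NULL on failure; release the buffer with free(). */
float *alloc_vec8(void) {
    float *v = aligned_alloc(VEC_ALIGN, VEC_BYTES);
    if (v != NULL) {
        for (int i = 0; i < 8; i++) {
            v[i] = 0.0f;
        }
    }
    return v;
}

/* Prints address, size and alignment in hex, in the same shape as the
   program output shown above. */
void print_vec_info(const float *v) {
    printf("%llx, size %x, align %x\n",
           (unsigned long long)(uintptr_t)v,
           (unsigned)VEC_BYTES, (unsigned)VEC_ALIGN);
}
```

Calling alloc_vec8() and then print_vec_info() on the result from main prints a line of the form "<address>, size 20, align 20". The pointer returned by alloc_vec8() is the kind of pointer the JIT-compiled @round function expects.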

Do you have any idea what might have gone wrong? How could I investigate the crash? Are there equally simple examples that show how to use the current LLVM API correctly?

By the way, my problem is a follow-up to this StackOverflow question:

