crash JIT with AVX intrinsics

Hi all,

I have some old code using the JIT via the LLVM-3.0-C API. I want to upgrade to newer versions of LLVM. As a simple example I wrote a C program that creates the following function and calls it:

; ModuleID = 'round-avx.bc'
target triple = "x86_64-pc-linux-gnu"

define void @round(<8 x float>*) {
_L1:
   %1 = load <8 x float>* %0
   %2 = call <8 x float> @llvm.x86.avx.round.ps.256(<8 x float> %1, i32 1)
   store <8 x float> %2, <8 x float>* %0
   ret void
}

; Function Attrs: nounwind readnone
declare <8 x float> @llvm.x86.avx.round.ps.256(<8 x float>, i32) #0

attributes #0 = { nounwind readnone }

I attach the main C file and have collected all required files in this archive:
    http://code.haskell.org/~thielema/llvm-tf/avx-instruction-selection.tar.gz

My problem is that the simple C program crashes on newer LLVM versions. The preprocessor conditionals show my attempts to replace the AVX 8-float vector by an SSE 4-float vector or by a scalar, and to replace the rounding by a simple addition.

$ make avx-instruction-selection-3.4 avx-instruction-selection-3.5 avx-instruction-selection-3.6 avx-instruction-selection-3.7 avx-instruction-selection-3.8
g++ -Wall -o create-execution-engine-3.4.o -c create-execution-engine.cpp `llvm-config-3.4 --cxxflags` -I.
gcc -Wall -o avx-instruction-selection-3.4 avx-instruction-selection.c `llvm-config-3.4 --cflags --ldflags` -I. create-execution-engine-3.4.o -lLLVM-3.4
g++ -Wall -o create-execution-engine-3.5.o -c create-execution-engine.cpp `llvm-config-3.5 --cxxflags` -I.
gcc -Wall -o avx-instruction-selection-3.5 avx-instruction-selection.c `llvm-config-3.5 --cflags --ldflags` -I. create-execution-engine-3.5.o -lLLVM-3.5
g++ -Wall -o create-execution-engine-3.6.o -c create-execution-engine.cpp `llvm-config-3.6 --cxxflags` -I.
gcc -Wall -o avx-instruction-selection-3.6 avx-instruction-selection.c `llvm-config-3.6 --cflags --ldflags` -I. create-execution-engine-3.6.o -lLLVM-3.6
g++ -Wall -o create-execution-engine-3.7.o -c create-execution-engine.cpp `llvm-config-3.7 --cxxflags` -I.
gcc -Wall -o avx-instruction-selection-3.7 avx-instruction-selection.c `llvm-config-3.7 --cflags --ldflags` -I. create-execution-engine-3.7.o -lLLVM-3.7
g++ -Wall -o create-execution-engine-3.8.o -c create-execution-engine.cpp `llvm-config-3.8 --cxxflags` -I.
gcc -Wall -o avx-instruction-selection-3.8 avx-instruction-selection.c `llvm-config-3.8 --cflags --ldflags` -I. create-execution-engine-3.8.o -lLLVM-3.8
rm create-execution-engine-3.5.o create-execution-engine-3.8.o create-execution-engine-3.7.o create-execution-engine-3.6.o

$ avx-instruction-selection-3.4
7ffe413ec6e0, size 20, align 20

$ avx-instruction-selection-3.5
7fffbf385980, size 20, align 20

$ avx-instruction-selection-3.6
7ffc7fadc220, size 20, align 20
segmentation fault

$ avx-instruction-selection-3.7
7ffcb60c9960, size 20, align 20
segmentation fault

$ avx-instruction-selection-3.8
7ffef88c3500, size 20, align 20
segmentation fault

That is, it works only for LLVM-3.4 and LLVM-3.5. However, I can make those two versions crash, too, by adding setUseMCJIT(true) to the engineBuilder in LLVMCreateExecutionEngineForModuleCPU.

Do you have any idea what might have gone wrong? How could I investigate the crash? Are there equally simple examples that show how to use the current LLVM API correctly?

Btw., my problem is a follow-up to this Stack Overflow question: "Determining and setting host target triple and instruction extensions in LLVM-C API".

avx-instruction-selection.c (3.43 KB)

Hi Henning,

> My problem is that the simple C program crashes on newer LLVM versions. The
> preprocessor conditionals show my attempts to replace the AVX 8-float vector
> by an SSE 4-float vector or by a scalar and to replace the rounding by a
> simple addition.

For some reason, even though you're getting an ExecutionEngine there's
actually been an error: "Interpreter has not been linked in.". I've
seen this one before, and (for reasons I don't fully understand) you
need to include "llvm/ExecutionEngine/MCJIT.h" in your .cpp file to
fix it.

It still doesn't run because LLVMRunFunction can't handle calling
functions with arbitrary prototypes from C or C++. It can handle the
common "main" prototypes, but not much more. Some workarounds are:

  + Pass in some context to a main-like function.
  + Compile a special main-like function that's been told the
    addresses you're using:
       define void @main() {
          call void @round(<8 x float>* inttoptr(i64 $MAGIC_NUMBER to <8 x float>*))
          ret void
       }
  + Get a raw pointer to the function, cast it to the correct C type
    yourself and call it (this assumes you're not interested in remote
    JIT).
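
As a rough sketch of the second workaround, the wrapper can be generated as text once the address of the buffer is known. The helper name format_main_wrapper is hypothetical, and only standard C is used here (no LLVM calls); the resulting text would still have to end up in the same module as @round, which is not shown.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: writes the IR text of a main-like wrapper that calls
 * @round on a fixed buffer address. Returns the number of characters that
 * snprintf wanted to write; a value >= out_size means the text was truncated. */
int format_main_wrapper(char *out, size_t out_size, const void *buffer)
{
    assert(out != NULL && out_size > 0);
    return snprintf(out, out_size,
        "define void @main() {\n"
        "  call void @round(<8 x float>* inttoptr (i64 %" PRIu64
        " to <8 x float>*))\n"
        "  ret void\n"
        "}\n",
        /* The buffer address takes the place of $MAGIC_NUMBER. */
        (uint64_t)(uintptr_t)buffer);
}
```

The address is baked into the generated code, so the buffer has to stay alive and in place for as long as the wrapper may be called.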

Cheers.

Tim.

Hi Tim,

I had been experimenting for days without success, and your answer finally solved the problem! I still have some questions.

> For some reason, even though you're getting an ExecutionEngine there's
> actually been an error: "Interpreter has not been linked in.".

I have added a check for the result of LLVMCreateExecutionEngineForModule and it returned successfully.

> I've seen this one before, and (for reasons I don't fully understand) you need to include "llvm/ExecutionEngine/MCJIT.h" in your .cpp file to fix it.

I have seen people calling LinkInMCJIT() in their code. Would that be equivalent? In ExecutionEngine/MCJIT.h I read:

       // We must reference MCJIT in such a way that compilers will not
       // delete it all as dead code, even with whole program optimization,

Anyway, neither including ExecutionEngine/MCJIT.h nor calling LinkInMCJIT affects whether my program crashes or not.

> It still doesn't run because LLVMRunFunction can't handle calling
> functions with arbitrary prototypes from C or C++.

I used LLVMRunFunction only for the example. What you say sounds like a regression from LLVM-3.5, where LLVMRunFunction worked with that prototype. Was this an intended regression, and is it documented somewhere? If yes, then people should be warned with a deprecation pragma, and I guess LLVMRunFunctionAsMain should be the preferred function then?

> + Get a raw pointer to the function, cast it to the correct C type yourself and call it (this assumes you're not interested in remote JIT).

I use LLVMGetPointerToGlobal now and this solves all problems!
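
A minimal sketch of the pointer-cast approach, for reference. To keep it compilable without LLVM, an ordinary C function (round_standin, hypothetical) stands in for the JIT-compiled @round, and lookup_round_address (also hypothetical) stands in for the call to LLVMGetPointerToGlobal. The 32-byte alignment check is an assumption: the load in the IR carries no explicit alignment, so the generated code may rely on the ABI alignment of <8 x float>, which would be 32 bytes if the "align 20" in the output above is hexadecimal.

```c
#include <assert.h>
#include <stdint.h>

/* C view of the IR signature `void @round(<8 x float>*)`. */
typedef void (*round_fn)(float *);

/* Hypothetical stand-in for the JIT-compiled function: rounds 8 floats
 * toward negative infinity in place (rounding mode 1 in the IR above).
 * The int conversion limits this stand-in to values in the range of int. */
void round_standin(float *v)
{
    for (int i = 0; i < 8; i++) {
        int t = (int)v[i];            /* truncates toward zero */
        if ((float)t > v[i]) t--;     /* step down for negative fractions */
        v[i] = (float)t;
    }
}

/* Hypothetical stand-in for the raw address that the execution engine
 * hands back as a void pointer. */
void *lookup_round_address(void)
{
    return (void *)(uintptr_t)&round_standin;
}

/* Cast the raw address to the matching C function type and call it. */
void call_round(void *address, float *vec)
{
    /* Assumption: the vector load needs a 32-byte-aligned buffer. */
    assert(((uintptr_t)vec % 32) == 0);
    round_fn f = (round_fn)(uintptr_t)address;
    f(vec);
}
```

A caller would declare the buffer with 32-byte alignment (for example with `_Alignas(32) float v[8]` in C11) and pass it to call_round together with the address obtained from the JIT.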

Thanks a lot!
Henning