JIT on Intel KNC


In the past few weeks we were able to confirm that LLVM's JIT compiler can be used for our research project. This was confirmed for the x86-64 architecture (with very good performance results, by the way).

Now, one of our real target architectures is the Intel Xeon Phi (KNC) accelerator in a native execution model. When cross-compiling LLVM (3.4 RC1) for Xeon Phi with CMake following the Intel guidelines (for general cross-compilation), the compilation process stops with:

[ 88%] Building CXX object lib/Target/X86/CMakeFiles/LLVMX86CodeGen.dir/X86ISelLowering.cpp.o
[ 88%] Building CXX object lib/Target/X86/CMakeFiles/LLVMX86CodeGen.dir/X86InstrInfo.cpp.o
[ 88%] Building CXX object lib/Target/X86/CMakeFiles/LLVMX86CodeGen.dir/X86JITInfo.cpp.o
/tmp/icpcI3Tb9Aas_.s: Assembler messages:
/tmp/icpcI3Tb9Aas_.s:48: Error: `movaps' is not supported on `k1om'
/tmp/icpcI3Tb9Aas_.s:49: Error: `movaps' is not supported on `k1om'
/tmp/icpcI3Tb9Aas_.s:50: Error: `movaps' is not supported on `k1om'
/tmp/icpcI3Tb9Aas_.s:51: Error: `movaps' is not supported on `k1om'
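For reference, the configure step was along these lines (a sketch only: the source path, install prefix, and exact flag set are placeholders; `-mmic` is the Intel compiler's flag for KNC-native code generation):

```shell
# Sketch of the CMake cross-configuration (assumes the Intel compilers
# from an MPSS installation are on PATH; paths are placeholders).
cmake ../llvm-3.4 \
  -DCMAKE_SYSTEM_NAME=Linux \
  -DCMAKE_C_COMPILER=icc \
  -DCMAKE_CXX_COMPILER=icpc \
  -DCMAKE_C_FLAGS=-mmic \
  -DCMAKE_CXX_FLAGS=-mmic \
  -DLLVM_TARGETS_TO_BUILD=X86
```

The failure above happens because icpc's `-mmic` assembler rejects SSE instructions (such as `movaps`) that LLVM's x86 JIT support code emits in inline assembly.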

It seems LLVM JIT for KNC is new territory. Can the JIT be used on this architecture? Would only small patches be necessary to get this to work?



We have the same difficulties.

KNC (dubbed k1om in compiler utils) is a 64-bit device with 8087-compatible scalar arithmetic and non-standard vector arithmetic. More specifically, the widely used variant of the 64-bit ABI implemented by LLVM involves xmm registers, while KNC has no xmm registers; instead it has 512-bit-wide zmm registers. This makes the standard 64-bit binaries you're trying to compile partially incompatible with KNC. So even if you somehow succeeded in compiling them, they may fail at runtime (illegal instruction).

There are two possible solutions I know of.

First one - use KNC in 32-bit mode. KNC can run any 32-bit binary, provided a 32-bit C runtime is available and the device's Linux kernel is compiled with CONFIG_IA32_EMULATION=y. This will not allow you to use KNC's vector units, so it is worthwhile only for massive multi-threading. You do not lose much, because even if a k1om-ready LLVM backend existed, I doubt it could optimize vector arithmetic well enough without a man-year of testing/tuning the existing vectorizer.

Second one - ask Intel to raise the priority of KNC support in LLVM. As far as I know, they do not plan official KNC support at all, only KNL (Knights Landing, the next generation). According to SC13, KNL will appear in 2015…

  • D.

PathScale has a KNC LLVM backend and also a non-Intel runtime for managing the device.

We're missing some SLP vectorizer tuning and a few other performance things, but that should improve in the next couple of months. We're open to sharing our work with researchers, but please contact me off list.