How to use LLVM to compile CUDA to IR?

I have got as far as compiling CUDA to PTX with clang, e.g.:

clang++-3.8 -I/usr/local/cuda-7.5/include llvm-sample.cu -S -o llvm-sample.ll

(except that the resulting “.ll” file actually contains PTX…)

Adding -emit-llvm causes IR to be emitted instead of PTX. However, it only emits the host-side IR, and I need the device-side IR. (Well… I need both, but device side is the higher priority…) Currently I’m doing:

clang++-3.8 -I/usr/local/cuda-7.5/include llvm-sample.cu -emit-llvm -S -o llvm-sample.ll

How do I emit device-side IR?