PTX generation from CUDA file for compute capability 1.0 (sm_10)

Hello,

When generating PTX output from a CUDA file (.cu file), the minimum target accepted by LLVM is sm_20. But I have a specific requirement to generate PTX output for compute capability 1.0 (sm_10). Does any previous version of LLVM support this?

Thank you,
Ginu

Hi Ginu,

I don’t believe so, no. Art and Justin would know best whether any version did, or how difficult it would be to target that far back.

-eric

What happens if you hack llvm to accept sm_10? Do you get an
error somewhere further down the pipeline?

sm_10 is pretty old hardware - why the strong dependency on this?

Hello Bergström/Eric,

Thanks for the reply. The G80 (sm_10) architecture was ported to an FPGA by a group of researchers (http://www.ecs.umass.edu/ece/tessier/andryc-fpt13.pdf). Our group has some further research interest in this work. I have been working on modifying Clang-LLVM for a couple of months and have made the required changes. But Clang-LLVM only allows me to generate PTX for sm_20, sm_30, etc. When I tried to generate PTX for sm_10, it gave:

error: unknown target CPU ‘sm_10’

fatal error: cannot open file ‘/tmp/shared-395893.s’: No such file or directory
1 error generated.

The compilation command used is:
clang -Xclang -I$LIBCLC/include/generic -I$LIBCLC/include/ptx -Dcl_clang_storage_class_specifiers -O3 CudaSource.cu -S -o PtxOutput.ptx --cuda-gpu-arch=sm_10

Is there any chance that this error comes from the CUDA runtime alone, since I am using CUDA 7.5, which does not support sm_10? If the error is independent of LLVM and is only due to CUDA, I have some hope of using an older CUDA version. Please let me know your suggestions.

Thank you,
Ginu

Hi, Ginu.

No earlier version of llvm supports sm_10. It’s not something I have looked at deeply, but I expect adding support would be nontrivial, because one would have to teach the nvptx backend which machine instructions are and are not available in that architecture.

Regards,
-Justin

I think you may have answered your own question - if CUDA 7.5 doesn't
support sm_10 then you'll eventually hit that as a problem. So yes, if
possible, make sure you're installing a version of the CUDA toolkit which
supports the target you need.

I'm not certain of the exact compilation flow here, but
ensure the underlying nvcc and ptxas work for sm_10 first, before
trying to get llvm involved.
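
The check suggested above could be scripted roughly as follows. This is only a sketch: the helper names (nvcc_ptx_command, ptxas_command, try_command) and the output file name PtxOutput.cubin are made up for illustration, and it assumes nvcc and ptxas from the toolkit under test are on PATH. It asks nvcc to stop after PTX generation for the requested architecture, then asks ptxas to assemble that PTX, and reports whether each step succeeded.

```python
import shutil
import subprocess


def nvcc_ptx_command(arch, source, output):
    """Build an nvcc command line that stops after emitting PTX for `arch`."""
    return ["nvcc", f"-arch={arch}", "-ptx", source, "-o", output]


def ptxas_command(arch, ptx, output):
    """Build a ptxas command line that assembles `ptx` for `arch`."""
    return ["ptxas", f"-arch={arch}", ptx, "-o", output]


def try_command(cmd):
    """Run `cmd`; return (ok, message).

    ok is False when the tool is not on PATH or exits with a non-zero status.
    """
    if shutil.which(cmd[0]) is None:
        return False, f"{cmd[0]} not found on PATH"
    result = subprocess.run(cmd, capture_output=True, text=True)
    return result.returncode == 0, result.stderr.strip()


if __name__ == "__main__":
    # File names are placeholders; substitute your own source file.
    steps = (
        nvcc_ptx_command("sm_10", "CudaSource.cu", "PtxOutput.ptx"),
        ptxas_command("sm_10", "PtxOutput.ptx", "PtxOutput.cubin"),
    )
    for cmd in steps:
        ok, message = try_command(cmd)
        print(" ".join(cmd), "->", "ok" if ok else f"failed: {message}")
```

If either step fails for sm_10 with the installed toolkit, no change on the llvm side will help with that toolkit.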

Previously, when I started installing LLVM and CUDA, I had a very bad time with the error “Unsupported CUDA version!”. I remembered it just now and dug a bit into the LLVM source code. In the file llvm/tools/clang/lib/Headers/__clang_cuda_runtime_wrapper.h
(http://clang.llvm.org/doxygen/cuda__runtime_8h_source.html)
there is a check on CUDA_VERSION that allows only versions between 7.0 and 7.5 (CUDA_VERSION values 7000 to 7050). So is there a handy way of checking this file in previous LLVM versions, so that I can avoid heading into a failure I am already fairly sure of and give myself a better chance of success?
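
For reference, CUDA_VERSION as defined in cuda.h encodes the toolkit version as major * 1000 + minor * 10, so bounds written as 7000 and 7050 mean CUDA 7.0 and CUDA 7.5. The sketch below mirrors that kind of range check. The function names are illustrative, and the default bounds are an assumption based on the check described above; read the actual values from the wrapper header in the LLVM checkout you are inspecting.

```python
def decode_cuda_version(cuda_version):
    """Split a CUDA_VERSION integer (major * 1000 + minor * 10) into (major, minor)."""
    return cuda_version // 1000, (cuda_version % 1000) // 10


def accepted_by_wrapper(cuda_version, lowest=7000, highest=7050):
    """Mimic a wrapper-style range check.

    The default bounds are assumed, not read from any header; pass the
    values found in the header you are actually looking at.
    """
    return lowest <= cuda_version <= highest


if __name__ == "__main__":
    for version in (6050, 7000, 7050):
        major, minor = decode_cuda_version(version)
        verdict = "accepted" if accepted_by_wrapper(version) else "rejected"
        print(f"CUDA {major}.{minor} (CUDA_VERSION {version}): {verdict}")
```

With these assumed bounds, any toolkit older than 7.0 would be rejected by the header before the backend is ever reached, so the header check and the backend's list of supported targets are two separate obstacles.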

You may be out of luck. http://reviews.llvm.org/rL178417 committed in 2013 says:
“[NVPTX] Remove support for SM < 2.0. This was never fully supported anyway.”

–Artem

Hello Artem,

Thanks for the mail. I think I may need to dig a little deeper to see which parts are supported and find some way to make the compilation work.

Best regards,
Ginu

I located Clang/LLVM release 3.2, which has some code for sm_10. Even though it is known that complete CUDA compilation for an sm_10 GPU is not possible with this version of Clang/LLVM, is there a way to find out what exactly it supports for sm_10?

Best regards,
Ginu