I tried to compile a CUDA program with clang-3.4. The program is taken
from the NVIDIA_CUDA-5.0_samples collection and is a very simple program
that adds two vectors.
A few modifications were made to the original code:
- I substituted the __global__ CUDA C keyword with __attribute__((global))
  in order to use clang as the compiler.
- <stdlib.h> and <math.h> were added.
- Declarations of blockDim, blockIdx, and threadIdx were added.
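For reference, a minimal sketch of what the modified kernel might look like after those changes. The vectorAdd name, the dim3 stand-in struct, and the exact declarations are assumptions for illustration, not the literal sample code:

```c
#include <stdlib.h>
#include <math.h>

/* Stand-in declarations for the CUDA special globals, mirroring the
   "declarations of blockDim, blockIdx, threadIdx were added" step.
   (As the reply below notes, plain declarations like these are not
   enough for real device code.) */
struct dim3 { unsigned x, y, z; };
struct dim3 blockDim, blockIdx, threadIdx;

/* The CUDA __global__ keyword replaced with __attribute__((global));
   compilers that don't know the attribute simply warn and ignore it. */
__attribute__((global))
void vectorAdd(const float *A, const float *B, float *C, int n)
{
    int i = blockDim.x * blockIdx.x + threadIdx.x;
    if (i < n)
        C[i] = A[i] + B[i];
}
```

On the host this compiles as ordinary C (the attribute is ignored with a warning), which is why the declaration-based approach seems to work at first.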
Clang cannot yet compile CUDA out-of-the-box like that. Definitely not
mixed CUDA code (where host and device code are in the same file). Clang
can be made, with some effort, to compile stand-alone device code, but some
critical steps are missing. For example, you have to map threadIdx and
other similar special globals to appropriate intrinsic calls
(@llvm.nvvm.read.ptx.sreg.tid.*), and not just declare them.
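To make that last point concrete, here is a host-compilable sketch of the difference between merely declaring threadIdx and actually wiring it to an intrinsic read. The stub_tid_x state and the function names are made up for illustration; in real device code the read lowers to the NVVM intrinsic, not to a stub:

```c
/* In real device code, every read of threadIdx.x must lower to the
   NVVM intrinsic @llvm.nvvm.read.ptx.sreg.tid.x, which the hardware
   fills in per thread. A plain global variable declaration compiles,
   but would only ever yield its static value. Here the intrinsic is
   replaced by a stub so the snippet runs on the host. */
static unsigned stub_tid_x;            /* stand-in for per-thread state */

static unsigned read_tid_x(void)       /* stands in for the intrinsic call */
{
    return stub_tid_x;                 /* device: llvm.nvvm.read.ptx.sreg.tid.x */
}

/* The usual global-index computation, written against the stub. */
unsigned global_index(unsigned block_dim_x, unsigned block_idx_x)
{
    return block_dim_x * block_idx_x + read_tid_x();
}
```

The point is that a compiler targeting the GPU must rewrite every such access into the intrinsic call; a declaration alone produces code that reads an uninitialized global.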
It sounds like, at the moment, CUDA support in Clang is far from
production use …
I'd like to know what the status of CUDA support in Clang is,
but I am not able to find anything reporting it.
Are you a developer of this part, or could you give me some
guidance?
There's no documentation of these parts of Clang, as far as I know, besides
the source code. To get a feel for what's supported, take a look at the
existing tests (specifically the test/SemaCUDA and test/CodeGenCUDA directories).