Does clang support CUDA?

Hi, I have been trying to compile a snippet of CUDA code, but I keep getting
a silly error. Does clang support compiling CUDA at all, or is there some
other issue behind this error? I would appreciate any help.

Here is the error:

hello_labeled.cu:46:45: error: kernel call to non-global function square_array
  square_array <<< n_blocks, block_size >>> (a_d, N);

Thanks,

Anwarul