Documentation on CUDA Support

As requested, Artem and I have started some documentation on the CUDA support in LLVM. The CUDA support is not fully ready yet (some patches are still pending, as the doc mentions), but it is in reasonable shape for early adopters.

We will document more down the road, such as CUDA-specific compilation flags, performance numbers, and performance tuning instructions. In the meantime, feel free to improve the doc and contribute more documentation as you see fit. Thanks!

Jingyue

Thanks for all the hard work!