LLVM/CLANG: CUDA compilation fail for inline assembly code

Hi,

I am sorry for sending this query again here, but maybe I sent it to wrong list yesterday.

I am trying to compile LonestarGPU-rev2.0 benchmark suite with LLVM/CLANG.

This suite has a following piece of code (more info here):

  • static device uint get_smid(void) {

  • uint ret;

  • asm(“mov.u32 %0, %smid;” : “=r”(ret) );

  • return ret;

  • }

The original make file has nvcc compiler with a flag -Xptxas -v. It compiles with nvcc.

LLVM has -Xcuda-ptxas , which I believe is the comparable command for compiling PTX code. I get following error when I try compiling (clang 4.0).:

  1. …/…/include/cutil_subset.h:23:25: error: invalid % escape in inline assembly string

  2. asm(“mov.u32 %0, %smid;” : “=r”(ret) );

It points to %smid.

I have been trying to figure out what is this error is but NVIDIA PTX has this.

Is this a bug or something?

Thanks.

-Guru

Okay, so as I understand, LLVM inline assembly style is different than the GCC/NVPTX assembly style. So as per LLVM language reference manual following constraint codes are supported:

  • b: A 1-bit integer register.
  • c or h: A 16-bit integer register.
  • r: A 32-bit integer register.
  • l or N: A 64-bit integer register.
  • f: A 32-bit float register.
  • d: A 64-bit float register.
    Now, I am just wondering if there are equivalent constraints for %smid and %warpid? As per the NVIDIA documentation it is unsafe to rely on this information provided by %smid and %warpid, but for compiling an existing cuda code I will need to do that.

Any suggestions? Work in progress?

Thank you.

-Guru

Hi,

I am just following up on my question.

Meanwhile I looked up the ‘include/llvm/IR/IntrinsicsNVVM.td’ and found there is a definition for smid (declare i32 @llvm.nvvm.read.ptx.sreg.smid).

At this page, there is code for kernel.ll. I am lost here because this looks like LLVM code completely and not inline asm.

I also tried a cpp-only program with simple asm (so now this is gcc compatible code) and found that LLVM can compile it (may be this was a too simplified test program).

Can someone please point me in the right direction?

Thank you.

-Guru

Hi, Gurunath.

If you change this to %%smid, it seems to compile fine. I checked the
PTX and it looks right.

We may be able to fix this in clang, but at first glance it appears
nontrivial. NVCC seems to accept "%%smid" and do the right thing with
it, so it's probably just better to change the code.

-Justin