Converting NVIDIA GPU LLVM IR (NVVM) alloca instructions to AMDGPU's

I use clang++ and hipcc to generate LLVM IR for NVIDIA and AMD GPUs, respectively. The two produce different alloca instructions.
Here is an example:

__device__ void int_a_kernel() {
    int a = 1;
}

LLVM IR for the NVIDIA GPU:

define dso_local void @_Z12int_a_kernelv() #0 {
  %1 = alloca i32, align 4
  store i32 1, i32* %1, align 4
  ret void
}

But the AMDGPU IR carries more memory information:

define dso_local void @_Z12int_a_kernelv() #0 {
  %1 = alloca i32, align 4, addrspace(5)
  %2 = addrspacecast i32 addrspace(5)* %1 to i32*
  store i32 1, i32* %2, align 4
  ret void
}

Is there a way to convert LLVM IR for NVIDIA GPUs into IR for AMDGPU, or at least a way to add the addrspace qualifier and the addrspacecast instructions?

Clang has an undocumented __attribute__((address_space(N))); you can try using that.
See [0] for the semantics of different address space numbers.

[0] clang: include/clang/Basic/AddressSpaces.h Source File

Thank you very much for your reply and the useful information you provided. But I would prefer to have a way to handle LLVM IR directly.

I believe NVPTX does eventually add device-specific addrspacecasts (ASCs); see llvm/lib/Target/NVPTX/NVPTXTargetMachine.cpp

Thanks! I think that will help me a lot

It’s not more information; the address space is a fundamental part of the value.

The NVPTX alloca handling is basically an old hack: to avoid changing the IR, alloca keeps producing a generic pointer even though the value is really allocated in ADDRESS_SPACE_LOCAL. NVPTXLowerAlloca introduces a pair of casts to hack in some of the optimization benefit of the specific address space. Really replacing the address space of the value requires transitively rewriting all users.

Thanks! This conflicts somewhat with my intuition for optimizing CUDA programs. For instance, when I use shared memory to speed up repeated memory accesses, the pointer to shared memory in the LLVM IR is converted to a generic pointer. It still works, though.

I guess NVVM uses annotations to carry the address space information.