Why does clang parse device/global functions once again when compiling for host target?

aywala · November 27, 2024, 3:11am

#ifdef __HIP_DEVICE_COMPILE__
    #define HD __attribute__((host)) __attribute__((device))
#else
    #define HD
#endif

HD void foo(){
    
}

__attribute__((global)) void kernel(){
    foo();
}

Clang reports an error about the global function when compiling for the host target. The device code has already been generated, so raising this error does not make sense.

#include <stdio.h>

#ifdef __CUDA_ARCH__
    #define HD __host__ __device__
#else
    #define HD
#endif

HD void foo() {
    printf("execute foo\n");
}

__global__  void kernel(){
    foo(); 
}

int main() {
    kernel<<<1,1>>>();
    cudaDeviceSynchronize();
}

This code sample compiles fine with nvcc.

shiltian · November 27, 2024, 5:29am

If you compile the CUDA code using clang, you’ll observe the same results as with HIP. This is because clang compiles the same source file in two passes (one for the host, one for the device) and builds a complete AST in each pass, so the kernel is parsed and checked in the host pass as well.

My understanding of nvcc is that it splits the source code before compilation, with the split parts being fed separately to different compilers (for host and device).
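
Since both passes see the whole file, one common way around the error is to make the attributes expand identically in each pass and move any pass-specific behaviour into the function body. The following is a minimal sketch, not the only possible fix: it assumes `__CUDACC__` is defined in both passes of a CUDA compile (for HIP, `__HIPCC__` would play the same role), and `running_on_device` is a hypothetical helper introduced only for illustration. The fallback branch lets the same file build as plain C++.

```cpp
// Key the macro on the compilation mode, not on the current pass.
// Assumption: __CUDACC__ is defined in both the host and the device pass,
// whereas __CUDA_ARCH__ is defined only in the device pass.
#if defined(__CUDACC__)
  #define HD __host__ __device__
#else
  // Plain C++ build: no attributes.
  #define HD
#endif

// Hypothetical helper: pass-specific code lives inside the body,
// so the declaration itself is the same in every pass.
HD int running_on_device() {
#if defined(__CUDA_ARCH__)
  return 1;  // device pass
#else
  return 0;  // host pass (or plain C++)
#endif
}

#if defined(__CUDACC__)
// The call is valid in both passes because running_on_device()
// is __host__ __device__ in each of them.
__global__ void kernel(int *out) {
  *out = running_on_device();
}
#endif
```

With this layout the host pass no longer sees a host-only function being called from a `__global__` function, which is what the original `HD` macro produced.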

aywala · November 27, 2024, 6:29am

You are right. But why not skip device functions when compiling for host target?

erichkeane · November 27, 2024, 1:44pm

We just don’t really have a mechanism for doing so or for identifying such functions. You can’t really skip functions effectively, as finding just the closing ‘}’ can be challenging, so we don’t bother. 2-pass compilation has a problem with this, and it is a downside of the architecture.
