[AMD Public Use]
Currently CUDA/HIP and OpenMP has different offloading options, e.g.
clang++ -fopenmp -fopenmp-targets=amdgcn-amd-amdhsa -Xopenmp-target=amdgcn-amd-amdhsa -march=gfx900 test.cpp
clang++ -offload-arch=gfx906 test.hip
Our users request to have a concise way to specify offloading options for OpenMP. Ideally, one option to convey offloading kind, offloading triple, and offloading device arch.
On the other hand, there are some limitations of the current offloading option for CUDA/HIP:
1. It does not specify offloading kind whereas relies on file type to infer offloading kind. If input file is not CUDA/HIP source code (e.g. bundled LLVM bit code), there needs a way to specify offloading kind.
2. It does not specify offloading target triple whereas relies on device arch to infer target triple. As HIP is ported to different targets, there needs a way to specify offloading target triple.
In summary, a unified offloading option is preferred, which conveys offloading kind, offloading target triple and offloading device arch.
I would like to propose to either have a new option or extend the existing -offload-arch option for that, in the format kind-triple-arch, e.g.
Whereas kind and triple can be abbreviations for conciseness, e.g. omp expands to openmp, amd expands to amdgcn-amd-amdhsa. Arch can be omitted, in which case clang will use the default arch for the triple.
Your feedbacks are welcome.