[RFC] Let clang use system GPU as default offload arch for HIP

yxsamliu · December 1, 2022, 11:17pm

Currently, clang uses gfx803 as the offload arch when compiling HIP programs if none is specified. This is inconvenient for command-line users, because the compiled program will not run unless the system happens to have a gfx803 GPU. To compile a program for the GPU actually present on the system, users need to run a GPU detection tool and explicitly pass the result to clang via the --offload-arch option.

One way to improve the user experience is to let clang detect the system GPUs and use them as the default offload arch for HIP programs. A Phabricator review is open: D139045 [HIP] use detected GPU in --offload-arch

One concern is that this makes clang’s behaviour depend on which GPU is available on the build system. An alternative approach is to let clang detect the GPU only if --offload-arch=auto or --offload-arch=native is specified.

Your comments are welcome.

arsenm · December 2, 2022, 2:30am

I’d prefer an explicit --offload-arch=auto or =host. I still think the correct thing to do is to have a separate tool that prints the command flags to use for the detected host devices, which the build system can choose to invoke.
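
A minimal sketch of what such a helper could do, written in Python. It assumes the detected architecture names arrive as plain text, one name per line, from whatever detection utility is available; the function name `offload_arch_flags` and that input format are hypothetical, and only the `--offload-arch=` flag spelling comes from this thread.

```python
def offload_arch_flags(detected_text: str) -> list[str]:
    """Turn detected GPU arch names (one per line) into clang flags.

    Assumption: `detected_text` is the raw output of some GPU detection
    tool, with one architecture name such as "gfx906" per line.
    """
    unique_archs: list[str] = []
    for line in detected_text.splitlines():
        arch = line.strip()
        # Skip blank lines and drop duplicates (several identical GPUs),
        # keeping the order in which architectures were first seen.
        if arch and arch not in unique_archs:
            unique_archs.append(arch)
    return [f"--offload-arch={arch}" for arch in unique_archs]
```

A build system could call this once at configure time and splice the returned flags into the clang command line, which keeps the compiler itself independent of the machine it runs on.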

jdoerfert · December 2, 2022, 3:06am

I’d choose =native, then =auto, host is weird.

keryell · December 2, 2022, 4:13am

Currently, clang uses gfx803 as the offload arch for compiling HIP programs if not specified.

A low-tech first step could be to at least pick the default from the list of supported GPUs: Release Notes — ROCm 5.4.0 Documentation Home

For example, gfx906 (the Radeon VII), as the cheapest GPU on that list?

yxsamliu · December 2, 2022, 3:27pm

Thanks for pointing out that gfx803 is outdated as the default offload arch. I will update the default to gfx906.

yxsamliu · December 8, 2022, 10:43pm

It looks like most people prefer that clang detect the system GPU only when a specific --offload-arch value requests it.

I will update the Phabricator review to let clang detect the system GPU when --offload-arch=native is specified. The reason to use native is that -march=native is a well-known gcc option for a similar use case.
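
The selection rule described here can be sketched in Python as follows. This is an illustration of the proposed behaviour, not clang's implementation: the function name `resolve_offload_archs` is hypothetical, the gfx906 fallback reflects the planned default mentioned earlier in the thread, and raising an error when `native` is requested but nothing is detected is an assumption.

```python
def resolve_offload_archs(requested, detected, default="gfx906"):
    """Model how --offload-arch values could map to target architectures.

    requested: values passed via --offload-arch (may be empty).
    detected:  architectures found on the system (may be empty).
    default:   arch used when no --offload-arch is given (assumed gfx906).
    """
    if not requested:
        # No option given: fall back to the fixed default, no detection.
        return [default]

    resolved = []
    for arch in requested:
        if arch == "native":
            if not detected:
                # Assumption: an explicit request that cannot be met is an error.
                raise RuntimeError("--offload-arch=native given but no GPU detected")
            candidates = detected
        else:
            candidates = [arch]
        for candidate in candidates:
            if candidate not in resolved:
                resolved.append(candidate)
    return resolved
```

With this rule, output depends on the local hardware only when the user asks for it explicitly, which addresses the reproducibility concern raised at the start of the thread.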

jdoerfert · December 8, 2022, 11:12pm

+1, also for =native!
