AMDGPU mimics JIT?


I'm looking into ways how to port our application to machines using AMD GPUs.

The way the application is currently set up for accelerated computing is the following:

1) LLVM IR modules are being built in memory during runtime

2) In case of NVIDIA GPUs each module is compiled with the NVPTX backend and the assembly file (PTX) extracted

3) The PTX is loaded with the NVIDIA kernel driver which JIT-compiles it to the actual GPU installed, and the kernel is launched on the GPU for execution.

Now, AMD doesn't seem to use an intermediate IR level comparable to PTX. As far as I understand the AMDGPU backend generates binary code (AMGCN) for the GPU kernels directly. This makes me wonder if there is any way to execute (launch) such a kernel after it has been compiled by the AMDGPU backend.

It will certainly not work in the typical HIP way using the ROCM utilities where a kernel is specified with the __global__ attribute like

__global__ void kernel(const T* in, T* out) {}

From the AMDGPU kernel's compilation at most a raw pointer is returned (and that is only in the case if the backend supports JIT, which it probably doesn't!?) otherwise it produces a library. Is it possible to 'dlload' such a (kernel) library into the address space and launch it?

Anyone has any idea how to launch such a kernel within the same program context/execution?



The closest more or less existing design to your definition is PGI OpenACC. Radeon target no longer exists in it for obvious reasons. But as long as it deploys an LLVM-based GPU pipeline, you might have non-zero luck to intercept LLVM IR before lowering to PTX and redirect it to AMDGPU. This of course shall involve writing quite a bit of glueing code to translate nvvm intrinsics to red equivalents. As a result, you will get OpenACC support, which might not that generic thing as you want. But if I were you, I’d take this approach as the one potentially most successful in reasonable time.

Kind regards,

  • Dmitry.

вт, 25 февр. 2020 г. в 18:17, Frank Winter via llvm-dev <>: