Backend vs JIT : GPU

Gopal_Rastogi · October 9, 2013, 3:13pm

Hi guys,

I am trying to understand the OpenCL compilation flow on GPUs in order to develop an OpenCL runtime for new hardware.

As I understand it, the OpenCL compiler is part of a vendor’s runtime library, which is the heart of OpenCL. Since an OpenCL kernel is compiled at runtime, at a high level its compilation takes place in two steps:

i. source code is first converted to intermediate code.

ii. intermediate code is then translated to targeted binary code.

Let's say, for example, that we have an OpenCL kernel source file vectorAdd_kernel.cl:

OpenCL compilation flow on Nvidia GPUs

a. vectorAdd_kernel.cl is first translated to LLVM IR using clang.

b. LLVM IR is converted into optimized LLVM IR using the LLVM optimizer.

b. optimized LLVM IR is then translated to vectorAdd_kernel.ptx using the back-end.

c. vectorAdd_kernel.ptx is then translated to a vectorAdd_kernel.bin file using the JIT. Nvidia uses a JIT so that the same PTX can still be used when next-generation GPUs are encountered.
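The steps above can be approximated offline with the open-source LLVM tools plus NVIDIA's ptxas. The following is only a sketch: the function name `nvptx_pipeline`, the intermediate file names, and the default architecture `sm_35` are assumptions made for illustration, and this is not NVIDIA's internal OpenCL pipeline. It only builds the command lines and does not run them. Depending on the clang version, extra flags may be needed to make the OpenCL built-in declarations visible.

```python
from __future__ import annotations

from pathlib import Path


def nvptx_pipeline(kernel: str, arch: str = "sm_35") -> list[list[str]]:
    """Return one command line per stage, in order. Nothing is executed."""
    stem = Path(kernel).stem
    ll, opt_ll = f"{stem}.ll", f"{stem}.opt.ll"
    ptx, cubin = f"{stem}.ptx", f"{stem}.cubin"
    return [
        # Front end: OpenCL C source -> LLVM IR
        ["clang", "-x", "cl", "-target", "nvptx64-nvidia-nvcl",
         "-S", "-emit-llvm", kernel, "-o", ll],
        # Optimizer: LLVM IR -> optimized LLVM IR
        ["opt", "-O3", "-S", ll, "-o", opt_ll],
        # Back end: optimized LLVM IR -> PTX (still portable across GPUs)
        ["llc", "-march=nvptx64", f"-mcpu={arch}", opt_ll, "-o", ptx],
        # Final step done offline instead of by the JIT: PTX -> device binary
        ["ptxas", f"-arch={arch}", ptx, "-o", cubin],
    ]


if __name__ == "__main__":
    for cmd in nvptx_pipeline("vectorAdd_kernel.cl"):
        print(" ".join(cmd))
```

The first three commands correspond to the front end, optimizer, and back end; the last one stands in for the step that the driver would otherwise perform at load time.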

OpenCL compilation on AMD GPUs

a. vectorAdd_kernel.cl is first translated to LLVM IR using gcc/clang.

b. LLVM IR is then converted into optimized LLVM IR using the LLVM optimizer.

c. optimized LLVM IR is then converted into AMD IL.

d. AMD IL is then converted into AMD ISA using the shader compiler (GPU JIT).

My understanding is that AMD performs back-end compilation as part of the JIT, whereas Nvidia keeps the back-end separate from the JIT.

Is that correct? If so, what are the advantages of keeping the JIT separate from the back-end?

Thanks for your comments/opinions,

-Gopal

dmikushin · October 9, 2013, 3:38pm

Hi Gopal,

The reason is the absence or presence of an open-source IR->ISA translation component.

1.c vectorAdd_kernel.ptx is then translated to vectorAdd_kernel.cubin, which contains the device-specific binary code. The translation is performed either at runtime by the NVIDIA CUDA driver (see e.g. cuModuleLoad in the driver API), which is what is referred to as JIT, or offline with the ptxas command-line tool. In both cases, the translation stage involves closed-source components of the NVIDIA CUDA toolkit, which are not part of LLVM. There are some alternatives, such as NVVM, asfermi, and PathScale.
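To illustrate the two translation paths, here is a toy model of a loader that prefers a binary built offline for the exact device and otherwise falls back to JIT-compiling the PTX. Everything in it is hypothetical: `KernelImage`, `fake_jit`, and `load` are names invented for this sketch, and `fake_jit` merely stands in for the closed-source PTX-to-ISA translator. It does not call any real CUDA API.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class KernelImage:
    """Hypothetical container: portable PTX plus optional prebuilt binaries."""
    ptx: str
    cubins: dict[str, bytes] = field(default_factory=dict)  # arch -> binary


def fake_jit(ptx: str, arch: str) -> bytes:
    """Stand-in for the closed-source PTX -> ISA translator (not a real API)."""
    return f"[{arch}] {ptx}".encode()


def load(image: KernelImage, device_arch: str) -> tuple[bytes, str]:
    """Return (binary, origin) for the given device architecture."""
    if device_arch in image.cubins:
        # A binary built offline (e.g. with ptxas) matches this device exactly.
        return image.cubins[device_arch], "prebuilt"
    # Unknown or newer device: translate the portable PTX at load time.
    return fake_jit(image.ptx, device_arch), "jit"


if __name__ == "__main__":
    image = KernelImage(ptx="// vectorAdd PTX",
                        cubins={"sm_35": b"offline ptxas output"})
    print(load(image, "sm_35")[1])  # prints "prebuilt"
    print(load(image, "sm_50")[1])  # prints "jit"
```

In this model the PTX is the only part that has to ship with the application; the device-specific binary can always be regenerated for an architecture that did not exist when the kernel was built.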

AFAIK, the AMD pipeline, in contrast, has two options: a closed-source driver (Catalyst) and an open-source driver.

Best,

D.

Micah_Villmow · October 9, 2013, 3:44pm

Gopal,

I gave a presentation on how AMD compiles here:

http://llvm.org/devmtg/2010-11/Villmow-OpenCL.pdf

Micah
