TLDR;
This RFC proposes adding GPU compilation support for OpenMP target offload constructs within MLIR, i.e., compiling OpenMP MLIR programs without requiring flang or clang.
The idea is to leverage the existing compilation infrastructure in the GPU dialect to enable OpenMP compilation.
Edit:
This proposal is not about lowering OMP operations to the GPU dialect; instead, it is about using the GPU dialect's compilation infrastructure to get to an executable using only MLIR, the OpenMPIRBuilder, the libomptarget*.bc device libraries, and the existing OpenMP runtime.
Why?
Currently, there is no way to compile MLIR OMP target offload ops for GPU targets without using flang or clang. This lack of GPU support has multiple consequences:
- There are no integration tests for OMP target offload, as testing it entirely within MLIR is impossible.
- Dialect development becomes more complicated than needed, as one first requires flang to support the OMP constructs in the front end before they can be tested, creating a development barrier.
- The OMP dialect should be almost fully supported within MLIR.
Proposal:
Major:
- Add the `OffloadEmbedding` GPU compilation attribute. This attribute translates GPU binaries in a way that is compatible with libomptarget, the CUDA runtime, and the HIP runtime. It could later be used in combination with project offload to provide a more general GPU runtime.
- Add the `omp.tgt_entry_info` attribute for representing target entry information. This makes the entry explicit in the IR, simplifying the mapping between host and device symbols.
- Add the `omp-target-outline-to-gpu` pass. This pass outlines `omp.target` ops into a GPU module, making it possible to leverage the GPU compilation infrastructure.
All:
- See the Github PR section.
The proposed set of PRs would also enable JIT compilation with `mlir-cpu-runner`*, meaning that integration tests would now be possible.
- There is a small bug where the command-line option `march` gets registered twice: once by `mlir-cpu-runner` and once as a consequence of `libomptarget`. However, if the double registration is avoided, everything works.
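For illustration, an integration test could then drive the JIT path roughly as follows. This is only a sketch: `--omp-target-outline-to-gpu` is the pass proposed in this RFC, and the exact GPU compilation options, entry-point flags, and library paths depend on the local build.

```shell
# Sketch only: the outlining pass is the one proposed in this RFC, and the
# GPU serialization pipeline and runtime library paths are build-dependent.
mlir-opt input.mlir \
    --omp-target-outline-to-gpu \
    --gpu-module-to-binary \
  | mlir-cpu-runner \
      --entry-point-result=void \
      --shared-libs=/path/to/libomptarget.so
```

With the `march` double-registration issue mentioned above worked around, this is the shape an in-tree integration test could take.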
Small example:
Host-only module:
```mlir
module attributes {omp.is_target_device = false, omp.is_gpu = false} {
  func.func @targetFn() -> () attributes {omp.declare_target = #omp.declaretarget<device_type = (any), capture_clause = (to)>} {
    return
  }
  llvm.func @main() {
    omp.target {
      func.call @targetFn() : () -> ()
      omp.terminator
    }
    llvm.return
  }
}
```
After applying `mlir-opt --omp-target-outline-to-gpu`, the `omp.target` ops get outlined, as do the declare-target symbols. Furthermore, the entry information becomes explicit:
```mlir
module attributes {gpu.container_module, omp.is_gpu = false, omp.is_target_device = false} {
  gpu.module @omp_offload <#gpu.offload_embedding<omp>> attributes {omp.is_gpu = true, omp.is_target_device = true} {
    func.func @main() attributes {omp.outline_parent_name = "main"} {
      omp.target info = #omp.tgt_entry_info<deviceID = 64771, fileID = 12453258, line = 6, section = @omp_offload> {
        func.call @targetFn() : () -> ()
        omp.terminator
      }
      return
    }
    func.func @targetFn() attributes {omp.declare_target = #omp.declaretarget<device_type = (any), capture_clause = (to)>} {
      return
    }
  }
  func.func @targetFn() attributes {omp.declare_target = #omp.declaretarget<device_type = (any), capture_clause = (to)>} {
    return
  }
  llvm.func @main() {
    omp.target info = #omp.tgt_entry_info<deviceID = 64771, fileID = 12453258, line = 6, section = @omp_offload> {
      func.call @targetFn() : () -> ()
      omp.terminator
    }
    llvm.return
  }
}
```
Example from flang:
I took the following program:
```fortran
program main
  integer :: x
  integer :: y
  x = 0
  y = 1
  !$omp target map(from:x)
  x = 5 + y
  !$omp end target
  print *, "x = ", x
end program main
```
- Saved the `*-llvmir.mlir` file produced by flang. Result: test-llvmir.mlir.
- Applied the outlining pass with `mlir-opt`. Result: test-outlined.mlir.
- Applied the GPU dialect compilation passes with `mlir-opt`, and then translated to LLVM IR. Result: omp.ll.
- Compiled the IR and then ran it on an NVIDIA V100. Result: exec.log.
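The steps above can be sketched as a command pipeline. This is illustrative only: `--omp-target-outline-to-gpu` is the pass proposed in this RFC, and the exact GPU serialization pipeline, library directory, and linker flags depend on the target and the local build.

```shell
# 1. Outline omp.target ops into a GPU module (pass proposed in this RFC).
mlir-opt test-llvmir.mlir --omp-target-outline-to-gpu -o test-outlined.mlir

# 2. Serialize the GPU module to a binary (exact pipeline is target-dependent).
mlir-opt test-outlined.mlir --gpu-module-to-binary -o test-binary.mlir

# 3. Translate to LLVM IR.
mlir-translate test-binary.mlir --mlir-to-llvmir -o omp.ll

# 4. Compile, link against the OpenMP offload runtime, and run.
clang omp.ll -L"$LLVM_LIB_DIR" -lomptarget -o main
./main   # should report x = 6
```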
Github PR:
Shoutout to @jhuber6 for the linker-wrapper work, as this proposal relies heavily on it.
CC’ing people that might be interested in the proposal:
@jdoerfert
@kiranchandramohan
@clementval
@jeanPerier
@banach-space
@ftynse
@mehdi_amini
@grypp