How to lower gpu.launch correctly

weilinquan · December 3, 2023, 1:26pm

Hi everyone. I have an MLIR function that calls gpu.launch.

func.func @gpu_func(%arg4: !gpu.async.token ){
  %c1 = arith.constant 1 : index  
  %c2 = arith.constant 32 : index 
  %2 = gpu.launch async [%arg4] blocks(%arg7, %arg8, %arg9) in (%arg13 = %c1, %arg14 = %c1, %arg15 = %c1) threads(%arg10, %arg11, %arg12) in (%arg16 = %c2, %arg17 = %c1, %arg18 = %c1) {
    gpu.terminator
  } 
  return 
}

Here is my lowering pass pipeline.

${MLIR_OPT} ./test.mlir \
		--arith-bufferize \
		--finalizing-bufferize \
		--arith-expand \
		--convert-arith-to-llvm \
		--convert-gpu-to-nvvm \
		--llvm-request-c-wrappers \
		--convert-func-to-llvm

This is the MLIR I get as output.

module {
  func.func @gpu_func(%arg0: !gpu.async.token) attributes {llvm.emit_c_interface} {
    %0 = llvm.mlir.constant(1 : index) : i64
    %1 = builtin.unrealized_conversion_cast %0 : i64 to index
    %2 = llvm.mlir.constant(32 : index) : i64
    %3 = builtin.unrealized_conversion_cast %2 : i64 to index
    %4 = gpu.launch async [%arg0] blocks(%arg1, %arg2, %arg3) in (%arg7 = %1, %arg8 = %1, %arg9 = %1) threads(%arg4, %arg5, %arg6) in (%arg10 = %3, %arg11 = %1, %arg12 = %1) {
      gpu.terminator
    }
    llvm.return
  }
}

There seems to be a problem with the builtin.unrealized_conversion_cast from i64 to index, and the gpu.launch operation does not appear to be lowered to NVVM IR. Is there something wrong with my lowering pass pipeline or my MLIR function? Thank you for your help!

mehdi_amini · December 3, 2023, 11:13pm

You are at least missing the gpu-kernel-outlining pass here to turn gpu.launch into gpu.launch_func.

Try maybe --test-lower-to-nvvm on your example?
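
For illustration, here is roughly what the function from the first post should look like after running --gpu-kernel-outlining on its own. This is a hand-written sketch, not captured tool output: the symbol name @gpu_func_kernel is an assumption based on how the pass usually names outlined kernels, and the exact attributes depend on the MLIR version.

// Sketch only: approximate result of --gpu-kernel-outlining.
// The body of gpu.launch is moved into a gpu.func inside a gpu.module,
// and the launch itself becomes a gpu.launch_func that refers to it.
module attributes {gpu.container_module} {
  func.func @gpu_func(%arg0: !gpu.async.token) {
    %c1 = arith.constant 1 : index
    %c32 = arith.constant 32 : index
    %0 = gpu.launch_func async [%arg0] @gpu_func_kernel::@gpu_func_kernel blocks in (%c1, %c1, %c1) threads in (%c32, %c1, %c1)
    return
  }
  // Assumed name; the device-side lowering (convert-gpu-to-nvvm) runs on this module.
  gpu.module @gpu_func_kernel {
    gpu.func @gpu_func_kernel() kernel {
      gpu.return
    }
  }
}

Once the kernel lives in its own gpu.module, convert-gpu-to-nvvm has something to work on. The leftover builtin.unrealized_conversion_cast ops in the original output are most likely there because gpu.launch still consumed index values after arith had already been converted to i64.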

weilinquan · December 4, 2023, 2:21am

Thanks, I got it! I would also like to know whether there is a more general way to lower to NVVM, such as a general-purpose pass available on the command line, rather than test-lower-to-nvvm. Is such a command-line pass currently in development?

mehdi_amini · December 4, 2023, 5:54am

test-lower-to-nvvm is not a pass, it is an “example pipeline” that you can take inspiration from by looking at all the passes it invokes.

weilinquan · December 4, 2023, 6:13am

OK, thank you very much.
