Hi everyone. I have an MLIR function that calls gpu.launch:
func.func @gpu_func(%arg4: !gpu.async.token) {
  %c1 = arith.constant 1 : index
  %c2 = arith.constant 32 : index
  %2 = gpu.launch async [%arg4] blocks(%arg7, %arg8, %arg9) in (%arg13 = %c1, %arg14 = %c1, %arg15 = %c1) threads(%arg10, %arg11, %arg12) in (%arg16 = %c2, %arg17 = %c1, %arg18 = %c1) {
    gpu.terminator
  }
  return
}
Here is my lowering pass pipeline:
${MLIR_OPT} ./test.mlir \
--arith-bufferize \
--finalizing-bufferize \
--arith-expand \
--convert-arith-to-llvm \
--convert-gpu-to-nvvm \
--llvm-request-c-wrappers \
--convert-func-to-llvm
After running it, I get the following MLIR:
module {
  func.func @gpu_func(%arg0: !gpu.async.token) attributes {llvm.emit_c_interface} {
    %0 = llvm.mlir.constant(1 : index) : i64
    %1 = builtin.unrealized_conversion_cast %0 : i64 to index
    %2 = llvm.mlir.constant(32 : index) : i64
    %3 = builtin.unrealized_conversion_cast %2 : i64 to index
    %4 = gpu.launch async [%arg0] blocks(%arg1, %arg2, %arg3) in (%arg7 = %1, %arg8 = %1, %arg9 = %1) threads(%arg4, %arg5, %arg6) in (%arg10 = %3, %arg11 = %1, %arg12 = %1) {
      gpu.terminator
    }
    llvm.return
  }
}
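Two things look wrong to me. First, the output still contains builtin.unrealized_conversion_cast ops from i64 to index. Second, the gpu.launch operation does not appear to have been lowered to NVVM IR; even the enclosing function is still a func.func rather than an llvm.func. Is there something wrong with my lowering pass pipeline or with the MLIR function itself? Thank you for your help!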