Hey,
I am working on GPU CodeGen for PowerVR platform. I started with IREE because IREE has implemented many interesting conversions for LinalgToSPIRV. I have a question with regard to IREE (5a51d2a57 Jul 18), it looks like LinalgTilingOnBuffers skips ‘gpu.func’, for example:
gpu.module @kernels {
gpu.func @matmul(%arg0: memref<16x16xf32>, %arg1: memref<16x16xf32>, %ret0: memref<16x16xf32>) {
linalg.matmul %arg0, %arg1, %ret0 :
(memref<16x16xf32>, memref<16x16xf32>, memref<16x16xf32>)
gpu.return
}
}
I run the command
iree-opt -split-input-file \
-iree-codegen-linalg-tile-and-fuse \
-iree-codegen-convert-to-gpu -canonicalize -cse \
test.mlir
But the output is unchanged. So I change my work flow to this: I first use iree-opt to compile the non gpu MLIR function, then use mlir-opt to lower affine IR to standard IR, for example
iree-opt -split-input-file \
-iree-codegen-linalg-tile-and-fuse \
-iree-codegen-convert-to-gpu -canonicalize -cse \
test.mlir | \
mlir-opt -lower-affine
// test.mlir
module attributes {
spv.target_env =
#spv.target_env<#spv.vce<v1.3, [Shader], [SPV_KHR_storage_buffer_storage_class]>,
{max_compute_workgroup_invocations = 128 : i32,
max_compute_workgroup_size = dense<[128, 128, 64]> : vector<3xi32>}>} {
func @matmul(%arg0: memref<16x16xf32>, %arg1: memref<16x16xf32>, %ret0: memref<16x16xf32>) {
linalg.matmul %arg0, %arg1, %ret0 :
(memref<16x16xf32>, memref<16x16xf32>, memref<16x16xf32>)
return
}
}
Then I Manually copy the result to a stub code, for example:
#map0 = affine_map<(d0, d1)[s0] -> (d0 * 16 + s0 + d1)>
module attributes {
gpu.container_module, spv.target_env = #spv.target_env<#spv.vce<v1.3, [Shader],
[SPV_KHR_storage_buffer_storage_class]>,
{max_compute_workgroup_invocations = 128 : i32,
max_compute_workgroup_size = dense<[128, 128, 64]> : vector<3xi32>}>} {
gpu.module @kernels {
gpu.func @matmul(..) kernel .. { .. }
.. (manual copy paste) ..
gpu.return
}
func @main() {
...
"gpu.launch_func"(%cst4, %cst4, %cst1, %cst1, %cst1, %cst1,
%arg0, %arg1, %arg2)
{ kernel = @kernels::@matmul }
: (index, index, index, index, index, index,
memref<16x16xf32>, memref<16x16xf32>, memref<16x16xf32>) -> ()
return
}
}
And execute the stub code with vulkan runner
mlir-vulkan-runner \
--shared-libs=libvulkan-runtime-wrappers.so,libmlir_runner_utils.so \
--entry-point-result=void -print-ir-after-all \
stub.mlir
I wonder if there is a better way to prevent this manyal copy? Or is there any option in MLIR or IREE to mark certain functions as gpu.func
Thanks