[MLIR/GPU] 'CUDA_ERROR_INVALID_HANDLE' in MLIR GPU execution

To execute MLIR on a GPU, I compiled the file test.mlir. Part of the IR is shown below:

    func @test() {
      %cst_0 = arith.constant 0.0 : f32
      %2 = memref.alloc() {alignment = 16 : i64} : memref<1x6x28x28xf32>
      affine.for %arg1 = 0 to 1 {
        affine.for %arg2 = 0 to 6 {
          affine.for %arg3 = 0 to 28 {
            affine.for %arg4 = 0 to 28 {
              ··················
              memref.store %15, %2[%arg1, %arg2, %arg3, %arg4] : memref<1x6x28x28xf32>
              ··················
            }
          }
        }
      }
      affine.for %arg1 = 0 to 1 {
        affine.for %arg2 = 0 to 6 {
          affine.for %arg3 = 0 to 28 {
            affine.for %arg4 = 0 to 28 {
              ··················
              %11 = memref.load %2[%arg1, %arg2, %arg3, %arg4] : memref<1x6x28x28xf32>
              ··················
            }
          }
        }
      }
      return
    }

I compiled test.mlir down to a binary using --convert-affine-for-to-gpu --lower-affine --convert-scf-to-cf --gpu-kernel-outlining --pass-pipeline="gpu.module(strip-debuginfo, convert-gpu-to-nvvm, gpu-to-cubin)" -gpu-to-llvm and executed it. The following errors occur:

'cuStreamSynchronize(stream)' failed with 'CUDA_ERROR_ILLEGAL_ADDRESS'
'cuStreamDestroy(stream)' failed with 'CUDA_ERROR_ILLEGAL_ADDRESS'
'cuModuleUnload(module)' failed with 'CUDA_ERROR_ILLEGAL_ADDRESS'
'cuModuleLoadData(&module, data)' failed with 'CUDA_ERROR_ILLEGAL_ADDRESS'
'cuModuleGetFunction(&function, module, name)' failed with 'CUDA_ERROR_INVALID_HANDLE'
'cuStreamCreate(&stream, CU_STREAM_NON_BLOCKING)' failed with 'CUDA_ERROR_ILLEGAL_ADDRESS'
'cuLaunchKernel(function, gridX, gridY, gridZ, blockX, blockY, blockZ, smem, stream, params, extra)' failed with 'CUDA_ERROR_INVALID_HANDLE'
'cuStreamSynchronize(stream)' failed with 'CUDA_ERROR_INVALID_HANDLE'
'cuStreamDestroy(stream)' failed with 'CUDA_ERROR_INVALID_HANDLE'
'cuModuleUnload(module)' failed with 'CUDA_ERROR_INVALID_HANDLE'

I wonder whether the %2 = memref.alloc() operation, which is host code, causes the CUDA_ERROR_ILLEGAL_ADDRESS. When compiling for the GPU backend, the allocated memory is not registered with the GPU. By the way, how can I allocate GPU memory in an MLIR dialect?

Such errors are usually indicative of a misconfigured CUDA installation or an improper linking/connection to the driver.

I think the memrefs you are loading from and storing to aren't on the GPU; you will see such errors in that case.

There is an existing bug for this error behavior. As @bondhugula suggests, this is likely due to the memory not being accessible from the GPU.

In my case, inserting gpu.host_register <the memref> before gpu.launch* resolved the error.
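
For reference, a minimal sketch of that fix applied to the IR from the original post might look like the following. It assumes the gpu.host_register form that takes an unranked memref (hence the memref.cast), as used in the MLIR CUDA integration tests; the value name %unranked is made up for illustration, and the exact syntax may differ in your MLIR version.

    func @test() {
      %2 = memref.alloc() {alignment = 16 : i64} : memref<1x6x28x28xf32>
      // gpu.host_register expects an unranked memref, so cast first.
      // (%unranked is a hypothetical name, not from the original IR.)
      %unranked = memref.cast %2 : memref<1x6x28x28xf32> to memref<*xf32>
      // Make the host allocation accessible from the device.
      gpu.host_register %unranked : memref<*xf32>
      // The affine.for nests that store to and load from %2 follow here
      // unchanged; they are what gets converted into GPU launches.
      return
    }

The registration has to come before any kernel that touches %2 is launched. On the question of allocating GPU memory directly, the gpu dialect also has a gpu.alloc op, but whether it lowers cleanly with the pass pipeline above depends on the MLIR version, so treat that as something to check rather than a drop-in replacement.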