In the GPU dialect, most operations can optionally be made dependent on the completion of earlier operations through async tokens. At the same time, most GPU operations appear to require a token in order to be lowered to LLVM (see mlir/lib/Conversion/GPUCommon/GPUToLLVMConversion.cpp on the main branch of llvm/llvm-project).
From looking at the dialect documentation, I don’t see a way to conjure a token for an operation with no predecessors. Consider the following pseudocode:
func.func() {
%0 = gpu.alloc() ...
gpu.launch (something that uses %0)
}
I can predicate something on the result of the gpu.launch, but I don’t know how to give a token to that first gpu.alloc(). Without it, the AllocOp cannot be lowered to LLVM. I’ve been working around this by hacking the GPU-to-LLVM converter to pass null pointers into the GPU runtime layer when an async dependency isn’t present, but this doesn’t seem like the best way to do things.