Actually, this brings up the larger question of what to do about ops that might not exist in `std` but still might have a reasonable intrinsic in LLVM/SPIR-V/CUDA/etc. I'd like to be able to have a placeholder op at a high level and then, at some later lowering stage, decide that it should turn into a runtime call or some microcode.
So far, operations have been introduced to `std` in a need-driven, case-by-case way. While it would be nice to have general guidelines on what gets included into the `std` dialect, that requires defining what the `std` dialect is in the first place, which has proved to be quite contentious in the past (I personally think we should rather have composable dialects with more meaningful inclusion criteria than "standard"). Since new operations are only added every once in a while, we have managed to discuss them on a case-by-case basis until now.
Adding `sqrtf` as a unary arithmetic operation that extends pointwise to tensors, similarly to other operations in `std`, sounds good to me. I wouldn't treat it as a "placeholder"; it should have specific semantics.
+1. This should allow vector types as well. There are hardware instructions (e.g. `vsqrtpd`) that compute square root on vectors (so it's "low-level" enough), and on the LLVM path, the `llvm.sqrt` intrinsic could be used to readily lower it.
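A minimal sketch of what this could look like, assuming the op is spelled `sqrt` and follows the assembly format of the existing `std` unary math ops, and assuming the LLVM dialect exposes the intrinsic as `llvm.intr.sqrt` (both spellings are assumptions here, not settled decisions):

```mlir
// Hypothetical std-level op, extending pointwise to vectors and tensors
// like the other std unary math ops (name and syntax are placeholders):
%a = sqrt %x : f32
%b = sqrt %v : vector<4xf32>
%c = sqrt %t : tensor<8x16xf32>

// A plausible result of the Standard-to-LLVM conversion for the scalar and
// vector cases, written in the generic op form; this would map onto LLVM's
// llvm.sqrt.* intrinsics:
%d = "llvm.intr.sqrt"(%x) : (f32) -> f32
%e = "llvm.intr.sqrt"(%v) : (vector<4xf32>) -> vector<4xf32>
```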
Just to provide some precedent: we have recently added `ExpOp` to standard, which lowers differently depending on whether we are going to LLVM IR, NVVM, or ROCDL. For proper LLVM, we convert it into an `LLVM::ExpOp`. The GPU-specific lowering is handled by the pattern defined in lib/Conversion/GPUCommon/IndexIntrinsicsOpLowering.h.
We steer the conversion by declaring that `LLVM::ExpOp` is illegal when lowering from Standard to LLVM + NVVM. See lib/Conversion/GPUToNVVM/LowerGpuOpsToNVVMOps.cpp:715 for an example.
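To make the precedent concrete, here is a rough sketch (not copied from the actual lowering code) of how the same `exp` could end up differing per path; the generic-form `llvm.intr.exp` spelling and the libdevice callee `__nv_expf` on the NVVM path are assumptions for illustration:

```mlir
// Input (Standard dialect):
%0 = exp %x : f32

// Default Standard-to-LLVM path: keep the LLVM intrinsic form.
%1 = "llvm.intr.exp"(%x) : (f32) -> f32

// Standard-to-LLVM + NVVM path: LLVM::ExpOp is declared illegal, so the op
// is instead rewritten into a call to a device library function (assumed
// here to be __nv_expf for f32).
llvm.func @__nv_expf(f32) -> f32
%2 = llvm.call @__nv_expf(%x) : (f32) -> f32
```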
We also plan to add `rsqrt` at some point, unless someone beats us to it.