I think we should add LLVM intrinsics for the following operations:
- acos (C++, HLSL)
- asin (C++, HLSL)
- atan (C++, HLSL)
- cosh (C++, HLSL)
- sinh (C++, HLSL)
- tanh (C++, HLSL)
- tan (C++, HLSL)
- atan2 (C++, HLSL)
- fmod (C++, HLSL)
- frexp (C++, HLSL)
- ldexp (C++, HLSL)
- modf (C++, HLSL)
- copysign (C++, HLSL)
- dot (HLSL)
- rsqrt (HLSL)
- clamp/uclamp (HLSL)
My primary motivation here is HLSL support in clang, but these are generally useful anywhere we (1) need vector expansions of these operations, (2) want to handle these operations even in freestanding environments, or (3) have high-level operations in the target that we want to preserve.
HLSL and the DirectX and SPIR-V backends fall into all three of these cases.
This list mostly consists of C stdlib math functions that have corresponding operations in HLSL, with the exception of dot, rsqrt, and clamp. These three don’t have C standard equivalents but are similarly generic and come up a lot in GPUs. Note that we could separate the non-C ones into a separate proposal if folks want separate justification for those three.
Some History
While we’ve had math intrinsics like llvm.sin since time immemorial, there’s generally been contention around adding more of them. There are a couple of reasons for this:
- For most targets these are just a libcall anyway, so it’s kind of pointless.
- There are a bunch of places where we (used to) treat intrinsics specially rather than look at function attributes.
IMO this has been keeping us in a place that’s the worst of both worlds.
Why Now?
There are two main things that are different today from some of the times we’ve discussed this in the past.
- We have targets that have these operations. The SPIR-V and DirectX backends need to maintain the high level function in their output.
- We have languages that want vector versions of all of these operations. OpenCL and HLSL both need vector versions of all of these.
Additionally, any approach to handling this differently is made awkward by the fact that we already have intrinsics for many math functions. If we don’t add these intrinsics, a comprehensive lowering solution in the DirectX or SPIR-V backend has to handle both library call recognition and intrinsics, whereas an intrinsics-only solution is much simpler.
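To make that last point concrete, here is a rough, standalone C++ sketch of the two lowering shapes. It does not use any real LLVM API: TargetOp, IntrinsicID, Call, lowerIntrinsic, and lowerMixed are all hypothetical names made up for illustration, with Sin standing in for an intrinsic we already have and Tan for one of the proposed additions.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <string_view>

// Hypothetical target opcodes; not real DirectX or SPIR-V enumerators.
enum class TargetOp { Sin, Tan };

// Hypothetical generic intrinsic IDs. Sin models the existing llvm.sin;
// Tan models one of the proposed additions.
enum class IntrinsicID { Sin, Tan };

// Minimal stand-in for a call instruction (assumption, not llvm::CallInst).
struct Call {
  std::optional<IntrinsicID> Intrinsic; // set when the callee is an intrinsic
  std::string CalleeName;               // used when the callee is a plain function
};

// Intrinsics-only lowering: a single switch over intrinsic IDs.
std::optional<TargetOp> lowerIntrinsic(IntrinsicID ID) {
  switch (ID) {
  case IntrinsicID::Sin:
    return TargetOp::Sin;
  case IntrinsicID::Tan:
    return TargetOp::Tan;
  }
  return std::nullopt;
}

// Mixed lowering: the backend handles intrinsics and must also recognize
// libcalls by name, one spelling per floating-point type.
std::optional<TargetOp> lowerMixed(const Call &C) {
  if (C.Intrinsic)
    return lowerIntrinsic(*C.Intrinsic);
  std::string_view Name = C.CalleeName;
  if (Name == "tan" || Name == "tanf" || Name == "tanl")
    return TargetOp::Tan;
  return std::nullopt; // unknown function, left as an ordinary call
}
```

In the mixed version every new operation needs its own set of name checks (and the vector forms have no standard libcall name to match at all), while the intrinsics-only version just grows by one case per operation.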
Alternatives
The most reasonable alternative to this solution would be to remove the set of math intrinsics that currently exist and settle on a library call recognition based approach across the board. While I do think there are advantages to this idea, the scope of the change would be enormous, and I don’t think it’s practical to hold up progress on a redesign of parts of LLVM that have been in place for nearly 20 years. Also note that if such a redesign ever did occur, having a handful more intrinsics to deal with wouldn’t make it substantially more or less difficult.
The other alternative to this solution is just much worse: implement all of these math functions as target intrinsics in the DirectX and SPIR-V backends, and in any other backend that wants them. This would result in a huge amount of duplication with very little benefit IMO.
Conclusion
Adding these intrinsics is pragmatic and avoids accruing technical debt from the hoops that we’d have to jump through without them. We should add these 16 intrinsics generically to LLVM.