Right now, there is a massive rats’ nest and pull request war around the semantics and handling of LLVM’s floating-point minimum and maximum operations, of which there are three different sets.
The LangRef as it is currently live on llvm.org at the time of writing (again, pull request war) lists the following sets of operations:
llvm.minnum.* and llvm.maxnum.*
These operations are defined to match the minNum and maxNum operations defined by IEEE 754-2008, with the exception that they always treat -0.0 as less than +0.0. If one operand is a qNaN but not the other, the non-NaN one is returned.
If either input operand is a signaling NaN, these operations return a qNaN. This means that LLVM’s sNaN/qNaN nondeterminism “leaks” into the operations’ semantics, which appears to make them impossible to implement soundly. The “Semantics” section somewhat acknowledges this.
These operations are not lowered correctly on many backends. For instance, the x86 backend does not distinguish between -0.0 and +0.0, and doesn’t treat sNaN differently from qNaN.
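To make the documented behavior concrete, here is a minimal Python sketch of llvm.minnum.* as the LangRef text above describes it. It works on raw 64-bit patterns so that sNaN and qNaN stay distinguishable. All names (`minnum`, `to_bits`, `QNAN`, `SNAN`, and so on) are made up for illustration, and this is a reading of the description above, not LLVM's implementation.

```python
import struct

# Illustrative bit patterns for IEEE 754 binary64 (assumed, not taken from LLVM).
QNAN = 0x7FF8000000000000  # quiet NaN: quiet bit set
SNAN = 0x7FF0000000000001  # signaling NaN: quiet bit clear, payload non-zero
EXP_MASK = 0x7FF0000000000000
FRAC_MASK = 0x000FFFFFFFFFFFFF
QUIET_BIT = 0x0008000000000000


def to_bits(x: float) -> int:
    """Reinterpret a Python float as its 64-bit pattern."""
    return struct.unpack("<Q", struct.pack("<d", x))[0]


def to_float(b: int) -> float:
    """Reinterpret a 64-bit pattern as a Python float (used for non-NaNs only)."""
    return struct.unpack("<d", struct.pack("<Q", b))[0]


def is_nan(b: int) -> bool:
    return (b & EXP_MASK) == EXP_MASK and (b & FRAC_MASK) != 0


def is_snan(b: int) -> bool:
    return is_nan(b) and not (b & QUIET_BIT)


def ordered_less(a: int, b: int) -> bool:
    """a < b for non-NaN patterns, with -0.0 ordered below +0.0."""
    fa, fb = to_float(a), to_float(b)
    if fa == 0.0 and fb == 0.0:
        return (a >> 63) > (b >> 63)  # sign bit set means -0.0
    return fa < fb


def minnum(a: int, b: int) -> int:
    """Sketch of llvm.minnum.* as currently documented."""
    if is_snan(a) or is_snan(b):
        return QNAN  # any sNaN input produces a qNaN
    if is_nan(a):
        return b  # one qNaN: return the other operand
    if is_nan(b):
        return a
    return a if ordered_less(a, b) else b
```

Under this model, `minnum(SNAN, to_bits(1.0))` and `minnum(QNAN, to_bits(1.0))` give different kinds of result (a NaN versus a number), which is exactly the distinction that LLVM's general NaN rules do not guarantee.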
llvm.minimumnum.* and llvm.maximumnum.*
These operations are defined to match the minimumNumber and maximumNumber operations defined by IEEE754-2019, except using LLVM’s NaN semantics instead of IEEE754’s. They can do this because the newly-revised operations do not treat sNaN inputs differently from qNaN inputs. LLVM’s language reference says that if both operands are NaN, these operations return “a NaN”, linking to the section about LLVM’s semantics and how it treats qNaN and sNaN identically.
These operations also treat -0.0 as less than +0.0, but this time it’s actually lowered correctly on x86. This behavior can be opted out of via the nsz flag.
llvm.minimum.* and llvm.maximum.*
These operations are defined to match the minimum and maximum operations defined by IEEE754-2019, except using LLVM’s NaN semantics instead of IEEE754’s.
These behave identically to llvm.minimumnum.* and llvm.maximumnum.*, with the exception that they return a NaN if either input operand is NaN.
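Since neither of the two newer sets distinguishes sNaN from qNaN, both can be modeled with ordinary floats. The following is a rough Python sketch of the difference between them as described above; the function names are hypothetical and the nsz flag is not modeled.

```python
import math


def less_with_signed_zero(a: float, b: float) -> bool:
    """a < b, with -0.0 ordered below +0.0."""
    if a == 0.0 and b == 0.0:
        return math.copysign(1.0, a) < math.copysign(1.0, b)
    return a < b


def minimumnum(a: float, b: float) -> float:
    """Sketch of llvm.minimumnum.*: a NaN operand loses to a number."""
    if math.isnan(a):
        return b  # if b is also NaN, the result is "a NaN"
    if math.isnan(b):
        return a
    return a if less_with_signed_zero(a, b) else b


def minimum(a: float, b: float) -> float:
    """Sketch of llvm.minimum.*: any NaN operand makes the result NaN."""
    if math.isnan(a) or math.isnan(b):
        return math.nan
    return a if less_with_signed_zero(a, b) else b
```

The only difference between the two functions is the NaN branch; the signed-zero ordering is shared.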
The problem
Right now, the semantics of llvm.minnum.* and llvm.maxnum.* are not actually respected across architectures. This was noted by nikic in llvm/llvm-project pull request #138451, Revert "LangRef: Clarify llvm.minnum and llvm.maxnum about sNaN and signed zero (#112852)":
I’m not a big fan of the change itself (because it is incoherent with LLVM’s general sNaN semantics), but even if we want to do it, it needs to be phased in a lot more carefully than what has actually happened. You can’t just change the semantics in LangRef and then completely ignore the consequences of the change on optimization behavior.
The bigger problem, also raised in the quote above, is that LLVM’s NaN semantics are in direct contradiction with IEEE754-2008’s minNum and maxNum operations. LLVM’s specification directly says “Floating-point math operations are allowed to treat all NaNs as if they were quiet NaNs.” The minNum and maxNum operations cannot treat all NaNs as if they were quiet NaNs, because they return the non-NaN operand when the other is a qNaN but must return a qNaN when either operand is an sNaN.
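The contradiction can be shown with a small Python sketch. It assumes a simplified model in which each sNaN input may or may not be seen as quiet before the operation runs, which is my reading of the quoted LangRef sentence. The helper names are hypothetical, and signed-zero ordering is left out for brevity.

```python
import struct

QNAN = 0x7FF8000000000000  # illustrative quiet NaN pattern
SNAN = 0x7FF0000000000001  # illustrative signaling NaN pattern
EXP_MASK = 0x7FF0000000000000
FRAC_MASK = 0x000FFFFFFFFFFFFF
QUIET_BIT = 0x0008000000000000


def bits(x: float) -> int:
    return struct.unpack("<Q", struct.pack("<d", x))[0]


def value(b: int) -> float:
    return struct.unpack("<d", struct.pack("<Q", b))[0]


def is_nan(b: int) -> bool:
    return (b & EXP_MASK) == EXP_MASK and (b & FRAC_MASK) != 0


def is_snan(b: int) -> bool:
    return is_nan(b) and not (b & QUIET_BIT)


def minnum_2008(a: int, b: int) -> int:
    """IEEE 754-2008 minNum NaN handling (signed zeros not modeled)."""
    if is_snan(a) or is_snan(b):
        return QNAN
    if is_nan(a):
        return b
    if is_nan(b):
        return a
    return a if value(a) < value(b) else b


def possible_results(a: int, b: int) -> set:
    """All results permitted if any NaN input may be treated as quiet."""
    results = set()
    for quiet_a in (False, True):
        for quiet_b in (False, True):
            x = (a | QUIET_BIT) if (quiet_a and is_nan(a)) else a
            y = (b | QUIET_BIT) if (quiet_b and is_nan(b)) else b
            results.add(minnum_2008(x, y))
    return results
```

In this model `possible_results(SNAN, bits(1.0))` contains both a qNaN and 1.0, so the same input can legally produce a NaN or a number. For qNaN inputs the set collapses to a single value.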
My proposal
(EDIT: I’ve revised this a bit. These three steps are ordered from easiest to hardest, and shouldn’t all be done at once.)
- llvm.minnum.* and llvm.maxnum.* intrinsics should be deprecated and eventually removed. During the deprecation period, they should be treated as llvm.minimumnum.* and llvm.maximumnum.* with the nsz flag set, since their “doesn’t really treat -0.0 as less than +0.0” behavior is important for performance on x86. Their documentation should be updated to reflect this. The existing “intrinsics comparison” table should be changed to not mention sNaN or qNaN at all, just “NaN”, in accordance with LLVM’s semantics.
- Constrained intrinsics and strictfp should differentiate between signaling and quiet NaNs. The “Behavior of Floating-Point NaN values” section points you towards the constrained intrinsics, but as far as I can tell, they don’t mention signaling NaNs anywhere. In my opinion, sNaN handling should be part of the additional scope of strictfp and the constrained intrinsics, just like rounding mode and exceptions are now. I’m not sure if this would require any actual codebase changes.
- The llvm.minimumnum.* and llvm.maximumnum.* operations can sometimes introduce overhead, even on platforms which do support native signed-zero handling, if they implement the older IEEE754-2008 versions that handle signaling NaNs differently (for instance, ARM). In these cases, extra “canonicalize” operations need to be inserted in order to comply with the spec. A nsnan (“no signaling NaN”) fast-math flag could potentially be added to avoid these canonicalize operations. I’m not sure how that would interact with LLVM’s existing semantics, or whether it violates the “sNaN is equivalent to qNaN” rule, so it’ll require a lot of further discussion.
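To illustrate why the extra canonicalize operations in the last point are needed, here is a rough Python sketch of lowering minimumnum onto a native instruction with IEEE754-2008 NaN handling. `native_min_2008`, `canonicalize`, and `lowered_minimumnum` are hypothetical names, `canonicalize` models only the sNaN-quieting part of the real operation, and signed-zero ordering is left out for brevity.

```python
import struct

QNAN = 0x7FF8000000000000  # illustrative quiet NaN pattern
SNAN = 0x7FF0000000000001  # illustrative signaling NaN pattern
EXP_MASK = 0x7FF0000000000000
FRAC_MASK = 0x000FFFFFFFFFFFFF
QUIET_BIT = 0x0008000000000000


def bits(x: float) -> int:
    return struct.unpack("<Q", struct.pack("<d", x))[0]


def value(b: int) -> float:
    return struct.unpack("<d", struct.pack("<Q", b))[0]


def is_nan(b: int) -> bool:
    return (b & EXP_MASK) == EXP_MASK and (b & FRAC_MASK) != 0


def is_snan(b: int) -> bool:
    return is_nan(b) and not (b & QUIET_BIT)


def native_min_2008(a: int, b: int) -> int:
    """Hypothetical hardware min with IEEE 754-2008 minNum NaN handling."""
    if is_snan(a) or is_snan(b):
        return QNAN  # an sNaN input poisons the result
    if is_nan(a):
        return b
    if is_nan(b):
        return a
    return a if value(a) < value(b) else b


def canonicalize(b: int) -> int:
    """Quiet a signaling NaN; leave every other pattern unchanged."""
    return (b | QUIET_BIT) if is_nan(b) else b


def lowered_minimumnum(a: int, b: int) -> int:
    """minimumnum built from the 2008-style instruction plus canonicalize."""
    return native_min_2008(canonicalize(a), canonicalize(b))
```

Without the two `canonicalize` calls, an sNaN operand would turn the result into a NaN instead of returning the other operand. A flag promising that no sNaN can reach the operation would let a backend call the native instruction directly.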