Currently in IR, we do nothing for fadd/fsub/fmul. For fdiv/frem, we propagate undef. The code comment for fdiv/frem says:
“the undef could be a snan”
If that’s correct, then shouldn’t it be the same for fadd/fsub/fmul? But this can’t be correct because we support targets that don’t raise exceptions…and even targets that raise exceptions do not trap by default on snan?
I’m not sure the transformation happening with fdiv is correct. If we have “%y = fdiv float %x, undef” and %x is a NaN then the result will be NaN for any value of the undef, right? So if I understand the undef rules correctly (never a certainty) then we can’t safely replace the expression with undef. We could, I think, replace it with “%y = %x” though. I think the same is true for fadd, fsub, fmul, and frem.
What I’m saying is that if we have one operand that is not an undef value then that operand might be NaN and if it is then the result must be NaN. So while it may be true that we don’t have a NaN, it is not true that we definitely do not have a NaN in the example. This is analogous to the example in the language reference where it says “%A = or %X, undef” → “%A = undef” is unsafe because any bits that are set in %X must be set in the result. If any floating point operand is NaN dynamically, then the result must be NaN.
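A quick way to sanity-check both halves of that argument is a small Python sketch. It assumes Python floats (IEEE-754 doubles) are a fair stand-in for the IR operations here. The names OPS, UNDEF_SAMPLES, always_nan, and or_results are made up for illustration, and the sample list is not exhaustive. Zero is left out of the samples only because Python raises ZeroDivisionError on float division by zero instead of returning the IEEE result.

```python
import math
import operator

# Hypothetical stand-ins for the IR operations (not LLVM code).
OPS = {
    "fadd": operator.add,
    "fsub": operator.sub,
    "fmul": operator.mul,
    "fdiv": operator.truediv,
    "frem": math.fmod,
}

# A few values the undef operand might take. 0.0 is omitted because
# Python raises ZeroDivisionError for float division by zero.
UNDEF_SAMPLES = [1.0, -1.5, 1e308, 5e-324, math.inf, -math.inf, math.nan]


def always_nan(op, x):
    """True if op(x, u) is NaN for every sampled choice u of the undef operand."""
    return all(math.isnan(op(x, u)) for u in UNDEF_SAMPLES)


# With %x = NaN, every sampled choice of the undef operand still gives NaN.
nan_results = {name: always_nan(op, math.nan) for name, op in OPS.items()}


def or_results(x):
    """All values of (x | u) when u ranges over every 8-bit pattern."""
    return {x | u for u in range(256)}


# The integer analogy: bits set in x are set in every possible result,
# so the result cannot be an arbitrary value.
constrained = or_results(0b0101)

if __name__ == "__main__":
    print(nan_results)
    print(len(constrained), "of 256 values are reachable")
```

In both cases the set of reachable results is smaller than "any value", which is why folding to undef would not be a valid refinement under this reading.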
I don’t believe it’s accurate to say that NaN is “morally equivalent” to undef. There are some similarities, but the important difference is that NaN always has well defined behavior with a specific correct result. There is, perhaps, a sense in which it is analogous to a poison value, but for the purposes of reasoning about the correctness of floating point operations I think it’s best to be pedantic about treating it as the specific value that it is.
Finally, I was pretty sure you knew that fdiv by zero wasn’t undefined. I just wanted to clarify that the “?” in your comment was indicating that the assertion in the language reference was questionable as opposed to this point being in any way actually uncertain.
Ah, thanks for explaining. So given that any of these ops will return NaN with a NaN operand, let’s choose the undef operand value to be NaN. That means we can fold all of these to a NaN constant in the general case.
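To check that choice on a few inputs, here is a rough Python sketch. It treats Python floats as a stand-in for the IR operations, and OPS, X_SAMPLES, and nan_choice_is_consistent are hypothetical names that only illustrate the idea (this is not LLVM's constant folder). It picks NaN for the undef operand in "op %x, undef" and confirms the result is NaN whatever %x is, which is what would justify folding to a NaN constant.

```python
import math
import operator

# Hypothetical stand-ins for the IR operations (not LLVM code).
OPS = {
    "fadd": operator.add,
    "fsub": operator.sub,
    "fmul": operator.mul,
    "fdiv": operator.truediv,
    "frem": math.fmod,
}

# Sample values for the non-undef operand %x, including the special cases.
X_SAMPLES = [0.0, -0.0, 1.0, -2.5, math.inf, -math.inf, math.nan]


def nan_choice_is_consistent(name):
    """True if op(x, NaN) is NaN for every sampled x.

    Choosing NaN for the undef operand is one legal choice; if the result
    is NaN for every x, a NaN constant is a valid replacement.
    """
    op = OPS[name]
    return all(math.isnan(op(x, math.nan)) for x in X_SAMPLES)


consistent = {name: nan_choice_is_consistent(name) for name in OPS}

if __name__ == "__main__":
    for name, ok in consistent.items():
        print(f"{name} x, undef -> NaN is consistent on samples: {ok}")
```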
But if we have ‘nnan’ FMF, then we can fold harder to undef?
“nnan - Allow optimizations to assume the arguments and result are not NaN. Such optimizations are required to retain defined behavior over NaNs, but the value of the result is undefined.”
Fair warning, the following is a tangent from the actual topic under discussion... It's only vaguely connected to the example at hand and is mostly a reaction to a collection of comments in various previous floating point discussion threads.
The reasoning here worries me. The semantics of LLVM IR is specified independently of what any particular target does. Referring to target behavior to decide on appropriate semantics for a particular optimization seems inherently problematic. We should instead use LLVM IR's stated semantics to decide legality. We should only use target semantics to help shape proposals to change the IR semantics.
To be specific, something along the lines of the following seems entirely reasonable: "LLVM IR allows us to perform the following optimization here, but doing that would radically complicate the lowering on target X which doesn't support Y. Should we change the IR specification to Z?" On the other hand, reasoning like "Target X doesn't do Y, so this optimization can't be legal" is problematic.
If the code is more specific than the LangRef, we should propose a clarification to the LangRef. The specification for floating point semantics is admittedly a weakness in the current version. Rather than working around it, we should fix that.
I'm pretty sure that isn't what nnan is supposed to mean. If the result of nnan math were undefined in the sense of "undef", programs using nnan could have undefined behavior if the result is used in certain ways which would not be undefined for any actual float value (e.g. converting the result to a string), which seems like a surprising result. And I don't think we gain any useful optimization power from saying we can fold to undef instead of something else.
So I think it's supposed to say "the result is not specified" or something (so an nnan operation which would produce a nan can instead produce any value that isn't undef/poison).
For the first part of Sanjay’s question, I think the answer is, “Yes, we can fold all of these to NaN in the general case.” For the second part, when the nnan FMF is present, I’m not sure. The particulars of the semantics of nnan are unclear to me.
But let me explore what Eli is saying. It sounds reasonable, but I have a question about it.
Suppose we have the nnan FMF set, and we encounter this:
%y = fdiv float %x, undef
If I’ve understood Eli’s interpretation correctly, any of the following transformations would be legal and safe:
%y = 0.0
%y = -1.0
%y = inf
%y = NaN
And so on, covering all possible concrete values, right? Now suppose I don’t change it at all. Now I might have IR that looks like this.
%y = fdiv float %x, undef
%z = fmul float %q, %y
At this point, while working on %z, I could reconsider and say “If I had transformed the fdiv into ‘%y = 0.0’ I could optimize this fmul away too.” So at that point I can choose to do that, right? And in general I can “retroactively” choose any concrete value that would be convenient for the next transformation. You see where I’m going with this?
How is that different from just folding the fdiv into undef to begin with? Is it because I can’t choose different values on different code paths?
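One place where a difference could show up, assuming the LangRef rule that each use of an undef value may observe a different value, is when the same result is used twice. The Python sketch below models "%z = fsub float %y, %y" under both readings. The helper names and the three finite sample values are made up for illustration, and it does not settle what nnan ought to mean.

```python
import itertools

# A few finite values that %y could hypothetically take.
SAMPLES = [0.0, 1.0, 2.0]


def results_fixed_value():
    """%y is one unspecified but concrete value: both uses see the same value."""
    return {y - y for y in SAMPLES}


def results_undef():
    """%y is undef: each use may independently pick its own value."""
    return {a - b for a, b in itertools.product(SAMPLES, repeat=2)}


if __name__ == "__main__":
    print("fixed concrete value:", sorted(results_fixed_value()))
    print("undef per use:       ", sorted(results_undef()))
```

With a single concrete value the subtraction can only give 0.0 on these samples, while the undef reading allows nonzero results as well, so the two interpretations are observably different once a value has more than one use.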
Agreed. Those IR instructions are undefined on SNAN, and that undef could take on an SNAN value. Folding these instructions to undef seems reasonable, and it is arguable that you could even fold it to an ‘unreachable’.
So you don’t think sNaNs can just be treated as if they were qNaNs? I understand why we would want to ignore the signaling part of things, but the rules for operating on NaNs are pretty clear and reasonable to implement. The signaling aspect can, I think, be safely ignored when we are in the mode of assuming the default FP environment.
As for the distinction between IEEE and LLVM IR, I would think we would want to define LLVM IR in such a way that it is possible to create an IEEE-compliant compiler. I know we’re not there yet, but we’re working toward it.
Wait, back up, what? The invalid flag raised by operations on sNaN is no different from any other flag in fenv. There’s nothing sensible about saying that every operation that raises underflow/overflow/inexact is undefined, why are operations on sNaN any different?