Semantics of NaN

The interesting question is whether the opposite direction is allowed. IEEE says that fneg has a well-defined sign bit even for NaN inputs; the same is not true for fsub, so I think the opposite direction is not allowed.

That sounds wrong to me. A phi node is conceptually just a copy, assigning one variable to another. If phi selection can change NaN bit patterns, then the entire concept of a NaN bit pattern stops making sense, since the mere assignment in x = 0.0/0.0 could already change the NaN payload, such that x does not hold the same NaN that the division produced.

I sure hope that assignments (including phi nodes) in LLVM are guaranteed to exactly preserve (non-poison/undef) bits. Everything else would be extremely concerning.

IEEE 754 allows signaling NaNs to become quiet NaNs when “copied” and to lose their payload when copied. It does not mandate this, but it allows for implementations where copying a NaN from point A to point B loses the signaling property and loses the payload. For example, it is IEEE 754 legal to architect a machine where an FPMOV instruction “does things to NaNs” that it does not do to non-NaNs. I know of no such architectures or implementations, but IEEE 754 allows for that possibility.
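To make “signaling property” and “payload” concrete, here is a minimal sketch that splits a binary64 NaN bit pattern into its fields. It assumes the encoding recommended since IEEE 754-2008, where the most significant fraction bit is the quiet bit (some older architectures used the opposite convention). The helper names `is_nan`, `describe_nan`, and `quiet` are hypothetical and exist only for illustration.

```python
# Field masks for IEEE 754 binary64 (1 sign bit, 11 exponent bits, 52 fraction bits).
SIGN_MASK = 1 << 63
EXP_MASK = 0x7FF << 52
FRAC_MASK = (1 << 52) - 1
QUIET_BIT = 1 << 51            # assumption: top fraction bit set means "quiet"
PAYLOAD_MASK = QUIET_BIT - 1   # the remaining 51 fraction bits


def is_nan(bits):
    """A NaN has an all-ones exponent and a non-zero fraction."""
    return (bits & EXP_MASK) == EXP_MASK and (bits & FRAC_MASK) != 0


def describe_nan(bits):
    """Hypothetical helper: split a NaN bit pattern into (sign, kind, payload)."""
    if not is_nan(bits):
        raise ValueError("not a NaN bit pattern")
    sign = 1 if bits & SIGN_MASK else 0
    kind = "quiet" if bits & QUIET_BIT else "signaling"
    return sign, kind, bits & PAYLOAD_MASK


def quiet(bits):
    """Quiet a NaN by setting the quiet bit; sign and payload are left alone."""
    return bits | QUIET_BIT if is_nan(bits) else bits
```

Under these assumptions, a copy that “loses the signaling property” corresponds to applying `quiet`, and one that “loses the payload” corresponds to clearing the bits under `PAYLOAD_MASK`.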

I worked as an architect on a GPU where the size of the FMAC unit would grow 20% if we tried to perform NaN propagation (of the payload), so we took that part out (shamefully). So, all we did was to take any situation where a NaN result was required and create a generic quiet NaN (payload == 0) for all these cases.
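The “generic quiet NaN” behaviour described above can be modelled in a few lines. This is only a sketch of the idea, not the actual hardware logic: the constant `GENERIC_QNAN` assumes a positive sign and the binary64 format, and `canonicalize_result` is a hypothetical name.

```python
EXP_MASK = 0x7FF << 52
FRAC_MASK = (1 << 52) - 1

# Assumption: positive sign, quiet bit set, payload == 0 (binary64).
GENERIC_QNAN = 0x7FF8000000000000


def is_nan(bits):
    """A NaN has an all-ones exponent and a non-zero fraction."""
    return (bits & EXP_MASK) == EXP_MASK and (bits & FRAC_MASK) != 0


def canonicalize_result(bits):
    """Replace any NaN result with the generic quiet NaN; pass other values through."""
    return GENERIC_QNAN if is_nan(bits) else bits
```

With this model no input payload ever reaches the output, which is why the unit needs no payload-propagation datapath.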

I do not know what semantics LLVM requires other than the copy aspect.

Wow, that is surprising – thanks for pointing that out.

However:

  • from what you say it applies only to signaling NaNs
  • are we sure no other part of LLVM assumes that extra moves can be introduced without changing program semantics?

I think there might be some confusion here.

This:

has the exact same semantic implications as UB/IDB in C/C++,
in the sense that the behavior of the payload can be implementation-defined,
and you cannot rely on it being handled one way or the other,
and the implementations are allowed to implement it any way they see fit.
In other words, this is an optimization opportunity for the implementations.

So no, moves do *not* alter the program semantics, because you weren’t
guaranteed any particular semantics as per IEEE 754 in the first place.

Sure, my question was going in the direction of “does LLVM make use of that freedom granted by IEEE, or does LLVM guarantee that payloads are preserved on copy”.

I agree entirely with the top paragraph–it is a quality of implementation
issue–high quality implementations will strive to give the users “good
NaN properties” while lower quality implementations will do as little as
possible to achieve IEEE 754-2019 compliance.

I do not believe this is correct.

IEEE 754-2019 “5.5 Quiet-computational operations” states:
“Implementations shall provide the following homogeneous quiet-computational sign bit operations for all supported arithmetic interchange formats; they only affect the sign bit. The operations treat floating-point numbers and NaNs alike, and signal no exception.”

It seems to pretty clearly state the opposite of what you’re claiming.

I also don’t see this; AFAICT, it’s prohibited by the same section (since “copy” is considered a “quiet-computational sign bit operation”).

If you disagree, can you quote text that leads to your opposite conclusion?

[quote=“jyknight”]

I do not believe this is correct.

IEEE 754-2019 “5.5 Quiet-computational operations” states:
“Implementations shall provide the following homogeneous quiet-computational sign bit operations for all supported arithmetic interchange formats; they only affect the sign bit. The operations treat floating-point numbers and NaNs alike, and signal no exception.”

It seems to pretty clearly state the opposite of what you’re claiming.[/quote]

IEEE 754-2019 5.5 indicates copySign is allowed not to copy the sign bit on a NaN source.

IEEE 754-2019 5.8 pp 3 talks about NaN conversions and raising exceptions. If the exception
is suppressed, the standard takes no position on the integer value of
a NaN or of an infinity. Thus it becomes implementation defined. In
any event, NaN payload propagation is not guaranteed.

IEEE 754-2019 5.12.1 allows for conversion of NaNs to ASCII and back without payloads;
but suggests (should) that the implementation provide a means to
convert payloads to ASCII and back.

IEEE 754-2019 6.2 pp 1, last sentence, uses the word “should”.
pp 2 indicates quieting of signaling NaNs.
pp 3 indicates the payload can decay when converting to and back from ASCII.
pp 3 sentence 2 indicates a NaN payload can be discarded:
“shall result in a canonical quiet NaN”.
Sentence 3, “Recognize that format conversions, including conversions
between supported formats and external representations
as character sequences, might be unable to deliver the
same NaN”,
allows for calculations to discard the payload (on lower quality
implementations).

IEEE 754-2019 6.2.3: every sentence uses the word “should” instead of “shall” (except the last sentence, which uses “shall” and points back at 5.5).

So, there are lots of loopholes in the processing of NaNs.

[quote][quote=“Mitch_Alsup, post:42, topic:66729”]
IEEE 754 allows signaling NaNs to become quiet NaNs when “copied” and to lose their payload when copied.
[/quote]

I also don’t see this; AFAICT, it’s prohibited by the same section (since “copy” is considered a “quiet-computational sign bit operation”).

If you disagree, can you quote text that leads to your opposite conclusion?[/quote]

Consider an otherwise IEEE-compliant implementation that performs MOV FPregister,FPregister
by performing FADD FPregister,FPregister,#0.0. This certainly moves one FPregister to another.
Does the required suppression of signaling NaNs in FADD now make the entire implementation
non-compliant? The above paragraphs indicate there is leeway in NaN propagation here.

That is a lot of text…but none of it actually refutes my point.

That allowance is only in the section on “non-interchange formats”. The semantics of copySign for interchange formats does require copying the sign-bit of the source, even for a NaN. (And do note the additional suggestion, “The operations for non-interchange formats should follow the specification for sign bit operations for interchange formats if the encoding permits.”)

Section 5.8 is “Details of conversions from floating-point to integer formats”. Also not relevant. Of course you cannot propagate a NaN payload from a floating-point format to an integer format: integer formats cannot represent NaNs or infinities at all.

Agree, but ASCII conversions are not really relevant to this thread.

Many of your references are actually to requirements (e.g. NaN quieting), not “loopholes”. And certainly NaN payload propagation is not required – only recommended – for most operations. Yet, it is required for the sign-bit operations. They are required to affect only the sign bit.
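As an illustration of “affect only the sign bit”, the four sign-bit operations can be sketched directly on binary64 bit patterns. This is a model of the requirement under the assumption of a 64-bit interchange format, not a claim about how any particular hardware or LLVM implements them; the function names are hypothetical.

```python
MASK64 = (1 << 64) - 1
SIGN_MASK = 1 << 63
REST_MASK = MASK64 ^ SIGN_MASK   # every bit except the sign bit


def copy(bits):
    """copy(x): all bits unchanged, including for signaling NaNs."""
    return bits


def negate(bits):
    """negate(x): flip the sign bit, nothing else."""
    return bits ^ SIGN_MASK


def abs_(bits):
    """abs(x): clear the sign bit, nothing else."""
    return bits & REST_MASK


def copy_sign(x_bits, y_bits):
    """copySign(x, y): x with the sign bit of y, even when y is a NaN."""
    return (x_bits & REST_MASK) | (y_bits & SIGN_MASK)
```

In this model a signaling NaN stays signaling and keeps its payload through all four operations, since none of them touches the exponent or fraction fields.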

I assume you’re asserting that “MOV” is intended to implement “copy”, and “FADD” is intended to implement “addition”. In which case, yes: such an implementation is non-compliant. The “addition” operation is required to signal an “Invalid Operation” exception when given a signaling-NaN as input, while the “copy” operation must not raise any signals. These are not the same thing. (Of course, “non-compliant” doesn’t mean “useless”. An implementation which doesn’t support signaling NaN semantics at all is perfectly fit for use by nearly all programs.)
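The difference between the two operations can be sketched as a toy model that returns the result bits together with an “invalid operation” flag. The assumptions are loud ones: binary64, the quiet bit being the top fraction bit, round-to-nearest, and an addition that keeps the input payload when quieting (which the standard recommends but does not require). `model_copy` and `model_add_zero` are hypothetical names.

```python
SIGN_MASK = 1 << 63
EXP_MASK = 0x7FF << 52
FRAC_MASK = (1 << 52) - 1
QUIET_BIT = 1 << 51   # assumption: top fraction bit set means "quiet"


def is_signaling_nan(bits):
    is_nan = (bits & EXP_MASK) == EXP_MASK and (bits & FRAC_MASK) != 0
    return is_nan and not (bits & QUIET_BIT)


def model_copy(bits):
    """copy: bits unchanged, never signals."""
    return bits, False


def model_add_zero(bits):
    """x + (+0.0): signals invalid on a signaling NaN and returns a quiet NaN."""
    if is_signaling_nan(bits):
        return bits | QUIET_BIT, True   # assumption: payload is preserved
    if bits == SIGN_MASK:               # -0.0 + +0.0 is +0.0 under round-to-nearest
        return 0, False
    return bits, False
```

In this model the two operations disagree on every signaling NaN (different bits and a raised flag) and also on -0.0, so substituting one for the other is observable.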

Hi all, sorry for reviving the thread a few months later, but we (Julia) are trying to work out what to do about our constant propagation guarantees in the face of LLVM’s NaN semantics (Julia-side issue here: Floating point intrinsics are not IPO :consistent · Issue #49353 · JuliaLang/julia · GitHub, but I’ll reflect any outcomes from the discussion here back there). I liked the framing of the choices above:

Did we come to a conclusion as to what model LLVM does/should implement here? I think we would strongly prefer at least the strength of your “WASM rules” option, possibly even “Weak NaN-propagation”, though I’d have to think about the exact implications. I believe that should be sufficiently strong to allow us to do constant propagation as long as we check that there are no sources of non-preferred NaNs anywhere, which seems doable. I suppose fneg is an explicit exception here, because it is expected to flip the sign bit even on a preferred NaN. Is there anything else?
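A check for “no sources of non-preferred NaNs” might look like the sketch below. It assumes that a “preferred NaN” is a binary64 quiet NaN with an all-zero payload and either sign, which is how the LangRef NaN section appears to define it; both that definition and the helper names `is_preferred_nan` and `is_safe_constant` are assumptions for illustration, not Julia or LLVM API.

```python
SIGN_MASK = 1 << 63
EXP_MASK = 0x7FF << 52
FRAC_MASK = (1 << 52) - 1
PREFERRED_QNAN = 0x7FF8000000000000   # quiet bit set, payload all-zero, positive sign


def is_nan(bits):
    return (bits & EXP_MASK) == EXP_MASK and (bits & FRAC_MASK) != 0


def is_preferred_nan(bits):
    """Assumed definition: quiet NaN with zero payload; the sign bit is ignored."""
    return (bits | SIGN_MASK) == (PREFERRED_QNAN | SIGN_MASK)


def is_safe_constant(bits):
    """Hypothetical check: a constant is acceptable if it is not a NaN,
    or if it is a preferred NaN."""
    return not is_nan(bits) or is_preferred_nan(bits)
```

A constant-propagation pass could run such a check over every floating-point constant and input it sees, and fall back to not folding when a non-preferred NaN shows up.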

Well, it seems like this more-or-less culminated in Stronger floating-point NaN guarantees / PR #66579, several months after you asked about this.

The current text is in section Behavior of Floating-Point NaN values of the LangRef.