how to simplify FP ops with an undef operand?

Other than finding someone to volunteer for the work required, is there a reason not to add a NaN to the IR? I can already ask a ConstantFP if it is a NaN. Why not make that easier to represent?

ConstantFP *is* the way to represent a nan constant. What is hard about it?

-Chris

So you don’t think sNaNs can just be treated as if they were qNaNs? I understand why we would want to ignore the signaling part of things, but the rules for operating on NaNs are pretty clear and reasonable to implement. The signaling aspect can, I think, be safely ignored when we are in the mode of assuming the default FP environment.

As for the distinction between IEEE and LLVM IR, I would think we would want to define LLVM IR in such a way that it is possible to create an IEEE-compliant compiler. I know we’re not there yet, but we’re working toward it.

I just posted a long response on the thread, but it is important to know that these LLVM IR instructions are not defined and do not respect rounding mode or IEEE trapping flags. Only the intrinsics do.

-Chris

Whether operations on sNaNs trap in the “default execution environment”, or otherwise interrupt normal control flow or have side effects, seems to be the key point of disagreement here. I don’t believe they do, at least as far as my amateur reading of IEEE 754-2008 can tell:

  1. (most) operations on sNaN signal an invalid operation exception (§7.2), and so do many other operations on other values (also §7.2), such as: 0 * inf, inf / inf, fma(0, inf, x), sqrt on negative inputs, converting a float to an integer when the source is NaN/is infinity/does not fit in the destination type, etc.
  2. IEEE specifies a default way of handling exceptions (§7.1), which for invalid operation is returning a quiet NaN (§7.2).
  3. Language standards should offer a way to override the default exception handling (§8.1).
  4. Immediate alternate exception handling (§8.3) can be implemented via traps (§8.3, NOTE 2).

As I said I’m not an expert on this standard, but it seems very clear-cut to me that IEEE specifies operations like divide(x, sNaN) should return a quiet NaN, nothing else, unless the program uses language-provided facilities to install some other behavior. In this respect sNaN operations are not any different from other invalid, inexact, overflowing, etc. operations (as Steve already said).

If this is the case, there is no reason to treat e.g. “fdiv %x, snan” as having side effects or some sort of UB: fdiv and friends already assume a “default” fenv where nobody looks at flags, changes rounding modes, installs alternative exception handling, etc. so the invalid operation exception from sNaN operands is just as irrelevant as all the other exceptions are. LLVM can simply assume the default exception handling (as it already does in many cases) and fold calculations on signaling NaNs to quiet NaNs if it so wishes.

I have not surveyed the numerous hardware implementations (and everything else that goes into the “default execution environment”, e.g., what the OS does), so it might be that some of those default to trapping on sNaNs. I’ve never heard of such a thing, and just verified that it does not happen on my x86_64 machine, but there’s a lot of weirdness out there. If you know of any targets that trap on sNaN by default, please tell us. Otherwise, going only by IEEE (as you yourself did), I don’t see how traps could be a possibility without the program opting into fenv access (in which case the frontend has to emit constrained intrinsics anyway).

Cheers,
Robin

Thanks for expanding, Chris. Responses inline.

  • Because LLVM reorders and speculates the instruction forms, and because IEEE defines the corresponding IEEE operations as trapping on SNaNs, it is clear that SNaNs are outside of the domain of these LLVM operations. Either speculation is ok or trapping on SNaN is ok, pick one… (and we already did :slight_smile:)

I see the source of confusion now.

IEEE does not define any operations as trapping on sNaN. It defines operations as raising the invalid flag on sNaN, which is not a trap under default exception handling. It is exactly the same as raising the underflow, overflow, inexact, or division-by-zero flag.

Any llvm instruction necessarily assumes default exception handling—otherwise, we would be using the constrained intrinsics instead. So there’s no reason for sNaN inputs to ever be undef with the llvm instructions. They are just NaNs.

– Steve

To further clarify: IEEE 754 calls the process of signaling “raising an exception” and “exception handling”, but this is not what anyone else means by “exception”. Under “default exception handling”, which is what the llvm instructions assume, it is just setting a sticky bit in a status register that you cannot even read under the assumptions of the llvm instructions. Hence operations on sNaN are side-effect free in the LLVM instruction model, just like any other input.
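As a rough illustration of "invalid operation under default handling just produces a quiet NaN", here is a minimal Python sketch. It assumes Python's `float` is an IEEE 754 binary64 running in the default FP environment (true for CPython on mainstream platforms). Python cannot read the sticky status flag at all, which happens to match the assumption made for the LLVM instructions. Note that Python raising on division by zero is a choice of that language, not of IEEE 754, so that case is left out.

```python
import math

inf = float("inf")

# Each of these is an "invalid operation" in IEEE 754 terms (section 7.2).
invalid_results = [inf - inf, inf * 0.0, inf / inf]

# Under default exception handling each one simply yields a NaN...
all_nan = all(math.isnan(r) for r in invalid_results)

# ...and that NaN flows through later arithmetic; control flow is never interrupted.
propagated = invalid_results[0] + 1.0

print(all_nan, math.isnan(propagated))
```

Nothing here traps or raises; the only observable effect is the NaN result.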

This bit of confusion comes up regularly. It would be really good to get this documented in either the langref or something linked from it.

Philip

Well, using ConstantFP to represent NaNs is at least unclear enough that someone claimed that there is no NaN constant in LLVM IR. :slight_smile:

But your point is well made. I guess what I was really asking for is a 'NaN' token in the asm writer/parser. This is obviously a much less important issue, but I can't say that I would look at 0x7fc00000 in an ll file and immediately think, "Oh, that's a NaN." I'm not a fan of hexadecimal representation of floating point numbers in general, but I guess it's a necessary evil for exact conversion back to bitcode. The 'NaN' I'm asking for, of course, has something like 4 million possible representations, but I think they are all equivalent.

In short, never mind.
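For reading such hex patterns by eye, a small decoder can help. This is only a sketch: `classify_f32` is a hypothetical helper, and it assumes the common encoding in which the top mantissa bit is the quiet bit (the encoding recommended by IEEE 754-2008; some older targets differ).

```python
def classify_f32(bits: int) -> str:
    """Classify a 32-bit IEEE 754 binary32 pattern by its fields."""
    bits &= 0xFFFFFFFF
    exponent = (bits >> 23) & 0xFF
    mantissa = bits & 0x7FFFFF
    if exponent != 0xFF:
        return "finite"
    if mantissa == 0:
        return "-inf" if bits >> 31 else "+inf"
    # Assumption: the top mantissa bit set means "quiet".
    return "qNaN" if mantissa & 0x400000 else "sNaN"

# With the quiet bit set, 22 mantissa bits remain free for the payload.
QUIET_NANS_PER_SIGN = 1 << 22

print(classify_f32(0x7F800000))  # +inf
print(classify_f32(0x7FC00000))  # qNaN
print(classify_f32(0x7F800001))  # sNaN
print(QUIET_NANS_PER_SIGN)       # 4194304
```

An all-ones exponent with a zero mantissa is an infinity, not a NaN. The "something like 4 million" figure matches the number of quiet-NaN patterns for one sign.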

One issue I’ve always had is the nonsensical use of double FP constants even for float values. I can recognize the special FP values if they appear in their natural 4-byte format, but when awkwardly converted to double it’s more difficult, especially when I’ve written tests checking for the correct handling of payload bits.

-Matt
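To make the float-as-double printing concrete, here is a sketch of the widening involved. The helper names are made up for illustration. Inf and NaN patterns are widened by hand with integer arithmetic so that payload bits and the signaling/quiet distinction survive regardless of what the host FPU does on conversion.

```python
import struct

def f32_bits_as_double_bits(bits32: int) -> int:
    """Widen a binary32 bit pattern to the binary64 pattern of the same value."""
    bits32 &= 0xFFFFFFFF
    sign = bits32 >> 31
    exponent = (bits32 >> 23) & 0xFF
    mantissa = bits32 & 0x7FFFFF
    if exponent == 0xFF:
        # Inf/NaN: all-ones exponent, mantissa shifted into the top 23 of 52 bits.
        return (sign << 63) | (0x7FF << 52) | (mantissa << 29)
    # Finite values: every float is exactly representable as a double.
    value = struct.unpack("<f", struct.pack("<I", bits32))[0]
    return struct.unpack("<Q", struct.pack("<d", value))[0]

def as_double_hex(bits32: int) -> str:
    """Format the widened pattern as 16 hex digits (hypothetical helper)."""
    return "0x%016X" % f32_bits_as_double_bits(bits32)

print(as_double_hex(0x7FC00000))  # 0x7FF8000000000000
print(as_double_hex(0x7FC00001))  # 0x7FF8000020000000
print(as_double_hex(0x3F800000))  # 0x3FF0000000000000
```

A payload of 1 in the float ends up as 0x20000000 in the double, which is the kind of shift that makes payload tests hard to read.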

I agree with all of what Steve said (explicitly including the “Thanks for expanding, Chris” part of it).

In fact, I found myself wanting to nitpick the part about fdiv/frem trapping on divide by zero, because unless the programmer has done something to unmask that exception it will just silently set the corresponding status bit.

This is what I meant in my suggestion that we just treat sNaNs as if they were qNaNs. We can promise to provide the correct floating-point handling of operations with regard to the NaN aspect and just not make any promises regarding the correct preservation of the status bit or exception-safe scheduling of the instruction. In this regard, I don’t think an sNaN setting the INVALID flag is inherently any more dangerous than an fptrunc instruction setting the INEXACT flag.

-Andy

Ah yes, I completely misunderstood that! Thank you for clarifying. In that case, it seems perfectly reasonable for “fadd undef, 1” to fold to undef, right?

-Chris

Yes, that would be sensible. I’d suggest just adding a NaN immediate for the type-specific value we almost always use by default (0x7fc00000 for floats). The rest can use hex notation.

-Chris

Yes, indeed.

Great! Can someone please update LangRef so we codify this for the next time I forget? :slight_smile:

-Chris

Sure, I’ll post clean-ups for LangRef as the first step.

To make sure everyone’s on the same page now: the general rule will be that undef simplification and constant folding for { fadd, fsub, fmul, fdiv, frem } will follow IEEE-754 unless stated otherwise. So for fadd:

  1. fadd %x, undef → NaN

If the variable operand %x is NaN, the result must be NaN.

  2. fadd undef, undef → undef

Anything is possible.

  3. fadd C, undef → undef (where C is not NaN or Inf)

In the general constant case, the result could be anything as long as constant operand C is not NaN or Inf.

  4. fadd NaN, undef → NaN

Same reasoning as #1; NaN propagates.

  5. fadd +/-Inf, undef → NaN

If the constant operand is +Inf or -Inf, then the result can only be +Inf or -Inf unless the undef is NaN or the opposite Inf. If the undef is NaN or opposite Inf, the result is NaN, so we choose undef as NaN and propagate NaN. (If some program or known-bits is tracking that the exponent bits are all set, we’ll preserve that…)

See IEEE-754 section 7.2 for more rules.
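The fadd rules above can be written down as a toy model. This is a sketch of the proposal as posted, not LLVM's actual simplification code; `UNDEF`, `VAR`, and `fold_fadd_with_undef` are names invented for the illustration, and the rule for finite constants is questioned later in this thread.

```python
import math

UNDEF = "undef"  # stand-in for an undef operand
VAR = "%x"       # stand-in for a non-constant operand

def fold_fadd_with_undef(other):
    """Proposed fold of `fadd other, undef`; `other` is UNDEF, VAR, or a float constant."""
    if other == UNDEF:
        return "undef"  # both operands undef: anything is possible
    if other == VAR:
        return "NaN"    # %x might be NaN, and NaN is the one result valid for every %x
    if math.isnan(other) or math.isinf(other):
        return "NaN"    # NaN propagates; for Inf, pick undef to be NaN
    return "undef"      # finite constant, as proposed in the list

for operand in (VAR, UNDEF, 1.0, math.nan, math.inf, -math.inf):
    print(operand, "->", fold_fadd_with_undef(operand))
```

The folds for fsub, fmul, fdiv, and frem would need their own tables; this only mirrors the fadd list.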

3. fadd C, undef --> undef (where C is not NaN or Inf)
In the general constant case, the result could be anything as long as constant operand C is not NaN or Inf.

If C is the largest finite positive number, then (fadd C, X) cannot be a finite negative number. So doesn't folding (fadd C, undef) --> undef break the rules?

Cheers,
Nicolai
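The claim about the largest finite constant is easy to spot-check. The sketch below uses Python floats (binary64, assumed IEEE 754) rather than LLVM's `float`, and samples only a handful of inputs, so it illustrates the point rather than proving it.

```python
import math
import sys

C = sys.float_info.max  # largest finite positive double

# A spread of possible values for the "undef" operand, including the extremes.
candidates = [-math.inf, -C, -1.0, -0.0, 0.0, 5e-324, 1.0, C, math.inf, math.nan]
results = [C + x for x in candidates]

# Results that are both finite and negative; the claim is that there are none.
finite_negative = [r for r in results if math.isfinite(r) and r < 0]

print(finite_negative)  # []
```

The most negative finite input gives 0.0 and -inf gives -inf, so no input yields a finite negative result. A full undef would allow bit patterns the addition can never produce.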

Hi,

*Hopefully uncontroversial points:*

- Floating point operations are represented in LLVM IR in two ways: the
fdiv/fmul/fadd etc /instructions/, and the llvm.experimental.constrained.*
/intrinsic/ forms.

- The instruction forms are modeled as having no side effects. fdiv/frem trap
on divide by zero, but are otherwise defined on the same set of inputs as
fadd/fmul/etc.

- Because they have no side effects, these instructions can be reordered freely
(though for fdiv/frem, see footnote [1] below). For example, it is legal to
transform this:

foo(x,y)
tmp = a+b

into:

tmp = a+b
foo(x,y)

This can occur for many reasons: for example, because the compiler decides it is
profitable (e.g. hoisting a loop invariant computation out of a loop), as a side
effect of instruction scheduling, selection dag not having chain nodes on the
ISD nodes, etc.

[snip]

- Because the LLVM instructions are not defined on SNaNs, SNaNs are outside of
their domain, and thus the LLVM instructions are undefined on these inputs. As
such, it would be perfectly reasonable to “constant fold” an “fadd SNaN, 42”
instruction into unreachable and delete all the code after it, or turn it into a
call to formatHardDrive(). [2]

Isn't "possibly raises UB" in contradiction with "does not have side-effects"?
In your reordering example quoted above, if `foo` never returns but `a+b` raises
UB, then doing the reordering could introduce UB into the program. Returning
undef or poison should be fine, but raising UB or calling formatHardDrive()
seems to be incompatible with desired optimizations. Did I miss something?

Kind regards,
Ralf

I need to review the last thread about undef/poison (or someone who knows current status of that can reply), but this would seem to come down to whether undef applies to the entire value or the individual bits?

I.e., in your example the sign bit will never be set unless all of the exponent bits are also set. Each bit individually is unknown, but taken together we know that some sequences are impossible.

The “on the ground” reason we made undef be a bit-level concept for integers was specific problems with C bitfields: when initializing a mem2reg’d bitfield, you end up doing or’s into undef values, and those bits have to be defined.

I’m not aware of a similar concept that makes sense for fp values. We could choose to do fine-grained tracking (e.g. so ldexp and friends would work to set the exponent??) but I don’t see any practical reason for doing so.

-Chris