Making ffp-model=fast more user friendly

I don’t know how many people use the -ffp-model=fast command line option, as opposed to -ffast-math, but I’d like to propose a change in the behavior of this option. Specifically, I’d like to make it a little more user friendly.

Currently, the -ffp-model=fast option follows the behavior of -ffast-math, including the fact that it implies -ffinite-math-only. There are people who have been very vocal about the fact that -ffast-math is too dangerous to ever use, and I think the -ffinite-math-only option is a big part of this.

I think most users would be better served using -funsafe-math-optimizations, but the name is a bit off-putting. Do you want options that are “fast” or options that are “unsafe”? Without any more information, you’d probably prefer “fast”, right? But the fact is that -ffast-math is much more unsafe than -funsafe-math-optimizations.

I’d like to propose that we make -ffp-model=fast a bit more user friendly, and maybe add something new (-ffp-model=aggressive) for people who want to keep the current behavior.

To provide a bit of backstory to this, my experience with this is through working with customers of Intel’s C and C++ compilers, both the one we are now calling the “Classic Compiler” (icc) and the new “oneAPI Compiler” (icx). These compilers have long supported the -fp-model option but have a distinction between -fp-model fast=1 and -fp-model fast=2 which more or less follows the proposal I’m making here. The oneAPI compiler initially followed the current clang implementation of -ffp-model=fast but we got a lot of feedback from our customers that the NaN handling was just too aggressive.

Initially, I’d like to make the following changes:

| | fast | aggressive |
|---|---|---|
| Honor NaNs | Yes | No |
| Honor infinities | Yes | No |
| Complex arithmetic | promoted | basic |
| contract | fast-honor-pragmas | fast-honor-pragmas |

With the exception of the contract behavior, the “aggressive” column matches what we do today with -ffp-model=fast. I’m proposing changing the contract behavior to honor pragmas in both cases because I don’t understand why anyone would want to ignore pragmas.
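To make the fast-honor-pragmas row concrete, here is a minimal sketch (my own example, not from any patch) of what honoring the `FP_CONTRACT` pragma means: outside the pragma the compiler may fuse a multiply-add into a single fma, but inside a `#pragma STDC FP_CONTRACT OFF` region it must round the product separately.

```c
#include <math.h>

/* May be contracted into fma(a, b, c) under -ffp-contract=fast-honor-pragmas. */
double fused_ok(double a, double b, double c) {
  return a * b + c;
}

/* The pragma forbids contraction here: a*b must be rounded before the add. */
double fused_off(double a, double b, double c) {
#pragma STDC FP_CONTRACT OFF
  return a * b + c;
}
```

On these exactly-representable inputs both forms agree; the fused and unfused results only diverge when the intermediate rounding of a*b matters.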

One other change I’d like to include in the less aggressive version of this is not canonicalizing based on fast-math flags, but that would require changes to the optimizer. Right now the optimizer does things like (X / Y) * Z ==> (X * Z) / Y when fast-math is enabled, just because it might enable other optimizations later. This seems too aggressive to me, but there’s nothing we can do about it at the moment. I just wanted to note it here as another thing I’d like to change with the -ffp-model=fast option when I can.
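To show why that canonicalization seems too aggressive, here is a small example (with constants I picked for illustration, compiled at default settings so the rewrite does not actually fire) where (X / Y) * Z ==> (X * Z) / Y changes the result: the rewritten form overflows even though the original is perfectly well behaved.

```c
#include <math.h>

/* The form as written: X/Y is computed first, so no overflow occurs. */
double original_form(double x, double y, double z) {
  return (x / y) * z;
}

/* The canonicalized form: X*Z overflows to infinity before the divide. */
double rewritten_form(double x, double y, double z) {
  return (x * z) / y;
}
```

With x = y = z = 1e300, the original form yields 1e300, while the rewritten form computes 1e300 * 1e300, which overflows to infinity and stays infinite after the division.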

My question here is, is anyone using the -ffp-model option who would object to the changes I’m proposing above?

It’s an interesting observation that reassociation is actually pretty safe in practice, despite the fact that the definition gives us free rein to do transforms that completely destroy precision. Do you have any insight into what makes it safe? Is it just an implicit contract between compiler developers and scientific computing users that the transforms are limited in certain ways?

If we want to encourage people to use this mode, it’s probably worth spending some time revisiting the actual transforms we perform. We currently check isFast() and UnsafeFPMath in a bunch of places, and some of them probably should be checking “reassoc” instead.

I don’t have any specific opinion on the command line flags, beyond agreeing they’re messy.

I think what I’d say about reassociation is just that it’s often safe, it’s generally well understood, and there is something you can do about it when it bites. The most common case where reassociation causes trouble is code like r = a - b + x where a and b are of much greater magnitude than x. If the compiler reinterprets that as r = a - (b - x), the x term might disappear completely, but I think people who do a lot of numeric work understand this and wouldn’t be surprised by it. Once you find the problem, you can work around it with the __arithmetic_fence built-in or the -fprotect-parens option.
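For concreteness, here is that cancellation scenario as compilable code (the constants are my own); under default flags each form is computed as written, and in the reassociated form the x term vanishes because b - x rounds back to b.

```c
/* The expression as the programmer wrote it: (a - b) + x. */
double sum_as_written(double a, double b, double x) {
  return (a - b) + x;
}

/* A reassociated form the optimizer could choose: a - (b - x). */
double sum_reassociated(double a, double b, double x) {
  return a - (b - x);
}
```

With a = b = 1e16 and x = 1.0, the written form yields 1.0, but in the reassociated form 1e16 - 1.0 rounds back to 1e16 (the spacing between doubles at that magnitude is 2), so the result is 0.0 and x has disappeared entirely.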

The nnan and ninf settings are a bit more hazardous, because you can write code like this:

```c
#include <math.h>

float foo(float x, float y) {
  if (isnan(x) || isnan(y)) {
    // Do something about NaN
  }
  // Do something with x and y
}
```

And if you compile it with finite-math-only, the optimizer will just eliminate the NaN checks. More generally, the optimizer will treat any code that it can prove produces NaN as UB and potentially eliminate it. That’s allowed by the option, but I don’t think it’s what most people want. You can’t even always fix that by using a pragma like float_control(precise, on) to protect the NaN checks, because the value tracker can still deduce that a value isn’t NaN in an instruction without the nnan flag set if it sees that the flag is set in the instruction that produced the value. I’m sure you remember my complaints about that from this thread.
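Here is a sketch of what the float_control attempt looks like (my own example, using the Clang/MSVC-style pragma; as noted above, it is not always enough, because value tracking can carry nnan facts in from the instruction that produced the value). Under default, non-fast-math compilation the guard behaves normally.

```c
#include <math.h>

float guarded(float x) {
  /* Attempt to keep the NaN check alive under fast-math by forcing
     precise semantics for this block. This works only if the incoming
     value was not already tagged nnan by earlier instructions. */
#pragma float_control(precise, on)
  if (isnan(x))
    return 0.0f;   /* intended NaN fallback */
  return x * 2.0f;
}
```

With default flags, guarded(NAN) takes the fallback path and guarded(2.0f) returns 4.0f; the whole point of the discussion above is that under -ffinite-math-only the fallback branch may silently disappear.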

On X86 the no-NaN problem is even more insidious because the ucomiss instruction, which is often used for equality comparisons, sets the ZF, PF, and CF flags if one of the operands is NaN. When the nnan flag is set in the IR, the X86 backend doesn’t bother checking PF, so it reports that NaN is equal to everything. That makes for faster code if you really never have NaN values, but if one shows up, it can lead to a nasty bug.
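The IEEE behavior being given up here is that NaN compares unequal to everything, including itself. A trivial comparison function (my own example) shows what default compilation guarantees and what the nnan-optimized ucomiss sequence silently breaks:

```c
#include <math.h>

/* Under default semantics this is false whenever either operand is NaN
   (the backend checks PF after ucomiss). Under nnan, the PF check is
   dropped and NaN can compare equal to anything. */
int nan_equal(float a, float b) {
  return a == b;
}
```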

All of that is what you signed up for if you use -ffinite-math-only, but I suspect many people don’t realize they are signing up for it when they use -ffp-model=fast, and I’d guess that some don’t even know -ffinite-math-only is going to be interpreted that broadly.

BTW, you make a good point about checking reassoc when that’s all we need, rather than isFast(), because the change I am proposing will cause isFast() to return false.