RFC: Deprecate -Ofast

but potentially significant numerical differences for reductions.

I see the argument about unexpected performance changes, but this makes no sense whatsoever. Anyone using -ffast-math is claiming they don’t care about the numerical accuracy of their results. The flag does not promise some particular non-standard outcome; rather, it says the compiler can do “whatever it wants” for certain enumerated cases. It would most certainly be correct for the compiler to produce the same results under -ffast-math as it is required to produce without the flag.

Anyone expecting otherwise is making a grave error in using -ffast-math in the first place.

2 Likes

but this makes no sense whatsoever. Anyone using -ffast-math is claiming they don’t care about numerical accuracy of their results

That’s too simple a view of floating point. Anyone using fast math with their code continues to do so because it’s accurate enough. If it suddenly did not validate, they would probably stop using -Ofast, or more likely switch to -Ofast -fno-whatever-option, if such an option existed to disable whatever problem they were having. At any rate, the point is: if someone has established their baseline with -Ofast results and it suddenly produces -O3 results, that’s incredibly disruptive, and I don’t see why that is less important than someone who can’t be bothered to read the manual.

3 Likes

Based on the description in the GCC docs, I see no problem with this. Can the compiler prove that global data will never be written to by threads? No. Can the user? Yes. Why shouldn’t users be allowed to assert things they can prove to be true but which the compiler cannot?

My guess is this option is used primarily in Fortran codes, so it’s probably not relevant to Clang anyways.

Indeed, there is an entire discipline of software engineering known as Verification and Validation focused on ensuring that software produces the correct answers. It is not based on trivial unit tests but on end-to-end integration tests of complete applications.

Introduction to Verification Validation (V&V) and Uncertainty Quantification (UQ). (Conference) | OSTI.GOV provides some basic definitions. I hope folks trust the methods used there, because these people build nuclear missiles and they have very compelling motivations for the numerical accuracy of their simulations.

I work on an application, NWChem, which uses -ffast-math extensively. We have a very large number of integration tests in our QA suite, in addition to the integration testing the developers do privately when introducing changes to numerically sensitive portions of the code. Everything I added to NWChem was consistent with external reference values to 13 or 14 decimal digits.

To say that all users of -ffast-math don’t care about numerical accuracy is as insulting as it is ignorant.

3 Likes

Indeed.

We may have initially matched GCC really closely (in the dragonegg era) to be able to serve as a drop-in replacement, but that era is long gone. Today entire systems (including kernels) compile with clang on its own, not as a GCC replacement.

We should not shoot ourselves in the foot just because GCC does. That has been a design decision in Clang since the beginning (e.g. -fheinous-gnu-extensions et al.). And it has made Linux (kernel & system), FreeBSD, and other very large projects better because of it.

However, IIUC, -Ofast is a GCC mnemonic. If we are emulating some user interface from GCC, then we should match the behaviour. By the same token, if we emulate the MSVC command line, we need to match its semantics.

We can, however, emit warnings when doing so. We can disable those warnings with something very obvious like -Wno-gcc-footgun or something.

We have explicitly selected out any GCC footgun in the past and we should continue doing that. Any passes that are known to be incorrect or destroy semantics beyond local scope should not be in a -O option, and should need to be added explicitly.

-ffast-math is more about selectively bypassing IEEE/ISO standard restrictions than about not caring about precision.

On the contrary: you care more about precision than about standard restrictions, and that’s why you disable them. This involves reordering, fusion, rounding, denormals, and signals, all of which are standard-specific (extra-cautious safety concerns, etc.) and not related to the maths they’re trying to represent.

That’s why -Ofast exists: to create “faster” non-conforming programs, which, in itself, is a valid proposition. And it’s documented, so it becomes “user error”.

My argument above was just what kind of thing we want to put into -Ofast from our side, and the answer has always intentionally been: “what we feel comfortable with”, not “whatever GCC does”.

Or, if we disable the flag altogether, that’s also a clear message, which has its own merits. But as seen in this thread, there are some very valid uses of -Ofast, so perhaps not the most user-friendly alternative.

I would continue to have -Ofast with the options we’re comfortable with, and let the community decide how to handle the footguns separately.

How about some middle ground: retain (most of) the behaviours important for performance, and drop (most of) the dangerous ones. For example,
-Ofast could just mean -O3 -fno-math-errno -fassociative-math -ffinite-math-only.

3 Likes

That the flag exists is okay, as you say, though I wouldn’t say it’s worth having. Making -Ofast imply it – and thus be incompatible with multi-threaded code – seems unexpected and a bad idea.

Let me clarify my position: I don’t mean to impute anything about what the user cares about – only what they’re informing the compiler they care about.

-ffast-math is a flag to tell the compiler not to care about numerical accuracy or IEEE conformance. The contract with the compiler is “I don’t care: give me better performance”. The compiler is thus permitted to make result-changing modifications to the code, such as reassociation, conversion of division into multiplication by an approximate reciprocal, and replacement of math functions with approximations. It does not require those to be done, nor does it take numerical accuracy loss, or gain, into account in the decision. It should never be considered a compiler correctness bug if the compiler, say, stops reassociating some particular expression in a new release.

Of course a user may use this option but still care about particular numerical results, e.g. by validating the resulting program’s correctness with an end-to-end test suite. Yet, that is inherently fragile in the face of compiler upgrades, and especially so if your program would produce incorrect results when built in a standards-conforming mode (as the previous post was suggesting was the case for some software)!

1 Like

-ffast-math is the user’s way of telling the compiler that they know their code is stable w.r.t. the non-IEEE transformations it enables, and some users actually read documentation to know what these are. I don’t know where Clang lists what it does, but I’ve looked at FloatingPointMath - GCC Wiki and none of the relaxed transformations bother me. Floating point isn’t exact, and very few of the codes I work on were designed by mathematicians to be accurate to 1 ulp anyway. If a code computes the Boys function with Chebyshev quadrature to 5 ulps, does it make a difference whether IEEE invsqrt is used or not? No, it doesn’t.

Particularly in the context of parallel execution, which is often non-deterministic and frequently implies reassociation of sums and products, the guarantees that IEEE 754 provides are overrated. Just because the arithmetic primitives are accurate does not mean the code as a whole is.

Clang has -O3 already. I fail to see any compelling reason to alias -Ofast to it when there’s at least 20 years of inertia behind using -Ofast. People that think -Ofast and -ffast-math are evil are encouraged to write papers and give technical talks informing the HPC community about this if they believe a change in user behavior is required.

2 Likes

This is one of the interpretations, but not the only one.

In HPC, people really care about precision and that’s why “fast math” became a list of different deviations from the standards (IEEE/ISO) that are “known to not violate the expectations of the underlying maths (or do so in a precise way)”.

IEEE talks about digital precision. For example, mul + add may not produce the same binary answer as a fused multiply-add (mla), so IEEE assumes precision is lost. But it’s often gained.

ISO C/C++ have to assume the program is the source of truth. Compilers are not allowed to change the semantics of what the programmer wrote, but in reality, problems have to be translated to programs, and often the original (maths) semantics is lost, just like when we convert from C++ to IR, to asm. Here, “fast math” can mean: “I don’t care about C++ FP semantics, but I do care about the original equation”.

In the end, -ffast-math only really means “don’t care” in non-scientific code; it is still very useful in scientific code, due to its often precise semantics of what to bypass/ignore.

3 Likes

I support @jyknight’s proposal.

Flags that enable non-conforming behavior leading to surprising behavior or bugs should be opt-in explicitly.
Given the concerns about changing existing code, I think a deprecation would solve some of the objections I’m reading in the thread.

It should not be that easy to opt in to footguns. “-Ofast, you say? Sure, I do want my code to be fast, much obliged.”
Part of the problem is that not all users will read the manual. Alas, the manual offers no disclaimer
(see Clang command line argument reference — Clang 19.0.0git documentation and Document optimization flags · Issue #20917 · llvm/llvm-project · GitHub).

The out-of-the-box behavior of Clang should be that of a conforming compiler (especially as far as errno and NaN handling are concerned), and while having the ability to trade conformance for performance is nice, the user can do that for themselves with something a bit more explicit than just “optimisation: fast!”. I am fairly confident that many users are unaware of the trade-offs they made.

The burden of explicitly asking people to replace -Ofast with -O3 -ffast-math is worth it if it gives users the opportunity to review
the set of flags enabling non-conforming or dangerous behavior that they use.

It would be a good opportunity to document some of these flags and their pitfalls to ease people off -Ofast.

2 Likes

Why stop there? Let’s deprecate -ffast-math as well and the user can really decide what non-conforming / dangerous options they’re ok with.

That’s obviously a rhetorical question. The answer is “convenience”. The umbrella flags let the industry come to a consensus on what boundaries can be ignored in the pursuit of peak performance. GCC does this with -Ofast more so than Clang, but there is still potential value there for Clang.

You mentioned that Clang doesn’t document -Ofast. As far as I’m concerned, any complaints about users not understanding a dangerous option are moot while there’s no documentation. Of course they got hurt. We didn’t warn them that it was dangerous. Using the lack of documentation (a community failure) as the motivation to subvert a decade+ of industry exposure is weak.

3 Likes

I vote for fewer surprises.

  • deprecate -Ofast, give some warning
  • document -Ofast
  • remove -Ofast

Why should -std=c++23 imply -opt-for-data-races?

Do we even have documentation for our options? Or do we still point to the GCC one?

1 Like

One thing not mentioned during this whole discussion is that ICC had -O4 as the equivalent of -Ofast before GCC added -Ofast.

1 Like

I don’t know about historically, but at least since ~2016 (though likely much longer), ICC turned on -ffast-math and -O2 by default (even with no opt setting!), and I believe ICX followed suit.

I think the arguments about this option being badly named have some validity, but I also think that ship has sailed. I have the same problem with -ffast-math and -funsafe-math-optimizations. Do I want my code to be fast? Of course! Do I want it to be unsafe? No way! So I choose -ffast-math, not knowing that it is actually less safe than -funsafe-math-optimizations. Yet that’s basically just something we’re stuck with.

I am not in favor of taking something away from users who do know what it does just because there might be people using it who don’t know what it does. That line of reasoning would eliminate most C++ features!

I support the suggestion that @chill made: instead of aliasing to -O3, alias to a set of options that is likely to be what -Ofast users want, removing the really hazardous stuff. I would suggest a different set of behaviors, though. Here I’ll reference my recent proposal: Making ffp-model=fast more user friendly. I think most people weren’t interested in that because few people are using -ffp-model, but the core of my argument is the same as what we’re discussing here.

The heart of it for me is that LLVM is now optimizing so aggressively based on the nnan and ninf fast-math flags and the related nofpclass attributes that they have become unusable except in very limited cases. This behavior is strictly consistent with what the finite-math-only option says, but it’s extremely aggressive.

This comes into play with regard to SPEC. There is a benchmark in the CPU2017 suite that uses infinities as a guard value. Until recently, you could compile this benchmark with clang -Ofast and it would pass because we weren’t optimizing away all the comparisons with infinity. Then we improved our ninf handling, and the benchmark started failing with -Ofast.

I agree with those who have said that just because -Ofast says we can violate IEEE-754 rules doesn’t mean we have to everywhere. However, I do think that we should at least do something that is consistent with the basic intention of the option.

My suggestion is -Ofast → -O3 -ffast-math -fno-finite-math-only -fcomplex-arithmetic=promoted -ffp-contract=fast-honor-pragmas

4 Likes

Here’s the discussion of the failure I mentioned with CPU2017: Fast math SPEC 2017 FP failure for povray

No thanks!

The C standard talks about the observable behaviour. The next Clang would give different results for -Ofast than all previous versions?

I still claim that the only option is to remove -Ofast.

1 Like

You already get that with -Ofast. The discussion I linked above about povray is exactly a case where the observable behavior using -Ofast changed between releases. That’s OK because -Ofast explicitly allows deviation from the standard (or at least it would be explicit if we documented it properly).

The possibility of numeric results changing from release to release is a fundamental risk you take when you use any of the options that enable fast-math behavior. There is no way around it. People who don’t understand that absolutely should not be using these options. But the options are too useful to just take away completely.

I just don’t see the point of deprecating -Ofast while we still have other options that do the same thing. That said, I would say that deprecating -Ofast is a better “solution” than making it alias to -O3. At least then people using the option would know that it was being taken away.

3 Likes