RFC: Deprecate -Ofast

jcranmer · May 2, 2024, 9:04pm

The C standard talks about the observable behaviour. The next Clang
would give different results for -Ofast than all previous versions?

The set of non-value-preserving-optimizations that -ffast-math enables
is in constant flux. I started building a list of such optimizations a
couple of months ago, and it’s already out of date, especially as we’ve
been working on doing a lot more optimizations around trigonometric
identities and the like. And this isn’t just restricted to -ffast-math
as a whole: even individual flags within fast-math have had a shifting
definition, in part because they’re so poorly defined (I put up an RFC
earlier today about clarifying these semantics).
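
As a minimal illustration of what "non-value-preserving" means here: floating-point addition is not associative, so regrouping a sum can change the result. This is only a sketch with hypothetical function names, not a statement about which transforms any particular Clang version performs.

```c
/* Hypothetical helpers: the same three-term sum with two different groupings.
   Under strict IEEE 754 double arithmetic the grouping is observable. */

/* Evaluates the sum with the left-to-right grouping written in the source. */
double sum_as_written(double a, double b, double c) {
    return (a + b) + c;
}

/* Evaluates the grouping a reassociating optimizer would be free to pick. */
double sum_reassociated(double a, double b, double c) {
    return a + (b + c);
}
```

With a = 1e20, b = -1e20, c = 1.0 and no fast-math flags, sum_as_written returns 1.0 while sum_reassociated returns 0.0, because the 1.0 is rounded away when it is added to -1e20 first.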

While it’s on my mind, I should also point out that the optimizations we
enable under -ffast-math aren’t the same set of optimizations gcc
enables (neither is a subset of the other, even), and ditto for
equivalent flags on all the other compilers.

sjoerdmeijer · May 3, 2024, 8:46am

Ignoring whether this is technically the right thing to do or not, I am in the camp that changing the behaviour of an existing option will guarantee us many years of confusion and user questions: “hey, why is Clang slower/different than GCC with -Ofast?”. Therefore I am not sure changing the behaviour of -Ofast will be an improvement; I think we will trade one problem for another.

I also think that option compatibility with GCC should not be underestimated. Clang can’t be held hostage by GCC behaviour, but option compatibility is very valuable and something we should always aim for. For example, the confusion between GCC’s -Os (minimum size) and Clang’s -Os (compromise of speed and size, -Oz is minimum size) is just never ending.

Our very poor documentation is probably not helping either, but I am guessing the reason is that many toolchains based on Clang have their own documentation.

jeffhammond · May 3, 2024, 11:08am

Because the user also added -Ofast rather than -O3?

Are you suggesting that -std= must have fully ISO C++ standard conforming behavior no matter what other flags are added? How about when the Clang options -fopenmp, -ObjC++, -fsycl, -fno-exceptions or things that turn on CUDA and HIP support are used?

Should we remove support for all flags that are not strictly compliant with ISO C++? Why give users any freedom at all? Perhaps some people in this thread should just decide on the only acceptable flags for Clang, hard-code those, and remove all the others?

I truly do not understand the level of antipathy and distrust towards users in this thread. Some Clang developers apparently think we are all fools who use Clang as a random number generator and don’t know how to read.

tschuett · May 3, 2024, 11:19am

I wanted to hint at the surprise that -Ofast was silently changed to an alias for -O3 without fast math and without telling users.

I still claim deprecating and removing -Ofast is a safer approach.

rengolin · May 3, 2024, 11:41am

You’re reading too much into the answers in this thread and taking an aggressive stance that is not helpful.

You felt insulted when people expressed ideas different from your own and hinted at ignorance. Now you’re claiming antipathy and distrust, and imputing behaviours that I’m pretty sure the authors did not intend (I have known most of them over the years).

It would be helpful if you tried to understand where these arguments are coming from, i.e., developing a compiler that has its own internal design but having to map the public API of some other compiler that, despite multiple attempts (some on my part), did not reciprocate collaboration in the slightest on said APIs.

The biggest flaw here is that for too long we’ve been silently pretending to behave in that way, doing our best to match some expectations, but not wanting to incur the same design flaws and broken behaviour.

I have personally been through this battle with Android, the Linux Kernel, some Linux distros, Chromium and the stance that most people are demonstrating in this thread has made those projects better by fixing their code instead of breaking yet another compiler. Too many developers in those projects have thanked Clang developers for standing behind their design decisions and behind the IEEE/ISO standards. That was, and still is, a commendable effort.

But as I think you know, every case is different, and fast-math means different things to different people. Claiming your interpretation is the right one is short-sighted.

That is why I was alluding to documentation. We don’t have documentation like GCC on what our passes mean and how they compose because we’ve been piggy-backing on their API for too long. In this case, GCC isn’t wrong to not collaborate with us on their API. But coming up with a different API is also problematic.

So my proposal is simple: stick to GCC docs for the things we do the same, write our own docs for the things we do differently. We should also strive to have a clear explanation of why we made those choices.

I also encourage non-clang folks like yourself to write up how you use the API and why it matters, so that the design can be reasonable, not shoe-horned.

This will invariably mean people will have to change their makefiles depending on the compiler, but honestly, this is probably the best thing to do anyway.

jeffhammond · May 3, 2024, 11:52am

No, I am not insulted by any of this. My issue is with how generic Clang users are perceived. I dislike the repeated false assertions that -ffast-math discards all notions of accuracy and the general ignorance of how numerical computation works. Accuracy is much deeper than whether primitive arithmetic behaves a certain way.

The antipathy towards users is encapsulated in comments like yours that it’s actually the best thing to break downstream user experience because the developers of Clang don’t document what -Ofast does well enough for users to understand that it implies -ffast-math and therefore isn’t strictly IEEE754 conforming.

erichkeane · May 3, 2024, 1:02pm

The concern here is that currently, -Ofast has already resulted in many years of confusion and user questions, “hey, I just wanted my program to be fast, but now you took away my NaNs, what gives?”
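
A minimal sketch of the "took away my NaNs" surprise, assuming a hypothetical function that uses NaN to mark missing samples. Built without fast-math flags it skips the NaN entries; under -ffast-math (and so under -Ofast) the compiler is allowed to assume NaNs never occur, so the isnan() test may no longer do what the author intended.

```c
#include <math.h>

/* Hypothetical example: averages a series in which NaN marks a missing sample.
   The isnan() test is only reliable when the compiler honors NaNs. */
double mean_skipping_nan(const double *values, int count) {
    double total = 0.0;
    int used = 0;
    for (int i = 0; i < count; ++i) {
        if (!isnan(values[i])) {
            total += values[i];
            ++used;
        }
    }
    /* Return 0.0 for an empty or all-missing series to avoid dividing by zero. */
    return used > 0 ? total / used : 0.0;
}
```

For the input {1.0, NAN, 3.0} a build without fast-math returns 2.0; what a fast-math build returns is not something the code can rely on.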

While I am sympathetic to the folks that use -Ofast intentionally, I believe we have to balance the needs of BOTH less experienced and more experienced devs.

In this case, I think we should bias towards the less experienced devs. They are less likely to read documentation or understand what flags do, and end up using -Ofast not realizing what it does. This is compounded by the fact that -Ofast doesn’t have nearly the reputation that -ffast-math does, which, despite its awful name, is well known amongst developers as “might break my math”.

For HPC/experienced devs: We’ve already acknowledged that they read the documentation and have a great level of expertise as to what -ffast-math does. I think we should do some level of deprecate-and-document (plus perhaps diagnose) the flag. The HPC/other experienced devs can be trusted to read the documentation and understand the change, and makefiles are changed quite easily with sed in this case.

mcinally · May 3, 2024, 1:12pm

Again, your argument about users not reading/understanding the documentation is bogus when there is no documentation.

How do you know that documenting -Ofast wouldn’t fix the problems you claim? Have we tried it?

erichkeane · May 3, 2024, 1:23pm

GCC has had it documented for decades, and we took it from them. Documenting it is clearly not enough.

mcinally · May 3, 2024, 1:26pm

Subjective.

Clang is not GCC. I don’t read CocaCola’s nutrition facts when drinking a Pepsi.

erichkeane · May 3, 2024, 1:32pm

That is a disingenuous argument. It’s well established that Clang has historically emulated GCC with flags/behaviors in an attempt to be compatible. This is more a Coke vs Diet Coke situation.

That said, as further evidence: I have a decade of experience copy/pasting documentation in response to bug reports/questions in IRC/discord/etc that shows we get a ton of newbies who don’t know enough to read documentation.

I suspect that this is the behavior of most new devs. Additionally, new users rarely understand the consequences, even after reading documentation.

Right now, we put an undue burden on new devs. The RFC here suggests putting a very mild burden on a subset of experienced devs. I think that to be a good trade.

rscottmanley · May 3, 2024, 2:16pm

If you are working on a program which is sensitive to floating point optimization or behaviour of NaN/Inf/Subnormals, you are almost certainly a professional. Is it really such a large ask to require professionals to actually understand a tool they must use?

You’re saying that users will never read the manual, but in that case why would users that want maximum performance be savvy enough to add -ffast-math? Surely -Ofast is a better user experience from their POV.

All this change would do is shift work around from people that don’t know what they are doing to people that know what they are doing. If -Ofast didn’t already exist I’d be on board with all the arguments presented for not adding it, but as many have pointed out, it does, and not only that, it exists in compilers beyond gcc and clang. Clang would become the outlier.

I am mostly hoping the longer this thread goes the sooner this gets picked up by reddit/social media. Then users will know what -Ofast does, making the RFC moot.

erichkeane · May 3, 2024, 3:13pm

Our experience as a compiler has shown that isn’t the case unfortunately. Also note that MANY clang users are NOT professionals, and are just other open source devs, hobbyists, etc, that are ALSO surprised by this behavior. Anyone ‘professional enough’ to understand a tool they must use should ALSO be ‘professional enough’ to read our release notes/warnings/whatever we do here to cope with this trivially.

There are two types of users: those who see ‘-Ofast’ and say, “ooh, I like fast, let’s do that!”, and those who read the documentation and understand what is going on before using a flag. The proposal here is to protect the former group, while only mildly inconveniencing the latter group.

That seems consistent with the guidance and direction of the project: we are aiming to, within the standards, make the sharp corners not quite as sharp, and this is a particularly sharp corner.

It moves a LOT of work from people who don’t know what they are doing to a SMALL amount of work for those who do. In less time than it took them to read our release note, they could do the sed command and be fine with it.

rscottmanley · May 3, 2024, 5:09pm

There’s a thread discussing this RFC on the Phoronix Forums: “Proposal Raised To Deprecate "-Ofast" For The LLVM/Clang Compiler”. It’s perhaps a slightly better proxy for users than this thread is. At minimum, there is simply no consensus there on this option, which argues for the status quo until a more acceptable proposal is offered.

The proponents of this RFC keep coming back to what’s the “best for the user”. Does -O3 include best possible debugging information if we’re gearing things for “new users” rather than experienced ones? Is the best new user experience, -O3 or -O0? What about for embedded devs – is -Os a better default? How is Clang weighing which users should be prioritized? Implicitly, the best new developer experience is default (no) options. That’s -O0. Clang requires everyone else to opt in for -O2 or -O3. Clang requires everyone to opt in to -march/-mcpu instead of using host target as default (which isn’t even possible!) and yet how many bug reports come from “why isn’t my code generating (ISA) instructions”? If we’re saying users need to opt in to these things as they become more experienced then why is that same standard not applicable to -Ofast?

What does “best” mean, and for whom? Is there a definition documenting this mandate? Maybe that should be answered as a first step so Clang can canonicalize the behaviour of all major options.

JonChesterfield · May 5, 2024, 9:29pm

allow-store-data-races
E.g. 97309 – Improve documentation of -fallow-store-data-races

That’s exciting!

If we deprecate Ofast, or alias it to O3 which I agree is essentially the same as deprecating it, we give up the existing syntax for “faster program with different semantics please”.

That’s especially tempting for this audience as the proper stance is that changing semantics is miscompilation, not optimisation.

However an alternative perspective is “my program is useful but takes too long, give me one which works kind of similarly but runs faster”, and that is a totally legitimate use case which is presently somewhat served by Ofast.

A domain-specific thing is something in games dev where the aggregate behaviour is still entertaining and you don’t care hugely how rounding happens (or perhaps how data races end up ordered).

A language-specific thing in this area is openmp reading environment variables to branch on inside GPU kernels. Standards compliant, convenient for people toggling mutable globals while the program runs, not desperately clever from an optimisation perspective. So an openmp implementation with an option for “environment variables controlling codegen are ignored” should probably bundle that up with Ofast and definitely not with O3.

I’d prefer -Ofast-and-wrong as the name but given the prior art and tendency to compile things with gcc and clang, I’d agree that ship has sailed. Users get to learn that Ofast changes semantics, either from docs or from accumulating scars.

I think we should keep Ofast as the name for what it currently is and accept that some users want the compiler to change the semantics of their program. Over time, maybe it should gain more break-my-stuff-and-I-like-it style transforms. Assuming int pointer arguments to functions don’t alias each other probably makes code faster and won’t break all of it, stuff like that.
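
To make the aliasing idea concrete, here is a small sketch (hypothetical function, not an existing Clang transform) of code whose result depends on whether two int pointer arguments refer to the same object. A compiler that assumed the arguments never alias could read *delta once and reuse it, which would change the answer in the aliased call.

```c
/* Hypothetical example: adds *delta to both *a and *b.
   Each statement re-reads *delta, so the result depends on whether
   delta points at the same int as a. */
void add_to_both(int *a, int *b, const int *delta) {
    *a += *delta;
    *b += *delta;
}
```

With x = 1 and y = 10, calling add_to_both(&x, &y, &x) under standard C semantics leaves x == 2 and y == 12; caching the first read of *delta would give y == 11 instead.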

Fun discussion either way, thanks for bringing it up.

arsenm · May 6, 2024, 8:58am

-Ofast was only added in 2012

erichkeane · May 6, 2024, 2:21pm

2020 was 15 years long.

andykaylor · May 6, 2024, 4:46pm

I was looking through the driver code for something mostly unrelated that I’m working on, and I happened to notice that -Ofast also currently enables strict aliasing by default, though that may only impact people using clang on Windows in CL-mode, since strict aliasing is on by default elsewhere.
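
For anyone wondering whether that matters to their code, a rough sketch of the kind of pattern involved: reading a float's bits through a cast pointer relies on strict aliasing being off, while copying the bytes with memcpy is well defined either way. The helper name is hypothetical and the sketch assumes a 32-bit IEEE 754 float.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: returns the bit pattern of a float.
   Assumes float is 32 bits wide. Copying with memcpy stays well defined
   when strict aliasing is enabled, unlike *(uint32_t *)&value. */
uint32_t float_bits(float value) {
    uint32_t bits;
    memcpy(&bits, &value, sizeof bits);
    return bits;
}
```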

AaronBallman · May 7, 2024, 2:53pm

Thank you for raising this RFC and thanks to others for discussing different perspectives on it!

My personal opinion (not code owner decision) is that we should deprecate and remove -Ofast. I don’t think we should alias it to -O3 because the effects are too surprising, but a deprecation warning and eventual removal helps reduce the surprise significantly, especially because removal becomes a hard error where the code no longer compiles (as opposed to automatically behaving like -O3). Because there is plenty of evidence of the optimization flag being used in the wild and the difficulties that sometimes arise from maintaining a build system, I think we should have a fairly long deprecation period (perhaps four releases, roughly two years) before removing the option, but I’d be curious to hear what @jyknight has in mind for this.

-ffast-math is a language dialect, and it is very surprising to users who don’t read documentation (e.g., many/most users) that an optimization flag will opt them into a language dialect. Optimization flags are intended to take correct code and make it faster, not to change the semantics of correct code. The fact that Clang doesn’t document this option appropriately is unfortunate, but documentation will not help users because the name of the option is the issue. Users see “there’s an optimization level to make my code fast? Great, that’s the one I want to use.” and documentation will only help a small percentage of them to understand the nuances.

I realize there are strong opinions on this RFC, but given how surprising it is for an optimization level to opt you into a non-conforming language dialect and how trivial the workaround is for users to explicitly write -O3 -ffast-math, which may force some number of users to actually think about what that means, I think deprecation and removal is the correct path forward.
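
Whichever spelling a build ends up using, code that cares can check for the dialect itself. The sketch below assumes the compiler predefines the __FAST_MATH__ macro in fast-math mode, as GCC documents for -ffast-math; the function name is hypothetical.

```c
/* Hypothetical helper: reports whether this translation unit was compiled
   in fast-math mode. Assumes the compiler predefines __FAST_MATH__ in that
   mode, as GCC documents for -ffast-math. */
int built_with_fast_math(void) {
#ifdef __FAST_MATH__
    return 1;
#else
    return 0;
#endif
}
```

A library that depends on NaN or infinity handling could use a check like this to warn at startup rather than silently producing different results.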

Also, it’s worth noting that “GCC behaves this way” is not a particularly compelling technical argument. Clang is not a drop-in replacement for GCC; we aim for compatibility when we think that’s best for users, but this is a case where I think the problems outweigh the benefits of compatibility, especially if we are loudly alerting users that the flag is deprecated. That said, I would encourage reaching out to GCC developers to see if they have an appetite for change as well.

In terms of this RFC, I think the best path forward is to continue gathering feedback until the discussion dies down in case there are other technical arguments made in either direction, and then call consensus from there.

kparzysz · May 7, 2024, 4:44pm

I think as long as we appear to be a drop-in replacement for GCC, users will act as if we were.

If we don’t want to be tied to GCC’s behavior (at least in the eyes of our users) we should make it harder for people to think that we are. In the extreme that could mean deprecating all options and inventing our own, but in the scope of -Ofast deprecating/removing it is much better than silently changing it to do something different.
