There’s a runtime error in povray from SPEC 2017 FP on AArch64, and it seems to be related to fast-math flags. With -O3 or -Ofast -ffp-model=precise it runs fine, but it fails with -Ofast or -O3 -ffp-model=fast.
This seems to have been discussed earlier here and in the subsequent replies. The difference is that the earlier discussion concerned SPEC 2006, whereas I now see problems in SPEC 2017 too. The error I see is this one:
Parse Error: Viewing angle has to be smaller than 180 degrees.
which I suspect is still related to the fast-math settings, given the previous discussions and the fact that it passes with non-fast-math flags.
Before I dig more into this, I was curious if others are seeing the same thing and/or have some thoughts on this.
Sidenote: I’d really like it if we could delete the -Ofast flag, and require people to use -O3 -ffast-math explicitly, if that’s what they actually want.
I find it to be an extremely unfortunate option, because it conflates optimization level with the semantics-breaking changes of -ffast-math, while appearing to be simply another -O<level> like -O2. In my experience, nobody (with the exception of this thread) expects that an optimization-level flag breaks IEEE float behaviors.
I certainly wouldn’t encourage people to use -Ofast, but it’s a gcc flag and just deleting it would present compatibility problems.
My take on this is that I don’t like ‘nnan’ and ‘ninf’ being part of -ffast-math. Again, that’s a gcc option and we’re following their definition of it, but I think LLVM is a bit more aggressive in optimizing based on ‘nnan’ and ‘ninf’ than gcc is. An older version of the gcc documentation described “-ffast-math” as “Might allow some programs designed to not be too dependent on IEEE behavior for floating-point to run faster, or die trying.” It’s at least honest, but I’d prefer not to die trying.
Most of the fast-math options just allow value-changing optimizations, but ‘ninf’ and ‘nnan’ can lead to completely incorrect code generation if NaN or inf values are encountered (not-so-fun fact: all values can compare as equal to NaN with clang or gcc on x86-based targets if you use fast-math; see Compiler Explorer). I know that’s what you’re signing up for when you use nnan and ninf, but I’m not sure most users understand that’s what they’re getting with -ffast-math and -Ofast. They might actually want -funsafe-math-optimizations, but that sounds worse than -ffast-math, doesn’t it?
I’d at least like to take the no-honor-nans and no-honor-infinities behavior out of -ffp-model=fast. Would anyone object to that?
IIRC -ffinite-math-only makes a substantial contribution to the performance gain provided by -ffast-math. Removing it from -ffp-model=fast may cause a performance drop, which is undesirable for users who rely on this option.
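For code that wants to know which mode it was built in, a small sketch follows. It relies on the predefined macros `__FINITE_MATH_ONLY__` and `__FAST_MATH__`, which GCC and Clang provide; the function names are hypothetical, and other compilers may not define these macros at all, which is why the checks are guarded.

```c
/* Returns 1 if this translation unit was built with finite-math-only
 * semantics (GCC and Clang set __FINITE_MATH_ONLY__ to 1 in that case),
 * otherwise 0. The defined() guard covers compilers without the macro. */
int finite_math_only_enabled(void) {
#if defined(__FINITE_MATH_ONLY__) && __FINITE_MATH_ONLY__
    return 1;
#else
    return 0;
#endif
}

/* Returns 1 if the compiler reports full fast-math mode, otherwise 0. */
int fast_math_enabled(void) {
#ifdef __FAST_MATH__
    return 1;
#else
    return 0;
#endif
}
```

A project that depends on NaN or infinity handling could turn the same macro check into a build-time `#error` instead of a runtime query.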
The root of the problem is aggressive compiler behavior: it ignores explicit user checks made with calls to functions like isnan or with compare instructions like this.
Similarly, 5a36904c515b was reverted because of a surprise from this kind of change. I think I just need to resubmit it; as defined, nothing wrong is happening.
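A sketch of the kind of explicit check being described, together with an integer-based variant that some users fall back on. Both function names are hypothetical, and the second function assumes `double` is IEEE 754 binary64. Under -ffinite-math-only the compiler may fold `isnan(x)` to 0 and drop the guard in the first function. The bit-pattern version avoids floating-point comparison, but whether it survives optimization is not guaranteed by the flag’s semantics either.

```c
#include <math.h>
#include <stdint.h>
#include <string.h>

/* The explicit guard a user might write. With "no NaNs" assumed,
 * the compiler may treat isnan(x) as always false and return x. */
double replace_nan_with_zero(double x) {
    return isnan(x) ? 0.0 : x;
}

/* Assumption: double is IEEE 754 binary64. A NaN has all 11 exponent
 * bits set and a nonzero 52-bit mantissa. The test uses only integer
 * operations on the copied bit pattern. */
int is_nan_bits(double x) {
    uint64_t bits;
    memcpy(&bits, &x, sizeof bits);
    uint64_t exponent = (bits >> 52) & 0x7FFu;
    uint64_t mantissa = bits & 0xFFFFFFFFFFFFFULL;
    return exponent == 0x7FFu && mantissa != 0;
}
```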
It really depends on what the specification for -ffinite-math-only is supposed to be. GCC describes it as:
Allow optimizations for floating‐point arithmetic that assume that arguments and results are not NaNs or ±Infs.
But, in fact, neither GCC nor Clang actually implements that specification. Instead, in both compilers, it’s closer to “assume no Inf or NaN values are ever created/accessed by anything”
There’s a significant distinction: the GCC documentation definition would not prohibit NaN/Inf values from being assigned, compared, converted from float->double or vice-versa, queried with isnan, etc. Only arithmetic could be assumed not to receive, or result in, these values.
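To illustrate the distinction, here is a small sketch with each operation labelled. The function name `sum_skipping_nan` is hypothetical. Under the documented definition, only the addition is arithmetic, so the `isnan` filter would have to be preserved. Under the semantics the compilers actually implement, the filter may be removed because no NaN is assumed to exist anywhere.

```c
#include <math.h>
#include <stddef.h>

/* Sums the non-NaN entries of an array of floats. */
double sum_skipping_nan(const float *values, size_t count) {
    double total = 0.0;
    for (size_t i = 0; i < count; ++i) {
        float v = values[i];      /* load/assignment: not arithmetic */
        if (isnan(v))             /* classification query: not arithmetic */
            continue;
        double wide = (double)v;  /* float->double conversion: not arithmetic */
        total += wide;            /* arithmetic: operands and result assumed finite */
    }
    return total;
}
```

Built without fast-math flags, an input of `{1.0f, NAN, 2.0f}` sums to 3.0; with -ffinite-math-only the result is no longer something the program can rely on.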
I don’t really mind what the semantics are, because I don’t think anyone should ever use -ffast-math (nor any of the more detailed flags), regardless…
But, perhaps users really do want the GCC documentation’s definition, not the GCC and Clang implementation…
I’m not exactly sure which “that” you’re referring to, but I do believe your commit is a correct optimization both under LLVM IR semantics and under the semantics I described as what Clang and GCC currently provide. It is, however, invalid under the semantics GCC documented.
If we want to provide the documented-not-implemented semantics, we should likely do so by having Clang emit nofpclass/nnan/ninf in far fewer locations, not by modifying IR semantics. In which case, your patch is still correct.
Perhaps the problem is that the description is too vague. When it says “arguments and results” does it mean arguments and results of function calls? I suppose that’s a natural way of reading it, but that’s obviously not the way it’s ever been handled. I’ve interpreted (misread?) it as “operands and results of instructions…” because that’s closer to the way it’s been implemented. With regard to the LLVM IR language definition, I think only “operands and results of instructions” makes sense, because why would an “fadd” instruction care about the rules for arguments and return values of function calls?
In any event, assuming we intend for it to mean what the implementation assumes it means, we should clean up the documentation.