Currently when the option
-ffast-math is set, it alters the option
-ffp-contract in this way:
-fno-fast-math => if
-ffp-contract= fast then
-ffp-contract is unchanged
Connecting these two options have triggered some spurious warnings; see issue clang warning passing "-ffast-math -fno-fast-math" · Issue #54625 · llvm/llvm-project · GitHub.
The source of the problem is that the
-ffast-math option is an on/off flag, but the
ffp-contract is an on/off/fast flag. So when
fno-fast-math is used there is no obvious value for
Note that gcc doesn’t connect the two options.
There is a WIP patch open here: D123630 [WIP] Remove connection between ‘ffast-math’ and ‘ffp-contract’. (llvm.org)
The changes we are proposing will impact a few CodeGen and Driver LIT tests, listed in the patch above.
There will be performance regression for users who are expecting
ffast-math to enable
ffp-contract but because
ffp-contract is on by default, the impact would be minimal.
The clang changes we are proposing were exercised on the llvm-test-suite on the SingleSource benchmark. Originally this benchmark runs with the option
FMA disabled. I removed these flags and replace them with the
ffast-math flag. Because of the nature of the
ffast-math option, I have disabled the FP checks in the suite, to avoid the expected accuracy fails.
See the changes made for SingleSource in D124095 [WIP] Experimental changes (llvm.org) . With these changes, the SingleSource is passing using the flag
Were you able to get performance comparisons with the SingleSource benchmarks after your changes to the tests?
I would expect the difference between fp-contract=on and fp-contract=fast to be small and may not show up at all if the benchmark is specifically trying to trigger FMA. The most significant difference is that with fp-contract=on, the llvm.fmuladd() intrinsic will be generated if the mul and add appear in the same source expression, but not if they appear in different expressions. With fp-contract=fast, the intrinsic is never used, so there might be some cases where additional optimizations are performed across multiple operations, but in theory that can interfere with FMA formation.
In general, GCC has limited handling of fp-contract. I don’t believe it supports the STDC FP_CONTRACT pragma, and it doesn’t distinguish between fast and on. I think it only does fast, and that is effectively contract by the -std setting. So following GCC behavior is problematic, but since -f[no-]fast-math comes from GCC it kind of makes sense to have the same behavior.
To me, the strongest argument for the change you are proposing is that fp-contract is a three-state setting, while fast-math is only on or off, so there’s no clear answer as to what -fno-fast-math should do if we leave fp-contract connected to -f[no-]fast-math.
I’m not sure regressing a well-established invariant is a reasonable solution to resolve such a diagnostic.
Presumably that diagnostic should not be issued at all.
I compared the SingleSource benchmark with and without the fast-math patch. The original numbers are the exec time with clang (no changes) and the new numbers are with clang where the ffast-math option has no effect on ffp-contract (current RFC change).
There are about 1820 test cases. Among these 139 test cases have seen increase in their exec-time. 101 of them are gcc-torture tests and the 38 others are in various suites.
The gcc-torture suite has an increase varying from 20 to 175%. The rest varies between 1 and 200%.
It looks like there is no way of attaching a sheet or a word document. So attaching a png of a portion of the results.
I think most of the SingleSource binaries are mostly for testing and not really for benchmarking, so the insight from measuring them is likely limited. For benchmarking, SPEC and other benchmark suites likely will provide more reliable performance data.
I agree that changing
-ffast-math to behave differently seems sub-optimal. I think it makes sense for
-ffast-math to set
-ffp-contract=fast from the user’s perspective and it has been doing so for a long time.
IMO if the user provides
-ffast-math -fno-fast-math it would make sense to leave the original value of -ffp-contract` unchanged.
The problem isn’t so much with
-ffast-math -fno-fast-math as it is with things like
-ffp-contract=fast -fno-fast-math and
-ffast-math -ffp-contract=fast -fno-fast-math. The current logic sets fp-contract back to ‘on’ in these cases.
I suppose we could add more complicated tracking that would allow us to determine whether fp-contract was changed by an explicit -ffp-contract option. There’s still some ambiguity, but I guess we could define a rule that would be consistent and defensible. Suppose we see
-ffp-contract=fast -ffast-math -fno-fast-math. In this case, the user explicitly set fp-contract, but fast-math also set it. In that case, we should leave it at fast, right? And if the we see
-ffp-contract=off -ffast-math -fno-fast-math we’d set it to off. Where it becomes less clear is a case like
-ffp-contract=off -ffast-math with fast-math left on. Does the explicit fp-contract take precedence in this case? I think probably not.
That gives me a set of rules like this:
- If -ffast-math and -ffp-contract are both seen but -fno-fast-math has not been seen after the last instance of -ffast-math, the fp-contract setting is determined by whichever was seen last
- If -fno-fast-math is seen and -ffp-contract has been seen at least once, the last seen value of -ffp-contract is used
- If -fno-fast-math is seen and -ffp-contract has not been seen, the default setting of fp-contract is used
I think that would give users the expected behavior and we wouldn’t need a diagnostic.
Would everyone agree with this?
ffast-math → fp-contract=fast
ffast-math + fp-contract = val → fp-contract=val
ffast-math + fp-contract =val + fno-fast-math → fp-contract=val
ffast-math + fno-fast-math → fp-contract=on
fno-fast-math + ffp-contract=fast->fp-contract=fast
Minor correction here – the result should be fast for Cuda and HIP.
And a couple of cases, I’d add to make sure everyone is in agreement:
fp-contract =val + ffast-math + fno-fast-math → fp-contract=val
fp-contract =val + ffast-math → fp-contract=fast
In the first case you have that
-fno-fast-math does not affect the fp-contraction, while in the second case
-fno-fast-math does affect the fp-contraction.
I don’t understand why the behavior of
-fno-fast-math should depend on which options were listed before .
-fno-fast-math should behave like the other high level floating point options – i.e. it is just an alias for a sequence of low level options.
IMO the set low level properties that
-fno-fast-math affect should be the same.
ffast-math affect the fp-contraction property then also
-fno-fast-math should affect the fp-contraction property. Given that
-fno-fast-math should disable unsafe floating point optimizations then it makes sense to me that
-ffp-contract=on – i.e. fp-contraction is set to the default value.
The issue is that disabling unsafe optimizations ca be done using fp-contract=on or fp-contract=off, which really means taking the unchanged value of fp-contract in this rule: ffast-math + fp-contract =val + fno-fast-math → fp-contract=val.
If we want to follow what you are suggesting then we would have to live with these kinds of warnings which was the main reason for this RFC.
A combination of these 2 flags -ffast-math -fno-fast-math would produce this nonsense warning for the user:
warning: overriding ‘-ffp-contract=fast’ option with ‘-ffp-contract=on’
This makes sense to me! It would be good if this would also be clearly spelled out in the docs.
It is not clear to me what was the original reason to introduce such warnings, so I don’t know if the warning being emitted when the user-provided command line has
-ffast-math -fno-fast-math is intentional or not.
Maybe the wording of the warning should be improved to reference the user-provided options that implies a given low level option for clarity. E.g.:
warning: overriding '-ffp-contract=fast' option (implied by '-ffast-math') with '-ffp-contract=on' (implied by '-fno-fast-math')
Moreover I wonder if such warning should be emitted by default (perhaps it should be emitted only in pedantic or other opt-in mode) given that we should be documenting precisely which options are implied by the high level floating point options (
-ffp-model= and similar).
For example in https://clang.llvm.org/docs/UsersManual.html#cmdoption-ffast-math it is clearly stated that
-ffp-contract=fast, but there is no documentation of