I recently posted an RFC under the clang category proposing that -ffp-contract=fast should honor pragmas rather than requiring -ffp-contract=fast-honor-pragmas (see [RFC] Honor pragmas with -ffp-contract=fast). There seemed to be good support for this as a general concept, but it hit a roadblock when I looked at how various backends handle floating-point contraction.
I proposed implementing the change by having clang set TargetOptions::AllowFPOpFusion = FPOpFusion::Standard rather than FPOpFusion::Fast for -ffp-contract=fast. Unfortunately, this caused several backends to stop generating FMA operations entirely. An implication of this discovery, BTW, is that these backends will not generate FMA operations if -ffp-contract=fast-honor-pragmas is used.
After some investigation, I found that there is a general lack of consistency in the way that backends handle floating-point contraction, and that many backends haven’t been updated to recognize that contraction can be enabled or disabled at the instruction level. These backends rely on either a global flag that enables contraction everywhere or the UnsafeFPMath attribute, which enables it at the function level.
The big problem with the global option is that it doesn’t allow for mixed modes within a module. The option naming points to C pragmas as a way that contraction can be turned on or off, but you can also get into a mixed state by using LTO and linking IR from sources that were compiled with different -ffp-contract options. I’m sure this is OK for a lot of use cases, but given that we have the ability to express at the instruction level whether contraction is allowed or not, I’d like to see us bring all in-tree backends up to date with that aspect of the IR.
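To illustrate the kind of mixed state described above, here is a minimal sketch of a single translation unit in which one function forbids contraction and another permits it. The function names are hypothetical; the pragmas are clang's `#pragma clang fp contract(...)`. Whether the second function actually becomes an FMA depends on the target and on the backend honoring the per-instruction flag, which is exactly the issue under discussion.

```cpp
#include <cmath>

// Hypothetical example functions; only the pragmas are real clang syntax.

// Contraction is explicitly disabled here: the multiply and the add
// should each be rounded separately.
double mulAddNoContract(double a, double b, double c) {
#pragma clang fp contract(off)
  return a * b + c;
}

// Contraction is explicitly allowed here: the compiler may (but need not)
// fuse the multiply and add into a single FMA with one rounding.
double mulAddContract(double a, double b, double c) {
#pragma clang fp contract(fast)
  return a * b + c;
}
```

With a single global switch in the backend, both functions end up being treated the same way; with instruction-level flags, only the operations in the second function carry permission to contract.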
As I said, the implementation of this is fairly inconsistent. I do see two common patterns. One is some form of:
    allowFMA = Options.AllowFPOpFusion == FPOpFusion::Fast ||
               Options.UnsafeFPMath;
The other is some form of:
    allowFMA = Options.AllowFPOpFusion == FPOpFusion::Fast ||
               Options.UnsafeFPMath ||
               Node->getFlags().hasAllowContract();
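To make the difference between the two patterns concrete, here is a self-contained sketch of what a shared helper might look like. The types below (`FPOpFusionMode`, `TargetOptionsSketch`, `NodeFlagsSketch`) and the function `mayFuse` are hypothetical stand-ins, not the real LLVM classes, and requiring the flag on both the multiply and the add is my assumption about what a common rule could be rather than a description of any existing backend.

```cpp
// Hypothetical stand-ins for the LLVM types mentioned in this post.
// They only model the fields needed to show the decision logic.
enum class FPOpFusionMode { Fast, Standard, Strict };

struct TargetOptionsSketch {
  FPOpFusionMode AllowFPOpFusion = FPOpFusionMode::Standard;
  bool UnsafeFPMath = false;
};

struct NodeFlagsSketch {
  explicit NodeFlagsSketch(bool Contract = false) : AllowContract(Contract) {}
  bool hasAllowContract() const { return AllowContract; }
  bool AllowContract;
};

// Returns true if a multiply feeding an add may be fused into an FMA.
inline bool mayFuse(const TargetOptionsSketch &Options,
                    const NodeFlagsSketch &MulFlags,
                    const NodeFlagsSketch &AddFlags) {
  // Global switches: these enable contraction everywhere, as in the
  // first pattern.
  if (Options.AllowFPOpFusion == FPOpFusionMode::Fast || Options.UnsafeFPMath)
    return true;
  // Node-level check (assumption: both operations must permit contraction).
  return MulFlags.hasAllowContract() && AddFlags.hasAllowContract();
}
```

With the global mode left at Standard, this helper fuses only operations that carry the flag, which is the behavior the first pattern cannot express.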
Unfortunately, it’s not a simple matter of just adding the node-level checks everywhere that the global flag is being checked. In many cases, the global check is being performed in a way that makes it difficult to check the node flags on all nodes that would be involved. For example, the NVPTX backend has allowFMA and noFMA predicates in its pattern matchers that don’t reference the nodes involved at all.
Does anyone have any ideas on how to get started on moving all backends to a common level of support for the contract flag at the node level? Would anyone object to such a project?