Condition code in DAGCombiner::visitFADDForFMACombine?

I’m curious why the condition to fuse is this:

// Floating-point multiply-add with intermediate rounding.

bool HasFMAD = (LegalOperations && TLI.isOperationLegal(ISD::FMAD, VT));

static bool isContractable(SDNode *N) {
SDNodeFlags F = N->getFlags();
return F.hasAllowContract() || F.hasAllowReassociation();
}

bool CanFuse = Options.UnsafeFPMath || isContractable(N);
bool AllowFusionGlobally = (Options.AllowFPOpFusion == FPOpFusion::Fast || CanFuse || HasFMAD);
// If the addition is not contractable, do not combine.
if (!AllowFusionGlobally && !isContractable(N))
return SDValue();

Specifically the AllowFusionGlobally, I would have expected something more like:

bool AllowFusionGlobally = (Options.AllowFPOpFusion == FPOpFusion::Fast && CanFuse && HasFMAD);

or at the very least:

bool AllowFusionGlobally = ((Options.AllowFPOpFusion == FPOpFusion::Fast || CanFuse) && HasFMAD);

It seems that as long as the target can do fmad it does do fmad since HasFMAD is true.

Thanks.

So I have a test case where:

%20 = fmul nnan arcp float %15, %19
%21 = fadd reassoc nnan arcp contract float %20, -1.000000e+00

is being contracted in DAG to fmad. Is this correct since the fmul has no reassoc or contract fast math flag?

Thanks.

fmad is defined as the exact same result as the separate fmul + fadd, unlike fma so this is OK

-Matt

Matt,
I’m sorry, actually it’s fma not fmad.

In the post-legalizer DAG combine for the given code it’s producing fma not fmad. That doens’t seem correct.

The contract is on the fadd. I’m not really sure what the rule is supposed to be for contract between the nodes. The LangRef doesn’t clarify on this. I would assume it would need to be present on both?

-Matt

The code issue for me seems to be here:

// Is the node an FMUL and contractable either due to global flags or
// SDNodeFlags.
auto isContractableFMUL = [AllowFusionGlobally](SDValue N) {
if (N.getOpcode() != ISD::FMUL)
return false;
return AllowFusionGlobally || isContractable(N.getNode());
};

The problem is that I’d like to have fast math set but also no contract nodes that don’t specifically have contract/reassoc.

As far as being present on both, I’m not sure why that would be the case. If one instruction doesn’t have contract or reassoc, then it should be allowed to be contracted or reassociated despite any other instruction flags.

That’s my interpretation of what it should mean, otherwise, you’d have to pair each possible combination.

For this code:

%20 = fmul reassoc nnan arcp contract float %15, %19
%21 = fadd nnan arcp float %20, -1.000000e+00

This does not result in fused multiply-add.

it seems like the logic to contact the fmul from the fadd is different than whether to decide to contract the fadd. I would think the logic would be the same for each instruction in the pair.

Why is isContractableFMUL different from isContractable?

If contractable is only expected on the fmul, then what does it even mean to have it set on any other kind of instruction?

So why is there ‘isContractable’ too?

And why are they treated differently?

So I have a test case where:

%20 = fmul nnan arcp float %15, %19
%21 = fadd reassoc nnan arcp contract float %20, -1.000000e+00

is being contracted in DAG to fmad. Is this correct since the fmul has no reassoc or contract fast math flag?

By having the reassoc and contract flags on fadd, the frontend is essentially saying "different rounding on the value produced by the fadd is okay".

So I'd say contracting this to fma is correct.

Where does this code come from, and why do you think contracting to fma is wrong?

Cheers,
Nicolai

The example starts as SPIR-V with the NoContraction decoration flag on the fmul.

I think what you are saying seems valid in that if the user had put the flag on the fadd instead of the fmul it would not contract and so in this example the user needs to put the NoContraction on the fadd though I’m not sure that’s a good expectation of the user. On the surface, I think that if an operation didn’t have the contract flag than it wouldn’t be contracted, regardless of what flags any other operation has.

The example starts as SPIR-V with the NoContraction decoration flag on the fmul.

I think what you are saying seems valid in that if the user had put the flag on the fadd instead of the fmul it would not contract and so in this example the user needs to put the NoContraction on the fadd though I'm not sure that's a good expectation of the user. On the surface, I think that if an operation didn't have the contract flag than it wouldn't be contracted, regardless of what flags any other operation has.

Okay, I see that the SPIR-V spec specifically calls out this example.

Unless there are conflicting requirements with another frontend, I'd say we should make sure LLVM is aligned with SPIR-V here. Something along the lines of (in LangRef):

``contract``
    Allow floating-point contraction (e.g. fusing a multiply followed by
    an addition into a fused multiply-and-add). This flag must be present
    on all affected instruction.

And we should probably say the same about ``reassoc`` as well.

Cheers,
Nicolai

This is probably going to effect on other backends and break llvm-lit for them?

This is probably going to effect on other backends and break llvm-lit for them?

Very likely, yes. Can you take a look at how big the fallout is? This might give us a hint about what other frontends might expect, and who needs to be involved in the discussion (if one is needed).

Cheers,
Nicolai

Just a quick change of isContractableFMUL to isContractable gives about 49 unexpected failures.

Just a quick change of isContractableFMUL to isContractable gives about 49 unexpected failures.

That may not be the right change to be looking at. Consider:

 LLVM :: CodeGen/X86/fma\_patterns\.ll

This file doesn't use any `contract` or `reassoc` flags on IR instructions.

I don't know what's changed there, but the point is that the test does not depend on the exact semantics of `contract` and `reassoc`.

Cheers,
Nicolai

Nicolai,

I don’t think that’s the correct change either but I wanted to get a quick summary of potential failures. You’re right it’s not entirely dependent on ‘contract’ or ‘reassoc’ when the global flag is set, ie -fp-contract=fast, as in your X86 example, this is essentially setting ‘contract’ on every instruction.

I think it would be add a ‘nocontract’ fast math flag. The issue is that when -fp-contract=fast is set and there isn’t a ‘contract’ or ‘reassoc’ on an instruction, how do you not contract it and still contract other instructions that also don’t have the ‘contract’ or ‘reassoc’ set when -fp-contract is present.

-Ryan