RFC: Add ISD nodes for mad

Hi,

I would like to add an ISD node for an fmad operation (named either FMAD or FMULADD). It will have the semantics of returning the same result as the separate multiply and add with the intermediate rounding step, and not a differently rounded result. The motivation is to share code with the various FMA forming DAG combines, but will be generally more useful on targets that support it since it is OK to form without non-default -fp-contract modes.

This would also be associated with a minor change to the description of the llvm.fmuladd intrinsic. The fmuladd intrinsic currently has what I think are poorly worded semantics for how it should behave:

"...is equivalent to the expression a * b + c, except that rounding will not be performed between the multiplication and addition steps if the code generator fuses the operations. Fusion is not guaranteed, even if the target platform supports it."

This is odd because it doesn't guarantee fusion, but if fusion does happen, the intermediate rounding is eliminated. This is an odd constraint to put on the backend that in my reading doesn't allow ever forming a mad with the same intermediate rounding. This is not helpful since the fmuladd intrinsic expands to the fmul and fadd in SelectionDAGBuilder unless fp-contraction is globally enabled. According to this constraint, you could never form a mad because if it was expanded from the fmuladd intrinsic, fusion is not allowed. I don't think this arbitrary constraint on the backend makes sense, so the semantics of fmuladd would be changed to mean that the intermediate rounding may or may not happen, but not constrain the backend on only fusing if the intermediate rounding is eliminated.

-Matt

Hi Matt,

Maybe I’m missing something, but I’m unclear on how adding this ISD node will be a net improvement. It seems like it will open a big can of worms with respect to canonical representations, and that a huge amount of both target-indepednent and target-specific code will have to be updated for it.

—Owen

Hi Matt,

Maybe I’m missing something, but I’m unclear on how adding this ISD node will be a net improvement. It seems like it will open a big can of worms with respect to canonical representations, and that a huge amount of both target-indepednent and target-specific code will have to be updated for it.

—Owen

I don't think the impact will be very big. I was thinking it would just default to Expand by default, and only form it in DAGCombiner post legalization.

The motivation here is to eliminate the part where I copy a sizable amount of code out of the generic DAG combiner in the 4th patch here:
http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20150126/255129.html

From: "Matt Arsenault" <Matthew.Arsenault@amd.com>
To: "llvmdev@cs.uiuc.edu" <LLVMdev@cs.uiuc.edu>
Sent: Wednesday, January 28, 2015 1:52:59 PM
Subject: [LLVMdev] RFC: Add ISD nodes for mad

Hi,

I would like to add an ISD node for an fmad operation (named either
FMAD
or FMULADD). It will have the semantics of returning the same result
as
the separate multiply and add with the intermediate rounding step,
and
not a differently rounded result. The motivation is to share code
with
the various FMA forming DAG combines, but will be generally more
useful
on targets that support it since it is OK to form without non-default
-fp-contract modes.

Why not just reuse the existing FMA node, but add an additional parameter to indicate whether enhanced intermediate precision is required?

This would also be associated with a minor change to the description
of
the llvm.fmuladd intrinsic. The fmuladd intrinsic currently has what
I
think are poorly worded semantics for how it should behave:

"...is equivalent to the expression a * b + c, except that rounding
will
not be performed between the multiplication and addition steps if the
code generator fuses the operations. Fusion is not guaranteed, even
if
the target platform supports it."

This is odd because it doesn't guarantee fusion, but if fusion does
happen, the intermediate rounding is eliminated. This is an odd
constraint to put on the backend that in my reading doesn't allow
ever
forming a mad with the same intermediate rounding. This is not
helpful
since the fmuladd intrinsic expands to the fmul and fadd in
SelectionDAGBuilder unless fp-contraction is globally enabled.
According
to this constraint, you could never form a mad because if it was
expanded from the fmuladd intrinsic, fusion is not allowed. I don't
think this arbitrary constraint on the backend makes sense, so the
semantics of fmuladd would be changed to mean that the intermediate
rounding may or may not happen, but not constrain the backend on only
fusing if the intermediate rounding is eliminated.

The fusion is an observable effect, using a "combination instruction" that does not change the rounding behavior is always allowed under the normal "as if" rule. We don't ever constrain the code generators to produce specific instructions at the IR level (even when target-specific intrinsics are used, which sometimes surprises people).

-Hal

Because that would break a lot of existing backends?

—Owen

Can’t you have “optional” parameters? I mean the number of parameter you can add to a node is never checked or defined, it is implicit from the ISD OpCode AFAIK.
If you add a parameter to FMA, wouldn’t it just be ignored by the existing backends?

You can have variadic nodes, but they are difficult to work with. It wouldn't be worth it for this.

From: "Owen Anderson" <resistor@mac.com>
To: "Hal Finkel" <hfinkel@anl.gov>
Cc: "Matt Arsenault" <Matthew.Arsenault@amd.com>, "llvmdev@cs.uiuc.edu" <LLVMdev@cs.uiuc.edu>
Sent: Wednesday, January 28, 2015 6:54:23 PM
Subject: Re: [LLVMdev] RFC: Add ISD nodes for mad

From: "Matt Arsenault" < Matthew.Arsenault@amd.com >
To: " llvmdev@cs.uiuc.edu " < LLVMdev@cs.uiuc.edu >
Sent: Wednesday, January 28, 2015 1:52:59 PM
Subject: [LLVMdev] RFC: Add ISD nodes for mad

Hi,

I would like to add an ISD node for an fmad operation (named either
FMAD
or FMULADD). It will have the semantics of returning the same result
as
the separate multiply and add with the intermediate rounding step,
and
not a differently rounded result. The motivation is to share code
with
the various FMA forming DAG combines, but will be generally more
useful
on targets that support it since it is OK to form without non-default
-fp-contract modes.

Why not just reuse the existing FMA node, but add an additional
parameter to indicate whether enhanced intermediate precision is
required?

Because that would break a lot of existing backends?

Fair enough.

Generally speaking, I think that having some mechanism to share the DAGCombines between the two kinds of FMAs is a good idea (I took a quick look through the existing DAGCombine code, and it seems worth sharing and most of them (except for the combines including fpext nodes) won't be affected by the precision changes). I'd need to see a patch to be sure, however.

-Hal

So do you think I should attempt to write a patch that adds this node, or find some other method for sharing these? Would something analogous to TLI.getRecipEstimate() be preferrable?

-Matt

From: "Matt Arsenault" <Matthew.Arsenault@amd.com>
To: "Hal Finkel" <hfinkel@anl.gov>, "Owen Anderson" <resistor@mac.com>
Cc: "llvmdev@cs.uiuc.edu" <LLVMdev@cs.uiuc.edu>
Sent: Wednesday, February 4, 2015 5:03:44 PM
Subject: Re: [LLVMdev] RFC: Add ISD nodes for mad

>> From: "Owen Anderson" <resistor@mac.com>
>> To: "Hal Finkel" <hfinkel@anl.gov>
>> Cc: "Matt Arsenault" <Matthew.Arsenault@amd.com>,
>> "llvmdev@cs.uiuc.edu" <LLVMdev@cs.uiuc.edu>
>> Sent: Wednesday, January 28, 2015 6:54:23 PM
>> Subject: Re: [LLVMdev] RFC: Add ISD nodes for mad
>>
>>
>>
>>
>> From: "Matt Arsenault" < Matthew.Arsenault@amd.com >
>> To: " llvmdev@cs.uiuc.edu " < LLVMdev@cs.uiuc.edu >
>> Sent: Wednesday, January 28, 2015 1:52:59 PM
>> Subject: [LLVMdev] RFC: Add ISD nodes for mad
>>
>> Hi,
>>
>> I would like to add an ISD node for an fmad operation (named
>> either
>> FMAD
>> or FMULADD). It will have the semantics of returning the same
>> result
>> as
>> the separate multiply and add with the intermediate rounding step,
>> and
>> not a differently rounded result. The motivation is to share code
>> with
>> the various FMA forming DAG combines, but will be generally more
>> useful
>> on targets that support it since it is OK to form without
>> non-default
>> -fp-contract modes.
>>
>> Why not just reuse the existing FMA node, but add an additional
>> parameter to indicate whether enhanced intermediate precision is
>> required?
>>
>> Because that would break a lot of existing backends?
> Fair enough.
>
> Generally speaking, I think that having some mechanism to share the
> DAGCombines between the two kinds of FMAs is a good idea (I took a
> quick look through the existing DAGCombine code, and it seems
> worth sharing and most of them (except for the combines including
> fpext nodes) won't be affected by the precision changes). I'd need
> to see a patch to be sure, however.
>
> -Hal

So do you think I should attempt to write a patch that adds this
node,
or find some other method for sharing these? Would something
analogous
to TLI.getRecipEstimate() be preferrable?

Try a patch adding the new ISD node; that's probably the more straightforward way.

-Hal