Possible PowerPC (32-bit) backend bug

I have been experimenting with the patterns that define complex
instructions, and I saw behavior that doesn’t look right.
I think it’s a bug in the PPC backend.

The 32-bit PPC .td file defines a pattern for the fnmsubs instruction like this:

def : Pat<(fsub F4RC:$B, (fmul F4RC:$A, F4RC:$C)),
          (FNMSUBS F4RC:$A, F4RC:$C, F4RC:$B)>,
          Requires<[FPContractions]>;

The unique feature of this pattern is that it maps a pair of
LLVM IR instructions into a single PPC instruction.
For reference, the instruction itself is defined like this:

def FNMSUBS : AForm_1<59, 30, (outs F4RC:$FRT),
                      (ins F4RC:$FRA, F4RC:$FRC, F4RC:$FRB),
                      "fnmsubs $FRT, $FRA, $FRC, $FRB", FPGeneral,
                      [(set F4RC:$FRT, (fneg (fsub (fmul F4RC:$FRA, F4RC:$FRC),
                                                   F4RC:$FRB)))]>,
                      Requires<[FPContractions]>;
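
As a side note, the two definitions above describe the same arithmetic: the instruction computes -((FRA * FRC) - FRB), and the extra pattern matches B - (A * C). A minimal C sketch of that equivalence follows. The function names are hypothetical, and this models only the arithmetic; it ignores that the real instruction rounds once rather than twice, and that the two forms can differ in the sign of a zero result.

```c
/* Hypothetical arithmetic model only; this is not LLVM or PPC code. */

/* Shape of the FNMSUBS definition: fneg (fsub (fmul FRA, FRC), FRB). */
float fnmsubs_model(float fra, float frc, float frb) {
    return -((fra * frc) - frb);
}

/* Shape matched by the extra pattern: fsub B, (fmul A, C). */
float fsub_of_fmul(float a, float c, float b) {
    return b - (a * c);
}
```

With A = 2, C = 3 and B = 10, both functions return 4, which is why the operand order in the pattern is (A, C, B).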

Now I wrote a little toy program that, when compiled, uses this instruction.
Here is the program:

#include <stdio.h>
int main()
{
    float a, b, c;
    b = b * c;
    a = a - b;
    return 0;
}

And here is the assembly:

        .text
        .global main
        .type main, @function
        .align 2
main:
        lfs 0, -8(1)
        lfs 1, -12(1)
        li 3, 0
        fmuls 2, 1, 0
        stfs 2, -12(1)
        lfs 2, -16(1)
        fnmsubs 0, 1, 0, 2
        stfs 0, -16(1)
        stw 3, -20(1)
        stw 3, -4(1)
BB1_1: # return
        lwz 3, -4(1)
        blr
        .size main,.-main

At a glance, it looks right. Line 12 is indeed the “fnmsubs” instruction, so
the pattern did work. But look at line 9. Here we see that the “fmuls”
was also emitted! In effect, this means that the fmul happens TWICE.

That can’t be right, can it? Unfortunately, I don’t have a PPC emulator,
so I can’t run the code and see if it actually works or not.
But it does not look right to me.

It is a problem, right? Is there any solution? I ask because I would also
like to use multi-instruction IR patterns in the backend I am working on.

Thank you for your assistance,
Kao Chang

> At a glance, it looks right. Line 12 is, indeed the "fnmsubs" command, so
> the pattern did work. But look at Line 9. Here we see that the "fmuls"
> also happened! In effect, this means that the fmul happens TWICE.
>
> That can't be right can it? Unfortunately, I don't have a PPC emulator,
> so I can't run the code and see if it actually works or not.
> But it does not look right to me.

I suppose it isn't ideal, but it isn't really bad: the results of both
the fmuls and the fnmsubs instructions are used.

> It is a problem, right? Is there any solution? Because I would like to
> also use multiple-IR patterns, for the backend I am working on.

You can explicitly check that there's only a single use of a node if
you want; for example, see and_su in X86InstrInfo.td.

-Eli
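
To illustrate the single-use check suggested above, here is a minimal C sketch of the idea: fold the multiply into the fused instruction only when nothing else consumes the product. The node layout, opcode names, and function name are all hypothetical stand-ins, not LLVM types, and this is not the definition of and_su.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for selection DAG nodes; not LLVM types. */
enum opcode { OP_LEAF, OP_FMUL, OP_FSUB };

struct node {
    enum opcode op;
    const struct node *lhs;
    const struct node *rhs;
    int num_uses;   /* how many other nodes consume this node's value */
};

/* Match (fsub B, (fmul A, C)) only when the fmul feeds nothing else. */
bool can_fold_to_fnmsubs(const struct node *n) {
    if (n == NULL || n->op != OP_FSUB || n->rhs == NULL)
        return false;
    return n->rhs->op == OP_FMUL && n->rhs->num_uses == 1;
}
```

In the toy program the product is both stored back to b and subtracted from a, so num_uses would be 2 and the fold would be rejected, leaving a plain fmuls followed by a plain subtract.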

The code is correct. The multiply is evaluated twice because you're looking at -O0 code. Try an equivalent example with -O2 or something.

-Chris
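
One possible shape for such an equivalent example is sketched below. It assumes that the original toy program, which reads uninitialized locals and discards its results, would likely be deleted entirely at -O2, so the values are passed in and returned instead. The function names are hypothetical.

```c
/* The product feeds only the subtraction: one use of the multiply. */
float sub_of_product(float a, float b, float c) {
    return a - b * c;
}

/* The product is also written back through *b, as in the toy program:
   two uses of the multiply. */
float sub_and_keep_product(float a, float *b, float c) {
    *b = *b * c;
    return a - *b;
}
```

Compiling both functions at -O2 and comparing the assembly should show whether a separate fmuls is emitted only in the second case, where the product itself is needed.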