X86 rcp instruction generation

Hi,

We have implemented rcp instruction generation for the X86 target architecture. We have introduced a flag, -fp-rcp, which controls the generation of the X86 rcp instruction.

We have observed minor effects on precision and have therefore put these transformations under this flag.

Note that -fp-rcp is presently enabled only together with the -enable-unsafe-fp-math flag.

Moreover, we have achieved some derived optimizations along with rsqrt generation.

The following are the details of the -fp-rcp flag, its values, and the optimizations each value enables.

-fp-rcp

=off - No rcp

=on - y/x => y * rcp(x) // Standard

=fda - Standard, plus a derived FMA, i.e. y/x + z => y * rcp(x) + z => vfmaddss y rcp(x) z.

This is termed FDA (Fused Division Accumulate).
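
The transformations listed above can be sketched in C to show where the precision effect comes from. This is only an illustration, not code from the patch: the function names approx_rcp, div_via_rcp, fda and rel_err are hypothetical, and the sketch assumes the scalar SSE intrinsic _mm_rcp_ss (RCPSS), which Intel documents with a relative error bound of 1.5 * 2^-12. On targets without SSE it falls back to an exact reciprocal so that it still compiles.

```c
#if defined(__SSE__)
#include <xmmintrin.h>
/* Approximate 1/x with the scalar SSE RCPSS instruction (roughly 12 bits). */
float approx_rcp(float x) {
    return _mm_cvtss_f32(_mm_rcp_ss(_mm_set_ss(x)));
}
#else
/* Assumption: no SSE available, so use an exact reciprocal as a stand-in. */
float approx_rcp(float x) {
    return 1.0f / x;
}
#endif

/* Shape of the "=on" rewrite: y / x  =>  y * rcp(x). */
float div_via_rcp(float y, float x) {
    return y * approx_rcp(x);
}

/* Shape of the "=fda" rewrite: y / x + z  =>  y * rcp(x) + z.
   A backend with FMA support may fuse the multiply and add into one
   instruction; this C source does not force that. */
float fda(float y, float x, float z) {
    return y * approx_rcp(x) + z;
}

/* Relative error of an approximation against a reference value. */
float rel_err(float approx, float exact) {
    float d = approx - exact;
    if (d < 0.0f) d = -d;
    if (exact < 0.0f) exact = -exact;
    return d / exact;
}
```

Comparing rel_err(div_via_rcp(y, x), y / x) for a few inputs gives a feel for how far the reciprocal-based result can drift from a true division, which is the reason the transformation sits behind a flag.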

Attached to this mail are the code patch (against LLVM svn revision 167927), a text description, and test cases. Please review.

Plans for future enhancements are as follows.

TODO:

  1. Enable vector rsqrt generation.

  2. Generate other variations of FDA, i.e. FMSUB, FNMSUB, and FNMADD instructions, as required.

Best Regards,

soham

"The search for truth is more precious than its possession."

rcp_167927.patch (10.1 KB)

rcp-description.txt (2.71 KB)

rcp-fda.ll (4.68 KB)

rcp-on.ll (4.15 KB)