Error running SPEC benchmark with FMA4 on x86

Hi All,

I am seeing a miscompare error when running povray (and a few other C/C++ benchmarks) from the SPEC CPU2006 suite with FMA4 enabled (and FMA3 disabled). I used -ffp-contract=fast to turn on FMA generation. (Compiler version, target and options are pasted below.)

clang version 3.2 (trunk 163295:163308) (llvm/trunk 163295)
Target: x86_64-unknown-linux-gnu
Thread model: posix

(Options to clang)

-O3 -march=bdver2 -mavx -mno-fma -mfma4 -ffp-contract=fast -save-temps

Note that BDVER2 supports both FMA3 and FMA4. Also, the benchmark ran successfully when FMA3 was enabled. Reducing the test case might take more time, but has anyone noticed this issue?

For those interested, the miscompare message is below:

*** Miscompare of SPEC-benchmark.log; for details see
CPU2006/peak_ref_llvm.0037/SPEC-benchmark.log.mis

0050: Pixels: 1280 Samples: 8960
Pixels: 1280 Samples: 47360
^
0051: Rays: 8960 Saved: 0 Max Level: 1/5
Rays: 47360 Saved: 0 Max Level: 1/5
^
0055: CSG Intersection 232960 0
Box 47360 0
^
0056: Plane 537600 510304
Cone/Cylinder 1515520 0
^

What is more interesting is that DragonEgg built against the same LLVM revision works correctly with FMA4. (Note that for the DragonEgg case, DAGCombiner.cpp was tweaked to do the FMA check for AllowFpFusion() == Standard instead of Fast; only then is FMA4 generated. This is just for experimental purposes.)