pb05 benchmarks for llvm/dragonegg 3.2

Duncan,
   With the commit from http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20121203/158488.html,
the Polyhedron 2005 benchmarks complete again on x86_64-apple-darwin12. The results are similar to those
seen with FSF gcc 4.6.2svn and llvm/dragonegg 3.0 (which was the last release that passed pb05):
http://lists.cs.uiuc.edu/pipermail/llvmdev/2011-October/044091.html
         Jack
P.S. Has an exhaustive effort been made yet to ensure that llvm/dragonegg isn't still unnecessarily scalarizing
the vector code generated by FSF gcc? If that issue were completely solved, llvm/dragonegg might become faster
than vanilla FSF gcc.

FSF gcc 4.7.2 with llvm/dragonegg 3.2 branch

a) de-gfortran47 -msse3 -ffast-math -funroll-loops -O3 %n.f90 -o %n
b) de-gfortran47 -msse3 -ffast-math -funroll-loops -O3 -fplugin-arg-dragonegg-enable-gcc-optzns %n.f90 -o %n
c) gfortran-fsf-4.7 -msse3 -ffast-math -funroll-loops -O3 %n.f90 -o %n

Run time (secs)

Benchmark    de-gfortran47  de-gfortran47+optzns  gfortran47
ac                   12.28                  8.02        8.17
aermod               15.88                 14.54       16.49
air                   7.02                  5.42        5.80
capacita             39.97                 34.93       32.53
channel               2.08                  2.10        1.83
doduc                27.17                 27.59       25.71
fatigue               8.75                  7.80        8.31
gas_dyn              12.11                  4.64        3.98
induct               24.03                 11.86       12.11
linpk                15.49                 15.47       15.46
mdbx                 11.90                 11.31       11.18
nf                   29.34                 29.67       28.01
protein              36.31                 35.33       31.98
rnflow               27.27                 26.74       24.67
test_fpu             11.31                  9.13        7.91
tfft                  1.93                  1.94        1.86

Geom. Mean           13.27                 11.02       10.64
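
For anyone who wants to recheck the "Geom. Mean" row, here is a minimal Python sketch that recomputes it from the run times listed above. The names RUN_TIMES, COLUMNS, GEO_MEANS and geometric_mean are made up for this illustration and are not part of the pb05 harness; the sketch assumes the summary row is a plain unweighted geometric mean over all 16 benchmarks.

```python
import math

# Run times in seconds, copied from the table above.
# Column order: de-gfortran47, de-gfortran47+optzns, gfortran47.
RUN_TIMES = {
    "ac":       (12.28,  8.02,  8.17),
    "aermod":   (15.88, 14.54, 16.49),
    "air":      ( 7.02,  5.42,  5.80),
    "capacita": (39.97, 34.93, 32.53),
    "channel":  ( 2.08,  2.10,  1.83),
    "doduc":    (27.17, 27.59, 25.71),
    "fatigue":  ( 8.75,  7.80,  8.31),
    "gas_dyn":  (12.11,  4.64,  3.98),
    "induct":   (24.03, 11.86, 12.11),
    "linpk":    (15.49, 15.47, 15.46),
    "mdbx":     (11.90, 11.31, 11.18),
    "nf":       (29.34, 29.67, 28.01),
    "protein":  (36.31, 35.33, 31.98),
    "rnflow":   (27.27, 26.74, 24.67),
    "test_fpu": (11.31,  9.13,  7.91),
    "tfft":     ( 1.93,  1.94,  1.86),
}

COLUMNS = ("de-gfortran47", "de-gfortran47+optzns", "gfortran47")


def geometric_mean(values):
    """Geometric mean of positive numbers, computed as exp(mean(log(x)))."""
    values = list(values)
    return math.exp(sum(math.log(v) for v in values) / len(values))


# One geometric mean per compiler configuration (assumed unweighted).
GEO_MEANS = {
    name: geometric_mean(row[i] for row in RUN_TIMES.values())
    for i, name in enumerate(COLUMNS)
}

for name, gm in GEO_MEANS.items():
    print(f"{name:22s} {gm:6.2f}")
```

Taking the summary row at face value, plain llvm/dragonegg is roughly 25% slower than gfortran 4.7 on this suite (13.27 / 10.64), and about 4% slower with the GCC optimizers enabled (11.02 / 10.64).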

Compile time (secs)

Benchmark    de-gfortran47  de-gfortran47+optzns  gfortran47
ac                    0.33                  1.63        1.72
aermod               21.20                 29.47       42.25
air                   1.13                  2.66        4.38
capacita              0.51                  1.00        1.85
channel               0.32                  0.52        0.64
doduc                 1.79                  3.74        5.84
fatigue               0.91                  1.29        1.93
gas_dyn               0.65                  1.32        3.34
induct                1.73                  2.81        3.93
linpk                 0.22                  0.51        0.91
mdbx                  0.64                  1.28        2.09
nf                    0.39                  0.79        2.07
protein               1.11                  1.95        4.30
rnflow                1.25                  2.87        6.32
test_fpu              0.87                  2.25        5.14
tfft                  0.21                  0.35        0.58

Executable (bytes)

Benchmark    de-gfortran47  de-gfortran47+optzns  gfortran47
ac                   26768                 47144       59104
aermod             1039416               1065048     1396928
air                  61924                 65948      110752
capacita             41328                 45424       77904
channel              22720                 26680       34688
doduc               128360                140564      205304
fatigue              69736                 69800       90224
gas_dyn              58936                 67232      123664
induct              163072                167296      179064
linpk                18664                 26976       42624
mdbx                 53580                 57684       90216
nf                   23864                 36176       84056
protein              74944                 87128      131960
rnflow               71784                 92344      205576
test_fpu             54088                 74520      179448
tfft                 18552                 18400       30664

Hi Jack, thanks for these numbers.

> With the commit from http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20121203/158488.html,
> the Polyhedron 2005 benchmarks complete again on x86_64-apple-darwin12. The results are similar to those
> seen with FSF gcc 4.6.2svn and llvm/dragonegg 3.0 (which was the last release that passed pb05):
> http://lists.cs.uiuc.edu/pipermail/llvmdev/2011-October/044091.html
>           Jack
> P.S. Has an exhaustive effort been made yet to ensure that llvm/dragonegg isn't still unnecessarily scalarizing
> the vector code generated by FSF gcc?

As far as I know, no effort has been made at all.

> If that issue were completely solved, llvm/dragonegg might become faster
> than vanilla FSF gcc.

Another issue is that, until recently, LLVM didn't have much in the way of
fast-math optimizations. It should be better in 3.3.

Ciao, Duncan.

Duncan,
    Could you propose a testing patch that would emit warnings on each instance
of scalarization of vectors for use with llvm/dragonegg trunk? I would be happy
to file PRs for any of those instances that show up when compiling the pb05
testsuite with it.
         Jack

Duncan,
  I tried adding...

Index: lib/Transforms/Vectorize/LoopVectorize.cpp