food for optimizer developers

I wrote a Fortran to C++ conversion program that I used to convert selected
LAPACK sources. Comparing runtimes with different compilers I get:

                              absolute  relative
ifort 11.1.072                1.790s    1.00
gfortran 4.4.4                2.470s    1.38
g++ 4.4.4                     2.922s    1.63
clang++ 2.8 (trunk 108205)    6.487s    3.62

This is under Fedora 13, 64-bit, on a 12-core 2.2 GHz Opteron.

All files to easily reproduce the results are here:

See the README file or the example commands below.


- Why is the code generated by clang++ so much slower than the g++ code?

- Is there anything I could do in the C++ code generation or in the "fem"
  Fortran EMulation library to help runtime performance?
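
To make the second question more concrete, here is a minimal sketch of the kind of 1-based, column-major array wrapper a Fortran emulation layer might use. The names arr_ref2 and sum_column are hypothetical and are not taken from the actual "fem" library; the point is only to show the sort of abstraction the optimizer has to see through (inlining operator() and hoisting the column offset out of the loop) before the C++ code can match the Fortran original.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical 2-D array reference with Fortran semantics:
// 1-based indices, column-major storage. Not the actual fem API.
template <typename T>
struct arr_ref2 {
  T* data;             // non-owning pointer to the first element
  std::size_t n_rows;  // leading dimension (number of rows)

  arr_ref2(T* data_, std::size_t n_rows_) : data(data_), n_rows(n_rows_) {}

  // Fortran a(i, j) maps to data[(i - 1) + (j - 1) * n_rows].
  T& operator()(std::size_t i, std::size_t j) const {
    return data[(i - 1) + (j - 1) * n_rows];
  }
};

// Sum the first n elements of column j, written the way a
// converted DO loop might look (loop index runs from 1 to n).
inline double sum_column(arr_ref2<double> a, std::size_t n, std::size_t j) {
  double s = 0;
  for (std::size_t i = 1; i <= n; i++) {
    s += a(i, j);
  }
  return s;
}
```

With a 3x2 matrix stored column by column in a std::vector<double> v, arr_ref2<double> a(v.data(), 3) gives a(3, 2) as the last stored element. Whether a wrapper like this costs anything at runtime depends entirely on how well a given compiler inlines and simplifies the index arithmetic.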


tar zxf lapack_fem_001.tgz
cd lapack_fem_001
clang++ -o dsyev_test_clang++ -I. -O3 -ffast-math dsyev_test.cpp
time ./dsyev_test_clang++

Have you tried profiling the resulting program?


Yes, but only the ifort and g++ builds, just enough to convince myself that
there isn't something silly going on due to the conversion to C++.
50% of the time is spent in two lines of code.
I haven't profiled the clang++ build, mainly because I don't think I could do
much about the 2x speed difference compared to g++ anyway.