More on atlas and clang

Hi there,

I have recently undertaken another experimental build of Atlas (http://math-atlas.sourceforge.net; briefly, Atlas provides a highly complete BLAS/LAPACK implementation optimized for the native architecture of the computer it runs on) on an AVX machine (a 2011 Mac Mini), using a snapshot of clang 3.3 (r173279) provided by MacPorts (http://macports.org) with the -O3, -fPIC, -fvectorize and -fslp-vectorize flags.

I am pleased to say that:

1. The generated AVX code seems fine: a full test session run under an Atlas-based SciPy didn’t raise any errors (a sketch of that test session follows this list);
2. The performance now seems on par with, or even (sometimes surprisingly) better than, the ‘reference GCC’, whatever that exactly means (I was unable to get in touch with the Atlas developer at the time), as the benchmark table further down shows.
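For reference, here is roughly what the test session from point 1 looked like. This is only a minimal sketch: it assumes NumPy and SciPy were built against the clang-compiled Atlas, and the exact invocation may differ with your SciPy version.

    # Rough sketch of the test session from point 1.
    # Assumes NumPy/SciPy are linked against the clang-built Atlas.
    import numpy
    import scipy

    numpy.show_config()   # confirm BLAS/LAPACK resolve to the new Atlas libraries
    scipy.test('full')    # full test suite; it completed without errors here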

Reference clock rate=3292MHz, new rate=2300MHz
  Refrenc : % of clock rate achieved by reference install
  Present : % of clock rate achieved by present ATLAS install

                  single precision                  double precision
          ********************************  ********************************
               real            complex           real            complex
          ---------------  ---------------  ---------------  ---------------
Benchmark Refrenc Present  Refrenc Present  Refrenc Present  Refrenc Present
========= ======= =======  ======= =======  ======= =======  ======= =======
   kSelMM  1289.9  1407.4   1188.7  1229.8    686.7   826.8    647.4   682.1
   kGenMM   198.2   239.7    198.5   237.8    193.9   231.8    196.0   233.8
   kMM_NT   193.7   266.4    195.2   192.9    184.2   187.4    188.5   197.5
   kMM_TN   198.5   211.1    197.9   226.2    189.8   227.6    189.5   223.2
   BIG_MM  1213.8  1346.7   1241.3  1366.5    652.0   789.5    661.4   795.8
    kMV_N   224.3   308.1    438.8   617.3    115.9   152.1    205.8   283.5
    kMV_T   224.6   313.5    460.3   642.9    123.2   159.6    211.3   288.2
     kGER   148.3   192.4    290.2   381.2     73.3    95.6    144.3   184.3
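In case anyone wants the numbers above in absolute terms: as far as I understand it, ATLAS’s “percentage of clock rate” is simply MFLOPS divided by the clock in MHz, times 100, so converting back to GFLOP/s is a one-liner (small sketch below; that reading of the metric is my own assumption).

    # Convert ATLAS's "% of clock rate achieved" back to GFLOP/s.
    # Assumption: percent = MFLOPS / clock_MHz * 100 (my reading of the metric).
    def gflops(percent_of_clock, clock_mhz):
        return percent_of_clock / 100.0 * clock_mhz / 1000.0

    print(gflops(1346.7, 2300))   # BIG_MM, single real, present install: ~31.0 GFLOP/s
    print(gflops(1213.8, 3292))   # same benchmark, reference install: ~40.0 GFLOP/s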

This is in stark contrast with the previous test, where clang was lagging about 20% behind the GCC-based ‘reference implementation’ on rows 2, 3 and 4 of the table (kGenMM, kMM_NT and kMM_TN), where compiler performance matters most.

So, to summarize in two words: kudos, folks!

I will build another version on a Core2Duo machine tonight and see if the results are consistent.

Cheers!
Vincent