Measurements of the new inlinehint attribute

On Friday I enabled the inlinehint function attribute in the inliner. It mostly affects the performance of code compiled with -Os. I ran some measurements on the SPEC test suite to show the effect.

I made three runs of the nightly tests. The baseline represents -Os with no inlinehint:

make TEST=nightly OPTFLAGS=-Os EXTRA_LOPT_OPTIONS=-inlinehint-threshold=0 EXTRA_LINKTIME_OPT_FLAGS=-inlinehint-threshold=0 report

For the second run I enabled the inline hint, keeping the -Os flag:

make TEST=nightly OPTFLAGS=-Os report

The third run is -O3:

make TEST=nightly OPTFLAGS=-O3 EXTRA_LOPT_OPTIONS=-inline-threshold=275 EXTRA_LINKTIME_OPT_FLAGS=-inline-threshold=275 report

Note that when the test suite is run like this, the inliner actually runs twice: once in opt and once in llvm-ld.

The table shows the change in bytecode size and runtime speed (higher = faster) of the second (-Os) and third (-O3) runs relative to the baseline. For each run, the first number is size and the second is speed:

Test (r96241)                                 -Os size -Os speed  -O3 size -O3 speed
HMMER/hmmcalibrate                               0.10%     0.00%    18.18%     0.00%
Nurbs/nurbs                                      0.00%     0.00%    49.01%    -0.81%
Povray/povray                                    0.01%     6.02%    39.36%    -2.53%
SPEC/CFP2000/177.mesa/177.mesa                   0.00%     0.00%    14.90%    -1.12%
SPEC/CFP2000/179.art/179.art                     0.00%     0.00%    19.51%     1.22%
SPEC/CFP2000/183.equake/183.equake               0.00%    -1.85%     3.54%     0.00%
SPEC/CFP2000/188.ammp/188.ammp                   0.28%    -0.18%    48.68%     3.10%
SPEC/CFP2006/433.milc/433.milc                   0.00%    -0.14%    20.31%     2.68%
SPEC/CFP2006/444.namd/444.namd                   0.04%     0.44%     3.28%     1.40%
SPEC/CFP2006/447.dealII/447.dealII              10.61%    13.06%    35.52%    15.01%
SPEC/CFP2006/450.soplex/450.soplex               0.30%     0.00%    22.47%     0.00%
SPEC/CFP2006/470.lbm/470.lbm                     0.00%     0.00%     4.91%     0.30%
SPEC/CINT2000/164.gzip/164.gzip                  0.00%     0.17%    32.44%    -4.93%
SPEC/CINT2000/175.vpr/175.vpr                    0.00%     1.01%    18.17%     3.34%
SPEC/CINT2000/176.gcc/176.gcc                    0.31%     0.98%    32.86%     3.00%
SPEC/CINT2000/181.mcf/181.mcf                    0.00%    -0.61%    11.02%    -0.15%
SPEC/CINT2000/186.crafty/186.crafty              0.00%     0.00%    23.97%     3.14%
SPEC/CINT2000/197.parser/197.parser              1.18%     1.32%    47.48%     6.23%
SPEC/CINT2000/252.eon/252.eon                    2.39%     3.45%    15.34%    11.11%
SPEC/CINT2000/253.perlbmk/253.perlbmk            0.13%    -0.41%    33.45%     1.67%
SPEC/CINT2000/254.gap/254.gap                    0.24%    -0.98%    13.90%     1.50%
SPEC/CINT2000/255.vortex/255.vortex              0.00%     0.00%    94.96%    -6.59%
SPEC/CINT2000/256.bzip2/256.bzip2                0.00%    -0.09%    37.42%     1.84%
SPEC/CINT2000/300.twolf/300.twolf                0.00%     0.00%     9.59%     0.96%
SPEC/CINT2006/400.perlbench/400.perlbench        0.33%     0.40%    35.88%    -2.45%
SPEC/CINT2006/401.bzip2/401.bzip2                0.00%    -0.94%    69.38%    -0.94%
SPEC/CINT2006/403.gcc/403.gcc                    0.76%     0.00%    48.35%     1.20%
SPEC/CINT2006/429.mcf/429.mcf                    0.00%    -1.78%    11.88%     0.61%
SPEC/CINT2006/445.gobmk/445.gobmk                0.02%     0.00%    13.86%     0.00%
SPEC/CINT2006/456.hmmer/456.hmmer                0.17%     1.72%    28.38%     1.72%
SPEC/CINT2006/458.sjeng/458.sjeng                0.19%     1.35%     8.97%     6.05%
SPEC/CINT2006/462.libquantum/462.libquantum      1.08%   -20.22%   146.24%    -7.26%
SPEC/CINT2006/464.h264ref/464.h264ref            0.00%    -0.30%     9.22%     0.72%
SPEC/CINT2006/471.omnetpp/471.omnetpp            2.78%     1.92%    67.24%     3.92%
SPEC/CINT2006/473.astar/473.astar                4.59%     6.61%    12.90%    -0.87%
SPEC/CINT2006/483.xalancbmk/483.xalancbmk        4.29%     0.00%    34.72%     0.00%
SPEC/CINT95/099.go/099.go                        0.00%    -3.13%    46.93%     0.00%
SPEC/CINT95/124.m88ksim/124.m88ksim              0.06%     0.00%    11.62%    50.00%
SPEC/CINT95/126.gcc/126.gcc                      0.24%     0.00%    36.70%     0.00%
SPEC/CINT95/129.compress/129.compress            0.00%     0.00%     0.27%     0.00%
SPEC/CINT95/130.li/130.li                        0.15%     0.00%    77.44%     0.00%
SPEC/CINT95/132.ijpeg/132.ijpeg                  0.00%     0.00%    11.12%     0.00%
SPEC/CINT95/134.perl/134.perl                    3.55%     0.00%    33.44%     0.00%
SPEC/CINT95/147.vortex/147.vortex                0.00%     0.00%    94.80%     0.00%

Some of the tests run quite quickly, so the speed numbers should be taken with a grain of salt.
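
To illustrate why short-running tests give unreliable speed percentages, here is a minimal Python sketch. It rests on assumptions that are not stated in this post: that the speed column is computed as baseline time divided by run time, minus one, and that the timer has a hypothetical resolution of 0.01 s. The function name speed_change_pct and all the timings are made up for illustration.

```python
def speed_change_pct(baseline_time, run_time):
    """Relative speed change in percent; positive means the run was faster.

    Assumed definition (not confirmed by the post): baseline / run - 1.
    """
    return (baseline_time / run_time - 1.0) * 100.0


# Times are counted in timer ticks; one tick is a hypothetical 0.01 s.
# Short test: 3 ticks at baseline, one tick less in the new run.
short_change = speed_change_pct(3, 2)

# Long test: 3000 ticks at baseline, also one tick less in the new run.
long_change = speed_change_pct(3000, 2999)

print(f"short test: {short_change:+.2f}%")
print(f"long test:  {long_change:+.2f}%")
```

Under these assumptions a single tick of timer noise shows up as +50.00% on the short test but only +0.03% on the long one, so a large speed figure on a very quick test may be nothing more than measurement granularity.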

These rows have interesting changes:

Test                                          -Os size -Os speed  -O3 size -O3 speed
SPEC/CFP2006/447.dealII/447.dealII              10.61%    13.06%    35.52%    15.01%
SPEC/CINT2000/197.parser/197.parser              1.18%     1.32%    47.48%     6.23%
SPEC/CINT2000/252.eon/252.eon                    2.39%     3.45%    15.34%    11.11%
SPEC/CINT2006/456.hmmer/456.hmmer                0.17%     1.72%    28.38%     1.72%
SPEC/CINT2006/458.sjeng/458.sjeng                0.19%     1.35%     8.97%     6.05%
SPEC/CINT2006/462.libquantum/462.libquantum      1.08%   -20.22%   146.24%    -7.26%
SPEC/CINT2006/471.omnetpp/471.omnetpp            2.78%     1.92%    67.24%     3.92%
SPEC/CINT2006/473.astar/473.astar                4.59%     6.61%    12.90%    -0.87%

The general picture should be that inlinehint moves -Os performance closer to -O3, but at a much smaller cost in code size.
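
One rough way to check that picture against the highlighted rows is to average each column. The Python sketch below copies the eight rows listed above; the helper names mean and column_means are hypothetical, and a plain arithmetic mean of percentages is a crude summary, so treat the output as a sanity check rather than a benchmark result.

```python
# Highlighted rows copied from the list above:
# (test, -Os size, -Os speed, -O3 size, -O3 speed), all in percent vs. baseline.
ROWS = [
    ("447.dealII", 10.61, 13.06, 35.52, 15.01),
    ("197.parser", 1.18, 1.32, 47.48, 6.23),
    ("252.eon", 2.39, 3.45, 15.34, 11.11),
    ("456.hmmer", 0.17, 1.72, 28.38, 1.72),
    ("458.sjeng", 0.19, 1.35, 8.97, 6.05),
    ("462.libquantum", 1.08, -20.22, 146.24, -7.26),
    ("471.omnetpp", 2.78, 1.92, 67.24, 3.92),
    ("473.astar", 4.59, 6.61, 12.90, -0.87),
]


def mean(values):
    """Plain arithmetic mean; crude for percentages, but enough for a sanity check."""
    values = list(values)
    return sum(values) / len(values)


def column_means(rows):
    """Return the mean of each of the four numeric columns, in table order."""
    return tuple(mean(row[i] for row in rows) for i in range(1, 5))


os_size, os_speed, o3_size, o3_speed = column_means(ROWS)
print(f"-Os + inlinehint: size {os_size:+.2f}%, speed {os_speed:+.2f}%")
print(f"-O3:              size {o3_size:+.2f}%, speed {o3_speed:+.2f}%")

# The same comparison with the 462.libquantum outlier left out.
no_outlier = [row for row in ROWS if row[0] != "462.libquantum"]
os_size2, os_speed2, o3_size2, o3_speed2 = column_means(no_outlier)
print(f"without libquantum, -Os: size {os_size2:+.2f}%, speed {os_speed2:+.2f}%")
print(f"without libquantum, -O3: size {o3_size2:+.2f}%, speed {o3_speed2:+.2f}%")
```

Since these rows were picked because they changed, the averages say nothing about the suite as a whole; they only make the size-versus-speed trade-off on these particular tests easier to compare at a glance.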

I have no idea what is wrong with 462.libquantum, and 473.astar appears to really dislike -O3 optimization.

/jakob