LLVM-17 optimization levels comparison

antmo · October 27, 2023, 12:55pm

Hi LLVM! I recently ran some benchmarks with LLVM 17 on AArch64.
Here are some of the results, in case anyone is interested.

The results show how code size and execution time vary across optimization levels, with and without LTO (-flto) and PGO (-fprofile-instr-generate / -fprofile-instr-use).

SPEC2017, C/C++ benchmarks only. The reference is -O2; lower is faster, and further to the left is smaller.
Min exec time of 3 runs (on FX700 AArch64 machine) on train dataset. PGO trained on the same dataset.
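
For anyone who wants to reproduce this kind of comparison, here is a minimal Python sketch of the normalization described above: take the minimum time of the runs for each configuration, then express size and time relative to the -O2 reference. The function name, the data layout and all numbers are made up for illustration; they are not measurements from this thread.

```python
def normalize_to_reference(runs, sizes, ref="-O2"):
    """Return {config: (relative_size, relative_time)} against the reference.

    runs  : {config: [seconds, ...]}  execution times of repeated runs
    sizes : {config: bytes}           code size per configuration
    """
    # Use the minimum over the repeated runs to reduce measurement noise.
    best = {cfg: min(times) for cfg, times in runs.items()}
    ref_time = best[ref]
    ref_size = sizes[ref]
    return {cfg: (sizes[cfg] / ref_size, best[cfg] / ref_time) for cfg in runs}


# Hypothetical numbers, purely illustrative (NOT results from the benchmark).
example_runs = {
    "-O2": [10.0, 10.4, 10.2],
    "-O3": [9.5, 9.8, 9.6],
    "-Os": [11.0, 11.5, 11.2],
}
example_sizes = {"-O2": 1000, "-O3": 1100, "-Os": 800}

for cfg, (rel_size, rel_time) in normalize_to_reference(example_runs, example_sizes).items():
    # Values below 1.00 mean smaller (size) or faster (time) than -O2.
    print(f"{cfg}: size x{rel_size:.2f}, time x{rel_time:.2f}")
```

Each configuration then becomes one point on a size/time plot, with -O2 at (1.0, 1.0).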

IMO, it is interesting to see once again the performance and size tradeoffs available with the different optimization levels.
At -O2 & -O3 levels, LTO and PGO give good results in both code size (10/20%) and execution time (5/10%) dimensions.
At -Os & -Oz levels, PGO only improves performance and slightly increases size. This is probably not expected; I haven’t looked into it yet.

Any comments welcome ; )

shibata-fj · November 13, 2023, 11:11am

I have previously checked the performance effect of PGO in SPECrate 2017 / LLVM-14 and noticed the following.

PGO has a good effect on the workloads that have a lot of branches and small functions: perlbench, gcc and xalancbmk. I think that the compiler can perform better inlining.
Compared to the frontend PGO (-fprofile-instr-generate), IR-based PGO (-fprofile-generate) gives a well-balanced performance improvement. When I tried, frontend PGO had a negative effect on mcf’s performance, but IR PGO improved it.
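
As a rough illustration of how the two flavors differ on the command line, the Python sketch below prints the usual four steps (instrumented build, training run, profile merge, optimized rebuild) for each one. The file names app.c, app, app.profraw and app.profdata and the helper pgo_commands are hypothetical; the Clang flags, llvm-profdata merge and the LLVM_PROFILE_FILE variable are the standard ones, but this is only a sketch and not the setup used for the numbers in this thread.

```python
import shlex

# Instrumentation / use flag pairs for the two PGO flavors.
PGO_FLAGS = {
    "frontend": ("-fprofile-instr-generate", "-fprofile-instr-use="),
    "ir": ("-fprofile-generate", "-fprofile-use="),
}


def pgo_commands(source, binary, flavor="ir", opt="-O2"):
    """Return the four command lines of a simple PGO build as argv lists."""
    gen_flag, use_prefix = PGO_FLAGS[flavor]
    raw = f"{binary}.profraw"      # raw profile written by the training run
    merged = f"{binary}.profdata"  # indexed profile consumed by the rebuild
    return [
        # 1. Build an instrumented binary.
        ["clang", opt, gen_flag, source, "-o", binary],
        # 2. Run it on the training input; LLVM_PROFILE_FILE names the raw profile.
        ["env", f"LLVM_PROFILE_FILE={raw}", f"./{binary}"],
        # 3. Convert the raw profile into the indexed format.
        ["llvm-profdata", "merge", f"-output={merged}", raw],
        # 4. Rebuild using the profile.
        ["clang", opt, use_prefix + merged, source, "-o", binary],
    ]


for flavor in PGO_FLAGS:
    print(f"# {flavor} PGO")
    for cmd in pgo_commands("app.c", "app", flavor):
        print(shlex.join(cmd))
```

The script only prints the commands; running them requires clang and llvm-profdata from the same LLVM release.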

It is good to know that PGO and LTO have a positive impact even on code size.
Thank you.
