Profile-Guided Optimization (PGO) related questions and suggestions

davidxl · November 25, 2023, 10:13pm

First question:

Answer: IR PGO uses a minimum spanning tree (MST) over the CFG, with the help of static branch prediction, to choose instrumentation points so that instrumentation overhead is minimized. It also has an early cleanup pipeline, including an early inliner, which reduces the overhead further. The early inliner also enables a more precise, context-dependent profile, so the runtime performance of the profile-use build is better as well. You can easily benchmark this yourself. In summary, use IR PGO for performance; front-end instrumentation is better suited to coverage testing. IR PGO and front-end PGO have similar resistance to source changes, but IR PGO is also sensitive to compiler pipeline changes: for example, a CFG change at an instrumentation point can lead to mismatches.
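
To make the MST idea concrete, here is a minimal Python sketch. It is not LLVM's implementation: the function names, the toy diamond-shaped CFG, the edge weights (standing in for static branch prediction estimates) and the counter values are all made up for illustration. Edges on a maximum-weight spanning tree get no counter; only the remaining edges are instrumented, and the tree-edge counts are recovered afterwards from flow conservation.

```python
def find(parent, x):
    """Union-find lookup with path halving."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x


def choose_instrumented_edges(nodes, edges):
    """Return the edges that need a counter.

    edges is a list of (src, dst, estimated_weight). Kruskal's algorithm,
    run from heaviest to lightest, builds a maximum spanning tree; every
    edge that would close a cycle is left off the tree and instrumented.
    """
    parent = {n: n for n in nodes}
    instrumented = []
    for src, dst, _weight in sorted(edges, key=lambda e: -e[2]):
        a, b = find(parent, src), find(parent, dst)
        if a == b:
            instrumented.append((src, dst))  # off-tree edge: needs a counter
        else:
            parent[a] = b                    # tree edge: count is derived later
    return instrumented


def recover_counts(nodes, edges, measured):
    """Derive counts for uninstrumented edges from flow conservation
    (for every node, total incoming count == total outgoing count)."""
    all_edges = [(s, d) for s, d, _ in edges]
    counts = dict(measured)
    while len(counts) < len(all_edges):
        progressed = False
        for n in nodes:
            incident = [e for e in all_edges if n in e]
            unknown = [e for e in incident if e not in counts]
            if len(unknown) != 1:
                continue
            e = unknown[0]
            inflow = sum(counts[x] for x in incident if x in counts and x[1] == n)
            outflow = sum(counts[x] for x in incident if x in counts and x[0] == n)
            counts[e] = outflow - inflow if e[1] == n else inflow - outflow
            progressed = True
        if not progressed:
            raise ValueError("measured counters do not determine all edges")
    return counts


# Toy CFG: an if/else diamond. The exit->entry edge is a fake edge that
# closes the flow so conservation holds at entry and exit as well.
nodes = ["entry", "then", "else", "exit"]
edges = [
    ("entry", "then", 80),   # weights are hypothetical static estimates
    ("entry", "else", 20),
    ("then", "exit", 80),
    ("else", "exit", 20),
    ("exit", "entry", 100),
]

instrumented = choose_instrumented_edges(nodes, edges)

# Hypothetical counter values read back after a training run.
measured = {("then", "exit"): 7, ("else", "exit"): 3}
counts = recover_counts(nodes, edges, measured)

print("instrumented edges:", instrumented)
print("all edge counts:", counts)
```

In this toy case five edges need only two counters; the other three counts follow from the measured ones.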

Second question:
Answer: instrumentation PGO has two profile formats: raw and indexed. The raw format has no backward compatibility, but for the indexed format it is guaranteed that a profile produced by an old version can be consumed by a newer profile reader.

Third question:
Answer: the goal is attractive, but unrealistic to implement. Assuming IR PGO, the control flow produced by different compilers can differ, and the ways instrumentation points are selected can be very different, unless all compilers also use debug information for profile matching.

Fourth question:
IR PGO can be sensitive to optimization-related options, especially options that change the pipeline or the inline threshold.

Fifth question:
Across platforms, platform-specific parameters can make this hard. For example, call overhead is modeled differently for Arm and x86, and this affects inlining decisions.

If there is not much target-specific code, you may consider sample-based (PMU) PGO, also known as AutoFDO. An LBR-based profile collected on an Intel platform can be used to optimize for Arm.

David
