Profile-Guided Optimization (PGO) related questions and suggestions

First question:

Answer: IR PGO uses an MST-based algorithm, guided by static branch prediction, to choose instrumentation points so that instrumentation overhead is minimized. It also runs an early cleanup pipeline, including the early inliner, which reduces overhead further. The early inliner also enables a more precise, context-dependent profile, so the runtime performance of the profile-use build is better as well. You can easily benchmark this yourself. In summary, use IR PGO for performance; front-end instrumentation is better suited to coverage testing. IR PGO and front-end PGO have similar resistance to source changes, but IR PGO is also sensitive to compiler pipeline changes: for example, a CFG change that affects the instrumentation points can lead to profile mismatches.
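To make the MST idea concrete, here is a minimal Python sketch, not LLVM's actual implementation. It assumes the tree is a maximum-weight spanning tree over estimated edge frequencies, so that only the edges left off the tree get counters and the other counts can be derived from flow conservation. The function name, the toy CFG, the weights and the fake exit-to-entry edge are all made up for illustration.

```python
def pick_instrumented_edges(nodes, edges):
    """Return the CFG edges that are NOT on a maximum-weight spanning tree.

    edges is a list of (src, dst, estimated_weight). The graph is treated as
    undirected for the purpose of building the tree (Kruskal's algorithm).
    """
    parent = {n: n for n in nodes}

    def find(n):
        # Union-find lookup with path halving.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    instrumented = []
    # Visit the heaviest (hottest) edges first so they end up on the tree.
    for src, dst, _weight in sorted(edges, key=lambda e: e[2], reverse=True):
        a, b = find(src), find(dst)
        if a == b:
            # This edge would close a cycle: it is off the tree and needs a counter.
            instrumented.append((src, dst))
        else:
            parent[a] = b
    return instrumented


# Hypothetical diamond-shaped CFG with weights from a static estimate.
nodes = ["entry", "hot", "cold", "exit"]
edges = [
    ("entry", "hot", 90),
    ("entry", "cold", 10),
    ("hot", "exit", 90),
    ("cold", "exit", 10),
    ("exit", "entry", 100),  # fake edge closing the flow (an assumption of this sketch)
]

instrumented = pick_instrumented_edges(nodes, edges)
print(instrumented)
```

With four nodes and five edges, the tree keeps three edges, so only two edges carry counters instead of all five.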

Second question:
Answer: instrumentation PGO has two profile formats: raw and indexed. The raw format has no backward compatibility guarantee. For the indexed format, it is guaranteed that a profile written by an older version can be consumed by a newer profile reader.

Third question:
Answer: the goal is attractive, but unrealistic to implement. Assuming IR PGO, the control flow produced by different compilers can differ, and the way instrumentation points are selected can be very different, unless all compilers also use debug information for profile matching.
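As a toy illustration of why such profiles do not transfer, the Python sketch below keys a function's counters by a fingerprint of its CFG and refuses to apply them when the fingerprint differs. This is an assumed, simplified model of CFG-based matching, not any compiler's real hash or profile format; `cfg_fingerprint`, `lookup_counts`, the function name `foo` and the profile layout are all hypothetical.

```python
import hashlib


def cfg_fingerprint(edges):
    """Hash the shape of a CFG given as (src, dst) pairs (toy scheme)."""
    canonical = ";".join(f"{src}->{dst}" for src, dst in sorted(edges))
    return hashlib.sha256(canonical.encode()).hexdigest()[:16]


def lookup_counts(profile, func_name, edges):
    """Return the stored counters only if the CFG still matches the profile."""
    record = profile.get(func_name)
    if record is None or record["fingerprint"] != cfg_fingerprint(edges):
        return None  # mismatch: the counters no longer map onto this CFG
    return record["counts"]


# CFG of foo as built by the (hypothetical) compiler that collected the profile.
cfg_compiler_a = [("entry", "then"), ("entry", "else"),
                  ("then", "exit"), ("else", "exit")]
# Same source, but another compiler turned the branch into straight-line code.
cfg_compiler_b = [("entry", "exit")]

profile = {
    "foo": {"fingerprint": cfg_fingerprint(cfg_compiler_a), "counts": [90, 10]},
}

same = lookup_counts(profile, "foo", cfg_compiler_a)
other = lookup_counts(profile, "foo", cfg_compiler_b)
print(same, other)
```

The counters are tied to one particular CFG shape, so a consumer that sees a different shape has nothing to attach them to.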

Fourth question:
IR PGO can be sensitive to optimization-related options, especially options that change the pass pipeline or the inline threshold.

Fifth question:
For different platforms, platform-specific parameters can make this hard. For example, call overhead is modeled differently for Arm and x86, which affects inlining decisions.

If there is not much target-specific code, you may consider using sample-based (PMU) PGO, also known as AutoFDO. An LBR-based profile collected on an Intel platform can be used to optimize for Arm.

David
