Which optimizations currently use sample PGO in LLVM? I’m talking about the PGO that uses commonly available hardware counters to inspect the execution of a program. I found that the inliner seems to use sample PGO. Do any other optimizations use it?
Thank you!
I believe sample profiles are consumed in SampleProfile.cpp, which seems to set call counts and frequencies using LLVM IR metadata. See this test for a simple example of this metadata. Like you said, this should impact inlining decisions. I’m not super familiar with which other optimizations use BFI or what sample PGO does differently from IRPGO, so CC @WenleiHe, who may be able to say more.
For completeness, here are the docs on profiling: Clang Compiler User’s Manual — Clang 17.0.0git documentation
PGO provides a framework that lets optimizations query the hotness of a function or block. Many optimizations use PGO through these hotness query APIs to make speed vs. size trade-off decisions. The inliner, code layout, and loop optimizations (unrolling, vectorizer, sinking, …) are a few examples, and there are more.
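
To make the "hotness query" idea a bit more concrete, here is a minimal toy sketch in C++. It is not LLVM's actual API: `ToyProfileSummary`, `ToyBlock`, `Strategy`, `chooseStrategy`, and the threshold values are all hypothetical names made up for illustration. It only shows the general shape of a pass asking "is this block hot or cold?" and picking a speed or size strategy from the answer.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Toy stand-in for a profile summary. The thresholds are assumed values,
// not anything derived from a real profile.
struct ToyProfileSummary {
  std::uint64_t hotThreshold;   // counts at or above this are "hot"
  std::uint64_t coldThreshold;  // counts at or below this are "cold"

  bool isHot(std::uint64_t count) const { return count >= hotThreshold; }
  bool isCold(std::uint64_t count) const { return count <= coldThreshold; }
};

// Hypothetical block with a profile-derived execution count.
struct ToyBlock {
  std::string name;
  std::uint64_t count;
};

enum class Strategy { OptimizeForSpeed, Default, OptimizeForSize };

// A pass-like decision: spend code size on hot blocks, save it on cold ones,
// and fall back to the default heuristics for everything in between.
Strategy chooseStrategy(const ToyProfileSummary &summary,
                        const ToyBlock &block) {
  if (summary.isHot(block.count))
    return Strategy::OptimizeForSpeed;
  if (summary.isCold(block.count))
    return Strategy::OptimizeForSize;
  return Strategy::Default;
}
```

With thresholds of 1000 (hot) and 10 (cold), a block with count 5000 would get `OptimizeForSpeed`, one with count 2 would get `OptimizeForSize`, and one with count 100 would fall through to `Default`. Real passes are more involved than this, but the decision has the same shape.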