Contributors: @xur @chandlerc @snehasish
Problem
Profile-guided optimization (PGO, aka feedback-directed optimization, FDO) is one of the main tools we have for improving the performance of generated code. Maintaining the profile information throughout compilation passes is critical to its effectiveness. For example, if a pass mutates the CFG but does not update, or accidentally removes, profile information, passes downstream will deal with degraded profile information, despite analyses like BFI "trying to do their best" to "guess" the profile - the information is gone.
This is further complicated because, in some cases, some profile information genuinely cannot be synthesized by the pass. For example, `SimplifyCFGOpt::SimplifyBranchOnICmpChain` may split a predicate into an early check that only takes part of that predicate (IIUC, something like `if (A && B) then <...> else <not block>` becomes `if A then <...> else <not block>`), but we won't know the edge probabilities for the new edges resulting from the "just check A" condition. Also see D159322, D158642 or D157462.
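To illustrate why the new edge weights are genuinely unknowable: a profile on the original branch records only how often `A && B` held, not how often `A` alone held. The following is a minimal sketch in Python (a toy model, not LLVM code; the function name and the sample counts are made up for illustration) showing two hypothetical runs that produce identical counts for the original branch but very different counts for the split-out `A` check.

```python
def branch_counts(samples):
    """Return ((taken, not_taken) for `A && B`, (taken, not_taken) for `A`)."""
    n = len(samples)
    both = sum(1 for a, b in samples if a and b)
    only_a = sum(1 for a, _ in samples if a)
    return (both, n - both), (only_a, n - only_a)


# Two hypothetical runs, each evaluating the predicate 100 times.
run1 = [(True, True)] * 10 + [(True, False)] * 80 + [(False, False)] * 10
run2 = [(True, True)] * 10 + [(False, True)] * 90

orig1, split1 = branch_counts(run1)
orig2, split2 = branch_counts(run2)

# Same profile on the original branch...
print(orig1, orig2)
# ...but the branch on `A` alone behaves very differently in the two runs.
print(split1, split2)
```

Since both runs are consistent with the same original weights, no pass can recover the weights of the `A` check from the original metadata alone.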
We can't just rely on today's unit testing, because it requires both patch author and patch reviewers to know and care about profile information, and also to check that the values make sense.
Proposal
The proposal is to enhance today's unit testing, in 2 phases. The second phase is more exploratory, so this RFC mentions it mostly as food for thought and invitation for suggestions. The first phase, however, should be fairly straightforward and provide a good chunk of value fairly easily.
This is complementary to other efforts like those discussed in D158889, which are about integration testing and probably require more involved infrastructure. The proposal here is meant as a "first line of defense".
Phase 1: just check profile info isn't dropped
- Passes that create edges with unknowable weights must explicitly indicate that using a new metadata attached to the terminator, `MD_prof_unknown` (final name TBD).
- Have an option enabling a profile verifier that's run as part of `opt` and `llc` (Note: only if `llc` takes IR as input, afaik can't capture profiles in MIR input). After the module is loaded by `opt`/`llc`, the verifier injects profile information into the IR, if it's not available, before any passes are run; then after they are run, the verifier checks profile information is still present.
To stress this aspect: the verifier only checks that non-trivial terminators have either `MD_prof` or `MD_prof_unknown` attached to them at the end of the test. It doesn't check for correct values.
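As a rough illustration of what the end-of-test check amounts to, here is a minimal sketch in Python over a toy model of terminators (not LLVM code). The `Terminator` class, the metadata key strings, and the reading of "non-trivial" as "more than one successor" are all assumptions made for this example.

```python
from dataclasses import dataclass, field

# Hypothetical metadata keys; the RFC says the final name is TBD.
MD_PROF = "prof"
MD_PROF_UNKNOWN = "prof_unknown"


@dataclass
class Terminator:
    block: str                      # name of the block this terminator ends
    successors: list                # names of successor blocks
    metadata: dict = field(default_factory=dict)


def is_nontrivial(term):
    # Assumption: a terminator is "non-trivial" if it has more than one successor.
    return len(term.successors) > 1


def find_unannotated(terminators):
    """Return the blocks whose non-trivial terminator has neither annotation."""
    return [
        t.block
        for t in terminators
        if is_nontrivial(t)
        and MD_PROF not in t.metadata
        and MD_PROF_UNKNOWN not in t.metadata
    ]


function = [
    Terminator("entry", ["then", "else"], {MD_PROF: [1, 99]}),
    Terminator("then", ["exit"]),                            # unconditional: ignored
    Terminator("else", ["a", "b"], {MD_PROF_UNKNOWN: True}),  # explicit "unknown"
    Terminator("a", ["exit", "b"]),                           # profile was dropped
]
missing = find_unannotated(function)
print(missing)
```

Note that the check only looks at the presence of an annotation. The weights `[1, 99]` on `entry` could be wrong and this check would still pass.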
There are 3 kinds of tests, for our purposes, under `llvm/test`: (1) module doesn't have any profile information whatsoever (vast majority); (2) module comes with profile info metadata (~360 files), and (3) profile info is loaded (~50, sample + instrumented). (OK, I suppose there could be a "4th": a mix of 2 and 3).
For the relatively few that have some profile embedded, we can start by skipping them from this verification (easy check: "if module has profile, don't insert more, and don't verify at the end"); the very few that ingest profiles we can manually exclude. Later, we can figure out if it's worth doing something special for these, but the functionality they test is unlikely to have the problem we're trying to detect (accidentally dropping profiles).
For the rest, we could insert the synthetic weights currently computed by profile analyses, or bias them. Since the goal at this stage is just checking profile info isn't dropped, the synthetic ones are sufficient.
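The inject-then-verify flow, including the "skip if the module already has a profile" shortcut, could look roughly like the sketch below. This is a toy Python model, not LLVM code: the dictionary-based module, the function names, and the metadata keys are hypothetical, and uniform weights stand in for whatever synthetic weights the profile analyses would actually compute.

```python
MD_PROF = "prof"
MD_PROF_UNKNOWN = "prof_unknown"  # hypothetical name; final name is TBD


def has_profile(module):
    return any(MD_PROF in t["md"] for t in module.values())


def inject_synthetic_weights(module):
    # Placeholder: uniform weights instead of analysis-computed ones.
    for t in module.values():
        if len(t["succs"]) > 1:
            t["md"][MD_PROF] = [1] * len(t["succs"])


def missing_profile(module):
    return sorted(
        name
        for name, t in module.items()
        if len(t["succs"]) > 1
        and MD_PROF not in t["md"]
        and MD_PROF_UNKNOWN not in t["md"]
    )


def run_checked(module, passes):
    """Return None if verification is skipped, else the list of offending blocks."""
    skip = has_profile(module)  # "if module has profile, don't insert, don't verify"
    if not skip:
        inject_synthetic_weights(module)
    for p in passes:
        p(module)
    return None if skip else missing_profile(module)


def make_module():
    return {
        "entry": {"succs": ["a", "b"], "md": {}},
        "a": {"succs": ["exit"], "md": {}},
        "b": {"succs": ["exit"], "md": {}},
        "exit": {"succs": [], "md": {}},
    }


def careless_pass(module):
    # Rebuilds the terminator of `entry` and forgets to carry metadata over.
    module["entry"] = {"succs": list(module["entry"]["succs"]), "md": {}}


def careful_pass(module):
    old = module["entry"]
    module["entry"] = {"succs": list(old["succs"]), "md": dict(old["md"])}


print(run_checked(make_module(), [careless_pass]))
print(run_checked(make_module(), [careful_pass]))
```

In this model, a test whose input has no profile needs no changes to participate in the check: the weights are injected on load, and only a pass that drops them makes the run fail.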
We would roll this out gradually:
- first, we add the new `MD_prof_unknown` metadata and make the BFI/BPI analyses handle it (effectively, they just need to drop it to replace it with calculated values. The latter is what happens today in the absence of data, so this should be a simple change)
- we add the validators to `opt` and `llc`, disabled by default, enable-able by flag
- we gradually enable the flag, via lit.cfg, on subdirs - i.e. we make sure the subdir is "green" and make necessary fixes (bugs or just explicitly setting `MD_prof_unknown`) before enabling the flag
- eventually we can flip the flag to on by default
Observations
- This won't affect in any way passes that don't change the CFG
- It is possible to "cheat" by making low-level APIs set profile data, if not specified, to `MD_prof_unknown`. Presumably this is feasible to check centrally (like by OWNERS of those APIs)
- This - as explicitly stated - only checks that a pass took an explicit stance on "what the profile of new edges is". It does not check the profile is correct, so for example what D143948 fixed would still go unnoticed. This segues to "phase 2".
Phase 2: check profile info "still makes sense"
To check the profile information still "makes sense" - i.e. for example we did keep profile metadata, but forgot to flip probabilities for a conditional when we flipped the condition - we can run a second step of control flow integrity checks and accept variations within some margin. We'd probably need to use synthetic profiles that are more biased (e.g. 10%-90% probabilities for conditionals).
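One possible shape for such a check, sketched in Python over a toy model (not LLVM code): compare the probability of reaching each successor block before and after a transformation, and flag differences beyond a margin. The function names, the dictionary layout, and the 0.05 margin are assumptions for illustration, and the sketch assumes each successor appears only once per terminator.

```python
def edge_probs(term):
    """Map each successor block to its branch probability."""
    total = sum(term["weights"])
    return {s: w / total for s, w in zip(term["succs"], term["weights"])}


def consistent(before, after, margin=0.05):
    # Hypothetical margin: per-successor probabilities may differ by at most 5%.
    pb, pa = edge_probs(before), edge_probs(after)
    return pb.keys() == pa.keys() and all(abs(pb[s] - pa[s]) <= margin for s in pb)


def invert_condition(term, fix_weights):
    """Model a pass that negates the branch condition and swaps the successors."""
    succs = term["succs"][::-1]
    weights = term["weights"][::-1] if fix_weights else list(term["weights"])
    return {"succs": succs, "weights": weights}


biased = {"succs": ["hot", "cold"], "weights": [90, 10]}
even = {"succs": ["hot", "cold"], "weights": [50, 50]}

# Weights swapped together with the successors: still consistent.
print(consistent(biased, invert_condition(biased, fix_weights=True)))
# Weights left in place: "hot" now looks cold, and the check catches it.
print(consistent(biased, invert_condition(biased, fix_weights=False)))
# With an unbiased 50/50 profile the same bug is invisible.
print(consistent(even, invert_condition(even, fix_weights=False)))
```

The last case shows why biased synthetic profiles matter for this phase: with even weights, a forgotten swap leaves the per-successor probabilities unchanged.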
As mentioned above, this phase is included here as food for thought.