Profile-guided optimization (PGO, aka feedback-directed optimization, FDO) is one of the main tools we have for improving the performance of generated code. Maintaining the profile information throughout compilation passes is critical to its effectiveness. For example, if a pass mutates the CFG but does not update, or accidentally removes, profile information, passes downstream will deal with degraded profile information, despite analyses like BFI “trying to do their best” to “guess” the profile - the information is gone.
This is further complicated because, in some cases, some profile information genuinely cannot be synthesized by the pass. For example:
SimplifyCFGOpt::SimplifyBranchOnICmpChain may slit a predicate into an early check, which only takes part of that predicate (IIUC, something like
if (A && B) then… else <not block>) becomes
if A then <...> else <not block>) - but we won’t know the edge probabilities for the new edges resulting from the “just check A” condition; also, see D159322, D158642 or D157462.
We can’t just rely on today’s unit testing, because it requires both patch author and patch reviewers to know and care about profile information, and also check the values make sense.
The proposal is to enhance today’s unit testing, in 2 phases. The second phase is more exploratory, so this RFC mentions it mostly as food for thought and invitation for suggestions. The first phase, however, should be fairly straightforward and provide a good chunk of value fairly easily.
This is complementary to other efforts like those discussed in D158889, that are about integration testing, and probably require more involved infrastructure, and is meant (the proposal here) as a “first line of defense”.
- Passes that create edges with unknowable weights must explicitly indicate that using a new metadata attached to the terminator, “
MD_prof_unknown” (final name TBD).
- Have an option enabling a profile verifier that’s run as part of
llc(Note: only if llc takes IR as input, afaik can’t capture profiles in MIR input). After the module is loaded by
llc, the verifier injects profile information into the IR, if it’s not available, before any passes are run; then after they are run, the verifier checks profile information is still present.
To stress this aspect: the verifier only checks that non-trivial terminators have either
MD_prof_unknown attached to them at the end of the test. It doesn’t check for correct values.
There are 3 kinds of tests, for our purposes, under
llvm/test: (1) module doesn’t have any profile information whatsoever (vast majority); (2) module comes with profile info metadata (~360 files), and (3) profile info is loaded (~50, sample + instrumented). (OK, I suppose there could be a “4th”: a mix of 2 and 3).
For the relatively few that have some profile embedded, we can start by skipping them from this verification (easy check: “if module has profile, don’t insert more, and don’t verify at the end”); and the very few that ingest profiles, we can manually exclude them. Later, we can figure out if it’s worth doing something special for these, but likely the functionality they test isn’t likely to have the problem we’re trying to detect (accidentally dropping profiles).
For the rest, we could insert the synthetic weights currently computed by profile analyses, or bias them. Since the goal at this stage is just checking profile info isn’t dropped, the synthetic ones are sufficient.
We would roll this out gradually:
- first, we add the new
MD_prof_unknownmetadata and make the BFI/BPI analyses handle it (effectivelly, they just need to drop it to replace it with calculated values. The latter is what happens today in the absence of data, so this should be a simple change)
- we add the validators to
llc, disabled by default, enable-able by flag
- we gradually enable the flag, via lit.cfg, on subdirs - i.e. we make sure the subdir is “green” and make necessary fixes (bugs or just explicitly setting
MD_prof_unknown) before enabling the flag
- eventually we can flip the flag to on by default
- This won’t affect in any way passes that don’t change the CFG
- It is possible to “cheat” by making low-level APIs setting profile data, if not specified, to
MD_prof_unknown. Presumably this is feasible to check centrally (like by OWNERS of those APIs)
- This - as explicitly stated - only checks a pass made an explicit stance to “what the profile of new edges” is. It does not check the profile is correct, so for example what D143948 fixed would still go unnoticed. This segways to “phase 2”.
To check the profile information still “makes sense” - i.e. for example we did keep profile metadata, but forgot to flip probabilities for a conditional when we flipped the condition - we can run a second step of control flow integrity checks and accept variations within some margin. We’d probably need to use synthetic profiles that are more biased (e.g. 10%-90% probabilities for conditionals).
Like mentioned in introduction, this phase is mentioned here as food for thought.