[RFC] Profile Information Propagation Unittesting

Contributors: @xur @chandlerc @snehasish

Problem

Profile-guided optimization (PGO, aka feedback-directed optimization, FDO) is one of the main tools we have for improving the performance of generated code. Maintaining profile information throughout the compilation passes is critical to its effectiveness. For example, if a pass mutates the CFG but does not update the profile information, or accidentally removes it, downstream passes have to work with degraded profile information, despite analyses like BFI “trying to do their best” to “guess” the profile - the information is gone.

This is further complicated because, in some cases, some profile information genuinely cannot be synthesized by the pass. For example, SimplifyCFGOpt::SimplifyBranchOnICmpChain may split a predicate into an early check that only tests part of that predicate (IIUC, something like `if (A && B) then <...> else <else block>` becomes `if (A) then <...> else <else block>`), but we won’t know the edge probabilities for the new edges resulting from the “just check A” condition; also, see D159322, D158642 or D157462.
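To make this concrete, here’s a minimal IR sketch of the shape of the problem (illustrative only - not the exact transform SimplifyBranchOnICmpChain performs, and the weights are made up):

```llvm
; Before: a single conditional branch carrying profile metadata.
define void @f(i1 %a, i1 %b) {
entry:
  %cond = and i1 %a, %b
  br i1 %cond, label %then, label %else, !prof !0
then:
  ret void
else:
  ret void
}

; After splitting out the "just check A" test: the original weights told us
; about P(A && B), but nothing about P(A) alone, so there is no defensible
; !prof for the new entry branch - and today it is simply emitted without one.
define void @f.split(i1 %a, i1 %b) {
entry:
  br i1 %a, label %check.b, label %else     ; edge probabilities unknowable
check.b:
  br i1 %b, label %then, label %else
then:
  ret void
else:
  ret void
}

!0 = !{!"branch_weights", i32 90, i32 10}
```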

We can’t just rely on today’s unit testing, because it requires both the patch author and the patch reviewers to know and care about profile information, and also to check that the values make sense.

Proposal

The proposal is to enhance today’s unit testing in two phases. The second phase is more exploratory, so this RFC mentions it mostly as food for thought and an invitation for suggestions. The first phase, however, should be fairly straightforward and provide a good chunk of value easily.

This is complementary to other efforts, like those discussed in D158889, which are about integration testing and probably require more involved infrastructure; the proposal here is meant as a “first line of defense”.

Phase 1: just check profile info isn’t dropped

  1. Passes that create edges with unknowable weights must explicitly indicate that, using a new metadata kind attached to the terminator, “MD_prof_unknown” (final name TBD).
  2. Have an option enabling a profile verifier that’s run as part of opt and llc (note: for llc, only if it takes IR as input; AFAIK profiles can’t be captured in MIR input). After the module is loaded by opt/llc, and before any passes are run, the verifier injects profile information into the IR if it’s not already available; then, after the passes are run, the verifier checks that profile information is still present.

To stress this aspect: the verifier only checks that non-trivial terminators have either MD_prof or MD_prof_unknown attached to them at the end of the test. It doesn’t check for correct values.
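Concretely - using a hypothetical spelling for the new metadata, since the final name and form are TBD - the verifier would accept either of the terminators below, and flag one carrying no profile metadata at all:

```llvm
define void @example(i1 %c, i1 %d) {
entry:
  ; Known profile: the pass preserved or recomputed the weights.
  br i1 %c, label %mid, label %exit, !prof !0
mid:
  ; Unknowable profile: the pass explicitly opted out.
  br i1 %d, label %left, label %exit, !prof !1
left:
  ret void
exit:
  ret void
}

!0 = !{!"branch_weights", i32 2000, i32 16}
!1 = !{!"unknown"}   ; hypothetical payload for MD_prof_unknown
```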

There are 3 kinds of tests, for our purposes, under llvm/test: (1) the module doesn’t have any profile information whatsoever (the vast majority); (2) the module comes with profile info metadata (~360 files); and (3) profile info is loaded from a profile file (~50, sample + instrumented). (OK, I suppose there could be a “4th”: a mix of 2 and 3.)

For the relatively few that have some profile embedded, we can start by exempting them from this verification (easy check: “if the module has profile, don’t insert more, and don’t verify at the end”); and for the very few that ingest profiles, we can manually exclude them. Later, we can figure out whether it’s worth doing something special for these, but the functionality they test likely isn’t prone to the problem we’re trying to detect (accidentally dropping profiles).

For the rest, we could insert the synthetic weights currently computed by profile analyses, or bias them. Since the goal at this stage is just checking profile info isn’t dropped, the synthetic ones are sufficient.
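For illustration, the injected profile for a typical test could just materialize what the static estimator would guess anyway; the numbers below are placeholders, since all that matters in phase 1 is that the !prof survives to the end of the pipeline:

```llvm
define void @test(i1 %c) {
entry:
  ; The .ll test originally had:  br i1 %c, label %then, label %else
  ; After the verifier's injection step, before any passes run:
  br i1 %c, label %then, label %else, !prof !0
then:
  ret void
else:
  ret void
}

!0 = !{!"branch_weights", i32 1, i32 1}   ; placeholder synthetic weights
```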

We would roll this out gradually:

  • first, we add the new MD_prof_unknown metadata and make the BFI/BPI analyses handle it (effectively, they just need to drop it and replace it with calculated values; the latter is what happens today in the absence of data, so this should be a simple change)
  • we add the validators to opt and llc, disabled by default and enableable via a flag
  • we gradually enable the flag, via lit.cfg, on subdirectories - i.e. we make sure a subdirectory is “green”, making the necessary fixes (fixing bugs or just explicitly setting MD_prof_unknown), before enabling the flag there
  • eventually we can flip the flag to on by default

Observations

  • This won’t in any way affect passes that don’t change the CFG
  • It is possible to “cheat” by making low-level APIs set profile data to MD_prof_unknown when none is specified. Presumably this is feasible to check centrally (e.g. by OWNERS of those APIs)
  • This - as explicitly stated - only checks that a pass took an explicit stance on what the profile of new edges is. It does not check that the profile is correct, so, for example, what D143948 fixed would still go unnoticed. This segues into “phase 2”.

Phase 2: check profile info “still makes sense”

To check that the profile information still “makes sense” - i.e., for example, we did keep the profile metadata but forgot to flip the probabilities for a conditional when we flipped its condition - we can run a second step of control flow integrity checks and accept variations within some margin. We’d probably need to use synthetic profiles that are more biased (e.g. 10%-90% probabilities for conditionals).
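A sketch of the bug class this phase is after (hand-written example, using the biased synthetic weights suggested above):

```llvm
; Before the transform, %c was ~90% true:
;   br i1 %c, label %taken, label %nottaken, !prof !0
; A buggy transform inverts the condition and swaps the successors, but keeps
; !0 verbatim - the metadata now claims the opposite bias. With a flat 50/50
; synthetic profile this is invisible; with 90/10, a checker comparing block
; frequencies before and after (within some margin) would flag it.
define void @g(i1 %c) {
entry:
  %notc = xor i1 %c, true
  br i1 %notc, label %nottaken, label %taken, !prof !0   ; should now be 10, 90
taken:
  ret void
nottaken:
  ret void
}

!0 = !{!"branch_weights", i32 90, i32 10}
```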

As mentioned in the introduction, this phase is included mostly as food for thought.


This sounds very similar to opt -enable-debugify, which does the same kind of thing for debug info.

The general idea of checking that we don’t drop profile info is great, but I’m worried that turning on the flag by default is going to be confusing to people debugging LLVM and seeing an MD_prof_unknown that isn’t in the original IR. And it’ll also affect printing the final IR unless you strip it out at the end.

Adding the flag, seeing what breaks if you turn it on by default and run check-llvm, and then adding regression tests for those breakages is a good first step.

How often have you seen issues that this would catch?

To clarify, MD_prof_unknown is something the pass author inserts to say “I don’t have sufficient information to calculate profile here, so the analyses (BFI/BPI) should do their best”. Today they’d do nothing, and we can’t derive intent from the lack of information. So it’s not something the profile validator would insert. The profile validator would insert a real-looking profile.

But I agree, folks could get confused if they saw a profile suddenly appearing in the IR, and we can strip it post-validation, like you said (this, itself, can be flag-controlled, if folks find value in that profile). Note also we only insert the profile, and then validate it, if the .ll didn’t have one to begin with.

There were some recent bugs (@xur would know the details), but here’s the problem: the scale of the data involved here, and the absence of testing of this type, mean we don’t actually know how big the problem is. This is actually why we need to do this :). Or maybe I’m not answering your question.

I assume that MD_prof_unknown would only get attached if the “original” branches also had MD_prof? I don’t want MD_prof_unknown to appear out of thin air in IR that has no branch weights.

This will still cause confusion when looking at -debug or -print-after-all output, even if it’s stripped from final IR.

I’m also not clear on how this mode is supposed to interact with tests for passes that have branch-weight-dependent behavior (i.e. not proper PGO passes, but things like SimplifyCFG that have some branch weight heuristics). The test behavior would entirely change if additional MD_prof were attached.

Generally I’m a pretty hard -1 on anything that is going to inject MD_prof by default or on a wide scale in existing tests. The (thankfully very few) -debugify tests we have are enough of an annoyance, we don’t need more of this.

This kind of usage seems reasonable.

I misread, I meant whatever MD_prof was injected at the beginning.

Supposedly most code paths in LLVM are tested somewhere, so we should be able to flush out any existing issues by implementing this and running check-llvm with the flag turned on.

Nothing auto-attaches MD_prof_unknown, see above - it’s for us engineers to use to say “I know I created a branch, and I know I shouldn’t drop profile info, but I genuinely can’t tell you what the branch probabilities are”.

How - you mean if there are tests that do -debug or -print-after-all?

I think we’re in agreement here - the RFC does say this would be disabled by default and rolled out gradually (which implies “only for areas it doesn’t break”). But eventually I would imagine we get to a stage where we know what needs to be opted out (for reasons like those you touched on), and then it’d make sense to flip the default, because the whole point is to check that folks don’t forget profile info. Not sure how -debugify was rolled out, so maybe I’m missing something.

(meanwhile @aeubanks 's reply came in, so replying to that here)

I think there are two topics:

  • one is adding the new metadata, changing BFI/BPI to consume it and replace it with valid profile info, and adding a validator (behind a flag) in opt and llc ← this is all NFC, because nothing is “on”
  • the second is about the rollout, which is proposed to be gradual. I think we’re in agreement about how to fish out possible bugs, identify opt-out cases (and potentially add new regression tests), etc. - all without turning the tree red, of course. But the end goal (in the RFC) is to arrive at a steady state where regressions in profile propagation are caught in tests.

To be clear, also: I’m not proposing we land anything before doing some local experimentation; maybe I should have made that clearer (apologies). I did want, though, to collect some initial feedback (which I’m getting - thanks!).

Right. What I wanted to clarify here is that it would only be used if it’s not possible to preserve existing branch weights, i.e. we have an existing branch with weights and we transform it into a new one for which we cannot determine reasonable weights, so we mark the new one with MD_prof_unknown. But if the original branch had no weights, we would never add MD_prof_unknown to the new one, correct?

I guess that could happen for tests checking -debug output, but I was mostly thinking about manually rerunning a test with -debug here and getting confusing output.

I don’t think we are: IMHO this should never be “rolled out”. It’s okay to have the functionality so you can easily do a one-off check that all tests still pass verification, but I don’t think this should infect the test suite at large, even in the long term. (This could also be done on a buildbot: A possible approach would be to write the verification failures out into a separate file and then just check whether that has any, while ignoring actual test failures caused by output differences.)

Our current lit IR tests are pretty much “what you see is what you get” and this is going to add a good bit of “magic” into the mix.

Well… good point. Maybe there’s a difference between “unknown” and “nonexistent”? Like, the latter would be the case for cold edges - you get a profile, and the profile has them as cold… is this what you’re referring to? Maybe we want an MD_prof_nonexistent? (Just brainstorming.)

A buildbot where this is on by default would achieve the goal here, yes. I had seen “on its own buildbot” and “on by default” as equivalent; your remark about -debug & testing, and this last paragraph, raise a consideration I wasn’t aware of. So, in short, I have no problem doing this on a buildbot - the main goal is to get to a point where something visible turns red when folks forget about profiles.

Okay, that sounds good to me :slight_smile:

Great idea.

FWIW: currently you can use “branch_weights” 0, 0 to mark “unknown” branches, as that already has the effect of there being a marker, but BranchProbabilityAnalysis will ignore those clearly useless weights, getting you the default behavior… Though I also wouldn’t mind having a separate MD_prof_unknown for that, to have a more obvious name.
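In IR terms, that workaround looks something like this (minimal sketch):

```llvm
define void @h(i1 %c) {
entry:
  ; All-zero weights act as an explicit "I don't know" marker: per the above,
  ; BPI ignores them and falls back to its defaults, but the terminator still
  ; visibly carries !prof - which is what the phase-1 verifier would look for.
  br i1 %c, label %left, label %right, !prof !0
left:
  ret void
right:
  ret void
}

!0 = !{!"branch_weights", i32 0, i32 0}
```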