[RFC] Codegen new pass manager pipeline construction design

The new pass manager has been working with the middle-end optimization pipeline for a while now, and it would be a great project to make the backend codegen pipeline also work with the new pass manager so that at some point in the future we can delete the legacy pass manager and have one pass manager to rule them all.

The new pass manager for the codegen pipeline has some implementation in-tree, but it’s been pretty organic growth so far and I’d like to properly design it out with feedback from the community (especially those more familiar with the codegen pipeline) before we get into a state where it’s too hard to change the design. This focuses on how we’ll create the codegen pipeline programmatically in code, which is the main challenge so far.

The “Using the New Pass Manager” page of the LLVM 21.0.0git documentation has some notes on the principles of the new pass manager for the optimization pipeline. One notable principle is the desire to make pass nesting (e.g. a function pass manager nested in a module pass manager) explicit in the code that builds the pipeline, as opposed to the legacy pass manager, which takes a linear list of passes (regardless of whether each is a module/function/etc. pass) and schedules the nesting itself.

There are a couple of things that make the new pass manager design quite a bit trickier to work with the codegen pipeline than the optimization pipeline.

  • There are many passes in the codegen pipeline shared between all the backends, but currently each backend may disable specific shared passes, add custom passes before/after other arbitrary passes, or substitute shared passes with custom passes.
  • Backends may also override parts of the pipeline with their own set of passes (e.g. TargetPassConfig::addISelPrepare()).
  • There are many out-of-tree backends that we want to continue supporting.
    • But out-of-tree backends must still have their code under llvm-project linked in like any existing backend; they cannot exist e.g. as a shared library (as opposed to middle-end plugins).
  • llc {start,stop}-{before,after} allows people to write tests that only run a portion of the codegen pipeline.

Currently in-tree for the codegen new pass manager we have Add[IR,Machine]Pass, which is very similar to the legacy pass manager’s implicit nesting: we add a bunch of passes without worrying about nesting, and there is infrastructure to convert that to the actual nesting structure that’s run. I’d really like to avoid this if possible since I believe it’s valuable to be able to see the nesting in code, as it forces people less familiar with how passes are ordered to think about it. The legacy pass manager’s implicit nesting has made it hard to notice when this nesting is unexpectedly broken, e.g. when a module pass breaks up a run of function passes.
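To make the "module pass breaks up the function passes" problem concrete, here is a minimal toy model of implicit nesting. It is not LLVM's scheduler: PassKind, PassEntry, ScheduledGroup, scheduleImplicitly and countFunctionGroups are all hypothetical names, and passes are reduced to a name plus a kind.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy stand-ins, not LLVM types.
enum class PassKind { Module, Function };

struct PassEntry {
  std::string Name;
  PassKind Kind;
};

// One scheduled unit: a single module pass, or a run of function passes that
// would end up sharing one function pass manager.
struct ScheduledGroup {
  PassKind Kind;
  std::vector<std::string> Names;
};

// Implicit nesting: consecutive function passes are folded into one group;
// every module pass forms its own group and ends the current function group.
std::vector<ScheduledGroup>
scheduleImplicitly(const std::vector<PassEntry> &Linear) {
  std::vector<ScheduledGroup> Groups;
  for (const PassEntry &P : Linear) {
    bool Extend = P.Kind == PassKind::Function && !Groups.empty() &&
                  Groups.back().Kind == PassKind::Function;
    if (!Extend)
      Groups.push_back({P.Kind, {}});
    assert(!Groups.empty() && "a group exists for the pass being added");
    Groups.back().Names.push_back(P.Name);
  }
  return Groups;
}

// Number of separate function pass managers the schedule ends up with.
int countFunctionGroups(const std::vector<ScheduledGroup> &Groups) {
  int N = 0;
  for (const ScheduledGroup &G : Groups)
    if (G.Kind == PassKind::Function)
      ++N;
  return N;
}
```

In this model, inserting one module pass between two function passes silently turns one function group into two, and nothing at the call site that appended the module pass shows that this happened.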

We’ll still have an equivalent of TargetPassConfig::add*Passes() where backends can override a function to customize certain parts of the pipeline.

Allowing individual backends to add passes at certain points of the pipeline is already something the optimization pipeline does via callbacks invoked on a pass manager at hardcoded points in the pipeline. This should be ok with the codegen pipeline as long as there aren’t too many points in the pipeline where this can happen. The legacy pass manager API for this is TargetPassConfig::insertPass. As far as I can tell:

  • PPC does this for 1 pass
  • Hexagon does this for 3 passes
  • AMDGPU does this for 10 passes

This is a manageable number of extra passes.
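A minimal sketch of the callback-at-a-hardcoded-point idea, using toy types rather than real LLVM classes. ToyMFPM, ToyCodegenBuilder, registerPreRegAllocCallback and the pass names are all assumptions made up for illustration; the real set of extension points would have to come out of this discussion.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for a machine function pass manager: an ordered name list.
struct ToyMFPM {
  std::vector<std::string> Passes;
  void addPass(std::string Name) { Passes.push_back(std::move(Name)); }
};

class ToyCodegenBuilder {
public:
  using Callback = std::function<void(ToyMFPM &)>;

  // A backend registers work to run at one fixed, named point in the pipeline.
  void registerPreRegAllocCallback(Callback CB) {
    assert(CB && "callback must be callable");
    PreRegAllocCallbacks.push_back(std::move(CB));
  }

  ToyMFPM build() const {
    ToyMFPM PM;
    PM.addPass("shared-pass-a");
    // Hard-coded extension point: targets may add passes here, and only here.
    for (const Callback &CB : PreRegAllocCallbacks)
      CB(PM);
    PM.addPass("shared-pass-b");
    return PM;
  }

private:
  std::vector<Callback> PreRegAllocCallbacks;
};
```

A target that today calls insertPass for a single pass would register one callback instead; the shared pipeline code stays the only place that decides where extension points exist.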

Some targets disable some set of passes via TargetPassConfig::disablePass. The new pass manager optimization pipeline also conditionally adds/doesn’t add passes based on PipelineTuningOptions, so something similar where individual backends can set options in an options struct should work. As far as I can tell:

  • AMDGPU disables 7 passes
  • NVPTX disables 11 passes
  • WASM disables 11 passes

Many of these are only disabled because they handle things unsupported by the target, which isn’t a big deal. Another chunk of these are due to passes not handling virtual registers when the target doesn’t have physical registers; these can probably be disabled by a flag in the options struct.

The legacy pass manager allows substitution of arbitrary passes with other passes. In practice, the only thing I see getting substituted is PostRAScheduler with PostMachineScheduler, so that can be handled specially with the options struct.
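As a rough illustration of handling both disabling and substitution through an options struct, here is a toy sketch. ToyCodegenOptions, its field names, buildPostRAPasses and the placeholder pass name are hypothetical; only PostRAScheduler and PostMachineScheduler come from the discussion above.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical options struct; field names are illustrative only.
struct ToyCodegenOptions {
  // False for targets that only ever use virtual registers.
  bool TargetHasPhysicalRegisters = true;
  // Selects which of the two post-RA schedulers is added.
  bool UsePostMachineScheduler = false;
};

std::vector<std::string> buildPostRAPasses(const ToyCodegenOptions &Opts) {
  std::vector<std::string> Passes;
  // Skipped entirely when the target never leaves virtual registers.
  if (Opts.TargetHasPhysicalRegisters)
    Passes.push_back("needs-physical-registers-pass");
  // Substitution expressed as a choice, not as a generic replace-pass hook.
  Passes.push_back(Opts.UsePostMachineScheduler ? "PostMachineScheduler"
                                                : "PostRAScheduler");
  assert(!Passes.empty() && "a scheduler is always added");
  return Passes;
}
```

The design choice here is that every deviation from the shared pipeline is a named field that can be searched for, instead of an arbitrary pass ID passed to disablePass or substitutePass.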

llc {start,stop}-{before,after} is very hard to get working given the explicit hierarchical nesting of passes. My proposal is to change the way we do codegen testing, moving away from {start,stop}-{before,after} and more toward testing individual passes (or phases) like we do with the optimization pipeline. {start,stop}-{before,after} doesn’t really work with an explicitly hierarchical pipeline since there’s never a linear set of passes you can work with, either at pipeline construction time or at runtime. Given

$ rg -q --stats '(start|stop)-(before|after)' llvm/test
2453 matches
2168 matched lines
1328 files contained matches

this would require a huge amount of tedious work porting tests, including understanding them to figure out what passes (or phases of the pipeline) they’re trying to test. This is probably the biggest change I’m proposing and I’d definitely like feedback on this part of the proposal. This is a huge amount of work, but I think it would put codegen testing in a better state. There may be some way to automate some of it with a script that prints out the legacy pass manager pipeline that’s actually run in the test and converts it to the new PM -passes= syntax. Perhaps the amount of work is not worth it though…
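For reference, the reason {start,stop}-{before,after} fits a linear pass list so naturally is that it amounts to slicing that list. The following toy model shows the slicing; ToySlice and slicePipeline are hypothetical names, pass names are plain strings, and details of the real llc options (such as instance counts) are deliberately left out.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy model of the four options; an empty string means "not set".
struct ToySlice {
  std::string StartBefore, StartAfter, StopBefore, StopAfter;
};

std::vector<std::string> slicePipeline(const std::vector<std::string> &Linear,
                                       const ToySlice &S) {
  // This model accepts at most one start option and one stop option.
  assert(S.StartBefore.empty() || S.StartAfter.empty());
  assert(S.StopBefore.empty() || S.StopAfter.empty());

  // With no start option, the slice begins at the first pass.
  bool Started = S.StartBefore.empty() && S.StartAfter.empty();
  std::vector<std::string> Out;
  for (const std::string &Name : Linear) {
    if (!S.StartBefore.empty() && Name == S.StartBefore)
      Started = true; // the named pass itself is included
    if (!S.StopBefore.empty() && Name == S.StopBefore)
      break; // the named pass itself is excluded
    if (Started)
      Out.push_back(Name);
    if (!S.StartAfter.empty() && Name == S.StartAfter)
      Started = true; // takes effect from the next pass
    if (!S.StopAfter.empty() && Name == S.StopAfter)
      break; // the named pass was already included above
  }
  return Out;
}
```

With an explicitly nested pipeline there is no single vector like Linear to walk, which is the core of the difficulty described above.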

Another tricky thing is adding module passes (e.g. MachineOutliner, GlobalMerge) in the middle of the codegen pipeline, which is mostly machine function passes. We can have a default codegen pipeline of (ignoring the Function → MachineFunction portion) something like

class TargetPassBuilder {
 virtual MachineFunctionPassManager buildPhase1Pipeline(); // e.g. addPreRegAlloc()
 virtual ModulePassManager buildPhase2Pipeline(); // e.g. if we wanted MachineOutliner on all targets by default
 virtual MachineFunctionPassManager buildPhase3Pipeline();
 virtual MachineFunctionPassManager buildPhase4Pipeline();
 virtual ModulePassManager buildGlobalMergePipeline() { return ModulePassManager(); } // may return empty pass manager if no global merge desired
 virtual MachineFunctionPassManager buildPhase5Pipeline();
 virtual ModulePassManager buildPipeline() {
  ModulePassManager MPM;   
  MPM.addPass(createModuleToMachineFunctionAdaptor(buildPhase1Pipeline()));
  MPM.addPass(buildPhase2Pipeline()); // this just takes all passes from phase2() and appends them to MPM
  MachineFunctionPassManager MFPM;
  MFPM.addPass(buildPhase3Pipeline()); // again, passes in phase3/4 are transferred to MFPM
  MFPM.addPass(buildPhase4Pipeline()); // this is important as to not have two separate machine function pass managers between phase3/4
  ModulePassManager GlobalMergePM = buildGlobalMergePipeline();
  // don't split phase3+4 and phase5 if we don't actually add a module pass
  if (!GlobalMergePM.isEmpty()) {
   // flush phase3+4 into its own adaptor only when a module pass follows
   MPM.addPass(createModuleToMachineFunctionAdaptor(std::move(MFPM)));
   MFPM = MachineFunctionPassManager();
   MPM.addPass(std::move(GlobalMergePM));
  }
  MFPM.addPass(buildPhase5Pipeline());
  MPM.addPass(createModuleToMachineFunctionAdaptor(std::move(MFPM)));
  return MPM;
 }
};

and if a target wants to add its own module pass, it can override buildPipeline(), basically copying TargetPassBuilder::buildPipeline but with more module passes added between phases.
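The "only split the machine function pass manager when a module pass really sits between two phases" logic can be sketched generically with toy types. ToyMFPM, ToyMPM and joinPhases are hypothetical and model pass managers as lists of names; this is an illustration of the idea, not a proposed API.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for a machine function pass manager: ordered pass names.
using ToyMFPM = std::vector<std::string>;

struct ToyMPM {
  // Each entry is a module pass (one name) or an adaptor wrapping a whole
  // machine function pass manager.
  struct Entry {
    bool IsAdaptor;
    std::vector<std::string> Names;
  };
  std::vector<Entry> Entries;

  void addModulePass(std::string Name) {
    Entries.push_back({false, {std::move(Name)}});
  }
  // Empty managers are dropped so that no useless adaptor is created.
  void addAdaptor(ToyMFPM MFPM) {
    if (!MFPM.empty())
      Entries.push_back({true, std::move(MFPM)});
  }
};

// Joins phases into as few adaptors as possible. ModulePassesAfter[I] lists
// the module passes to run after phase I and may be empty.
ToyMPM joinPhases(const std::vector<ToyMFPM> &Phases,
                  const std::vector<std::vector<std::string>> &ModulePassesAfter) {
  assert(ModulePassesAfter.size() == Phases.size());
  ToyMPM MPM;
  ToyMFPM Current;
  for (std::size_t I = 0; I < Phases.size(); ++I) {
    Current.insert(Current.end(), Phases[I].begin(), Phases[I].end());
    if (ModulePassesAfter[I].empty())
      continue; // no module pass here, keep extending the same manager
    MPM.addAdaptor(std::move(Current));
    Current = ToyMFPM(); // reset the moved-from manager to a known empty state
    for (const std::string &Name : ModulePassesAfter[I])
      MPM.addModulePass(Name);
  }
  MPM.addAdaptor(std::move(Current));
  return MPM;
}
```

In this model, a target that wants an extra module pass would only fill in one more slot of ModulePassesAfter, and the number of adaptors follows from that instead of being hand-written in each override.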

In summary, I think we should move away from the current in-tree implicit nesting scheduling, which mirrors the legacy pass manager, to an explicit nesting model like the one used by the new pass manager optimization pipeline. The number of pipeline changes that current in-tree backends make to the shared codegen pipeline seems small enough to do this. The main cleanup if we push forward with this design involves updating tests to specify which passes to run, as opposed to using llc {start,stop}-{before,after}; this affects on the order of thousands of tests. Other ideas to make this process less painful are very welcome.

Feedback greatly appreciated, especially from people who have worked with the codegen pipeline.

CC @paperchalice


I’ve always found the insertPass/disablePass interface extremely confusing. We have 2 different coexisting paradigms for how to build the pass pipeline. I would prefer we consolidate on the explicit hooks to add passes.

This is another forgotten migration project to use misched. We should have deleted PostRAScheduler years ago.

The codegen pipeline does have more or less a linear set of passes. You have much less freedom in reordering than in the IR.

This makes me a little bit nervous, and I think it will make codegen development work more difficult, particularly for newcomers. Unlike the IR where there’s an expectation of modularity, the later you get in codegen (particularly in the regalloc pipeline), the machine passes have more of a spaghetti relationship. For example, the process that should be thought of as “register allocation” is really split across > 6 passes that are somewhat tightly coupled. The properties the MIR is expected to have go through several phase changes, so the ordering of passes is much more significant than in the IR. Given the set of existing bugs we already have, I think this will require a decent amount of general bug fixing work to make feasible. It will also be an additional headache whenever making changes to the structure of the codegen pass pipeline, or when doing any debugging. I routinely run into issues that require running many of the mid-to-late codegen passes. It will be a hassle if I have to remember the exact set of passes between those points to construct a reproducer command.

Further, there are 2 different use cases for -start/stop before/after. The first is running some subset of passes in the pipeline, with a MIR input. I typically use this for running the meta-RA pipeline. This type of use case is more tractable for migrating to an explicit pass list, but is probably the minority of tests.

The second type of use case is starting from the original IR and getting to a specific point in the pipeline, crossing the transition from IR to MIR. I would expect the majority of tests using -stop-after are of this form, and this is pretty fundamental to day-to-day debugging. Losing this sounds very unpleasant. The first step of debugging anything in a MIR pass is -stop-before=the-broken-pass (or unfortunately often, a point several passes prior).


As a compromise, maybe we could construct the IR part of the codegen pipeline like the optimization pipeline, with explicit nesting, and support --{start,stop}-{before,after} only in the machine function pass part of the pipeline, because the machine function pass pipeline is already order-sensitive.

To add to this: in some cases we do not have a way to load the representation reliably, e.g. SelectionDAGs. Also, MIR serialization/deserialization might have some blind spots, so rewriting the tests to test individual passes might require fixing those as well, which might be a separate task on its own.

I noticed that PassManager::addPass has an overload that takes a pass manager and merges two pass managers of the same type. I’m wondering if we could do something like:

class TargetPassBuilder {
protected:
  // Or shadow `PassManager`s and related `createXXXPassAdaptor`s
  class ModulePassManagerWrapper {
  public:
    template<typename PassT>
    std::enable_if_t<isNotModulePassManagerWrapper>
    addPass(PassT &&P) {
      static_assert(/* not pass manager or adaptor */);
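      // PassT may be deduced as a reference type, so strip it before name().
      PassNames.push_back(std::decay_t<PassT>::name());
      // addPass returns void, so it cannot be chained on a temporary.
      ModulePassManager MPM;
      MPM.addPass(std::forward<PassT>(P));
      PassManagers.push_back(std::move(MPM));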
    }
    template<typename PassT>
    std::enable_if_t<isModulePassManagerWrapper>
    addPass(PassT &&P) { /* merge two wrapper */}
    void addFunctionPassManagerWrapper(FunctionPassManagerWrapper &&FPMW) {...}
  private:
    std::vector<StringRef> PassNames;
    std::vector<std::variant<ModulePassManager, FunctionPassManager,
        LoopPassManager, MachineFunctionPassManager>> PassManagers;
  };
  ModulePassManagerWrapper MainWrapper;
public:
  ModulePassManager buildPipeline() {
    ModulePassManagerWrapper MPMW;
    // Do pipeline construction.
    ...
    // Filter passes with MPMW because we have all pass names in order.
    // Construct PassManager from MPMW manually,
    // pass manager will merge pass manager automatically in `addPass`.
  }
};

Users must do:

void TargetPassBuilder::buildSomePart() {
  ModulePassManagerWrapper MPMW;
  MPMW.addPass(Pass1);
  MPMW.addPass(Pass2);
  FunctionPassManagerWrapper FPMW;
  FPMW.addPass(Pass1);
  FPMW.addPass(Pass1);
  MPMW.addFunctionPassManagerWrapper(std::move(FPMW));
  // Same for LoopPassManagerWrapper and MachineFunctionPassManagerWrapper.
  MainWrapper.addPass(std::move(MPMW));
}

Then we can both filter the pipeline and force users to think about pass nesting explicitly.
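A small self-contained model of this wrapper idea, under heavy simplification: ToyFunctionWrapper and ToyModuleWrapper are hypothetical, passes are just names, and the final "pass manager" is rendered as a string in a -passes= like nesting syntax instead of real pass manager objects.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Toy model of the wrapper idea; none of these types exist in LLVM.
struct ToyFunctionWrapper {
  std::vector<std::string> Names;
  void addPass(std::string Name) { Names.push_back(std::move(Name)); }
};

class ToyModuleWrapper {
public:
  void addPass(std::string Name) {
    assert(!Name.empty() && "passes need a name to be filterable");
    Entries.push_back({std::move(Name), false});
  }

  // Function passes can only arrive through a function wrapper, so the
  // nesting is spelled out at the call site.
  void addFunctionWrapper(ToyFunctionWrapper FW) {
    for (std::string &Name : FW.Names)
      Entries.push_back({std::move(Name), true});
  }

  // Drops everything after the named pass, using the recorded flat order.
  void stopAfter(const std::string &Name) {
    for (auto It = Entries.begin(); It != Entries.end(); ++It)
      if (It->Name == Name) {
        Entries.erase(It + 1, Entries.end());
        return;
      }
  }

  // Rebuilds the nested form; adjacent function passes share one group.
  std::string materialize() const {
    std::string Out;
    bool InFunction = false;
    for (const Entry &E : Entries) {
      if (InFunction && !E.IsFunctionPass) {
        Out += ")";
        InFunction = false;
      }
      if (!Out.empty() && Out.back() != '(')
        Out += ",";
      if (!InFunction && E.IsFunctionPass) {
        Out += "function(";
        InFunction = true;
      }
      Out += E.Name;
    }
    if (InFunction)
      Out += ")";
    return Out;
  }

private:
  struct Entry {
    std::string Name;
    bool IsFunctionPass;
  };
  std::vector<Entry> Entries;
};
```

The point of the model is that the flat, named record exists only inside the wrapper, so filtering stays possible while the code that builds the pipeline still has to state the nesting.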


We need a very flexible way to extend the pass pipeline, so we may need to introduce many extension points to emulate the virtual functions in TargetPassConfig. I considered methods like inject{Before,After}<SomePass>(...), but that seems a bit too flexible.


Miscellaneous:
Currently the codegen pipeline uses its own optimization level implementation, which overlaps with "llvm/Passes/OptimizationLevel.h". Should we standardize on using OptimizationLevel?
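To show what the overlap looks like, here is a toy mapping between the two notions of level. ToyCodeGenOptLevel and ToyOptimizationLevel are stand-ins written for this sketch (a four-value codegen level versus a speed level plus a size level), and toCodeGenLevel is a hypothetical helper, not an existing LLVM function.

```cpp
#include <cassert>

// Stand-in for the codegen-specific level.
enum class ToyCodeGenOptLevel { None, Less, Default, Aggressive };

// Stand-in for a level that tracks speed and size separately.
struct ToyOptimizationLevel {
  unsigned SpeedLevel; // 0..3
  unsigned SizeLevel;  // 0..2
};

// Maps the speed component onto the codegen level. The size component has no
// counterpart in the toy codegen enum and is dropped by this mapping.
ToyCodeGenOptLevel toCodeGenLevel(ToyOptimizationLevel L) {
  assert(L.SpeedLevel <= 3 && L.SizeLevel <= 2);
  switch (L.SpeedLevel) {
  case 0:
    return ToyCodeGenOptLevel::None;
  case 1:
    return ToyCodeGenOptLevel::Less;
  case 2:
    return ToyCodeGenOptLevel::Default;
  default:
    return ToyCodeGenOptLevel::Aggressive;
  }
}
```

Standardizing on a single type would remove the need for a conversion like this at the boundary between the optimization and codegen pipelines.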

Ping @aeubanks