[RFC] New Pass Manager: pipeline and infrastructure extensions [LLVM Dev round table follow up]

Hello all,

This is a follow-up to the roundtable discussion we had at this year’s LLVM Dev meeting, regarding two topics: the new pass manager pipeline and extending the new pass manager infrastructure.
Many thanks to all who attended!!

The first reason I proposed the roundtable is the many phase ordering issues that come up, where most of the time the proposed solution is adding another pass to the pipeline. I’d like us to document the current state and find a way to systematically address these.

The second reason is the instances in the pipeline where we rely on workarounds to express pass dependencies, which raised the question of whether the current infrastructure should be extended to address this need.

Details on the two topics:

  1. New pass manager pipeline

The current pass pipeline was defined based on historical data. Its structure is not currently documented, and neither are the decisions, patches, or motivating regressions behind changes to it.

The proposal here is asking for the community’s help in documenting the current state and how pipeline decisions were made. The suggestion is to start with a shared public doc, linking to patches or discussions where available, or sharing the thought process that went into making a decision. The expectation is there will be gaps of knowledge, but having partial information will still be better than the current state.

The shared document should eventually become part of the documentation. A pertinent issue raised here was keeping the documentation up to date.

Some aspects discussed at the roundtable:

  • Canonicalization passes; passes that may de-canonicalize (LoopSimplifyCFG given as example)
  • “Simplification” and “optimization” pipelines that compose the default optimization (as opposed to codegen) pass pipeline.
  • Interleaving with inlining
  • Light weight (cheap, short compile, inserted multiple times) vs heavy weight (expensive, high compile time, use once) passes
  • Codegen pipeline is out of scope

Based on the documentation, the next step will be finding patterns, sequences and pass orderings with good compile time that also lead to good runtime performance. Ideally this exploration leads to discovering missed opportunities.

Finally, the hardest step will be defining a policy or process for making changes to the pass pipeline.

  2. New pass manager infrastructure extensions

The two constructs that seem to address the recurring patterns in the pass pipeline are:

  • Conditional running of a pass.
  • Iterative running of a sequence of passes until a condition is met.

The consensus was that these features would be welcome. I’d like to get broader feedback on whether this is something worth pursuing.

For both constructs, the condition should be able to use the state (Changed) of other passes in the pipeline as input. Additional conditions are to be discussed if this proposal moves forward to implementation design. A potential issue I brought up was debuggability if a partial pipeline iteration count changes based on a dynamic condition.
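To make the two constructs concrete, here is a minimal toy sketch of how they could behave when driven only by the Changed state. Nothing in it is LLVM API: the IR is a plain integer, and `Pass`, `runSequence`, `runIfChanged` and `repeatUntilStable` are hypothetical names invented for illustration. The iteration cap is one assumed way to keep the iteration count bounded, which is related to the debuggability concern above.

```cpp
// Toy model only: none of these types or functions are LLVM's.
#include <cassert>
#include <functional>
#include <vector>

using IR = int;                         // stand-in for a function or module
using Pass = std::function<bool(IR &)>; // returns true if it changed the IR

// Run passes in order and report whether any of them changed the IR.
inline bool runSequence(const std::vector<Pass> &Passes, IR &Unit) {
  bool Changed = false;
  for (const Pass &P : Passes)
    Changed |= P(Unit);
  return Changed;
}

// Construct 1: run Then only if Trigger reported a change.
inline Pass runIfChanged(Pass Trigger, Pass Then) {
  return [=](IR &Unit) {
    if (!Trigger(Unit))
      return false;
    Then(Unit);
    return true;
  };
}

// Construct 2: rerun Body until no pass reports a change, or until the
// (assumed) iteration cap is reached.
inline Pass repeatUntilStable(std::vector<Pass> Body, unsigned MaxIters) {
  return [=](IR &Unit) {
    bool EverChanged = false;
    for (unsigned I = 0; I < MaxIters; ++I) {
      if (!runSequence(Body, Unit))
        break;
      EverChanged = true;
    }
    return EverChanged;
  };
}

// Usage example with two toy passes.
inline void demoConstructs() {
  Pass Halve = [](IR &X) {
    if (X == 0 || X % 2 != 0)
      return false;
    X /= 2;
    return true;
  };
  Pass AddOne = [](IR &X) {
    ++X;
    return true;
  };

  IR A = 40;
  repeatUntilStable({Halve}, 10)(A); // 40 -> 20 -> 10 -> 5, then stable
  assert(A == 5);

  IR B = 7;
  bool Ran = runIfChanged(Halve, AddOne)(B); // Halve is a no-op on odd input
  assert(!Ran && B == 7);
}
```

With a cap of 2 instead of 10, the same input stops at 10 rather than 5, which shows how a dynamic condition changes what the rest of the pipeline sees.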

Examples motivating the two topics (this is not an exhaustive list):

  1. New pass manager pipeline
  2. New pass manager infrastructure extensions: adding conditionals and iteration in the pass pipeline
  • Extra vectorization passes are currently run using a fake analysis (D115052 [Passes] Only run extra vector passes if loops have been vectorized.) to translate the dynamic condition “loop vectorization ran and made a change”. If the vectorization pass made a change, it retrieves the fake analysis. In the pass manager, we define a special FunctionPassManager (ExtraVectorPassManager), whose sole purpose is to check if the fake analysis is cached (i.e. vectorization made a change). If so, it runs the additional passes defined in the main pipeline; otherwise, it is a no-op. An infrastructure for conditionally running passes would address this use case.
  • In D100780 [Passes] Add extra LoopSimplifyCFG run after IndVarSimplify., the goal was to run LoopSimplifyCFGPass if IndVarSimplify made a change. Since this is a more isolated case, defining a fake analysis and a special function pass manager, as for the extra vectorization passes, was too elaborate. The alternative solution suggested in the patch is to use utilities invoked directly from the first pass. This mechanism is often used when functionality is shared between multiple passes (e.g. ConstantFolding).
  • Iteration to a fixed point is done inside some passes (e.g. SimplifyCFG and InstCombine). In other cases a pass is simply run multiple times in the pipeline (example). Iteration available in the pass manager itself would open the door to exploring whether phase ordering issues can be resolved by iterating to a fixed point or until a dynamic condition is met.
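The fake-analysis workaround from the first bullet can be modelled in a few lines. This is a toy sketch and not the code from D115052: `ToyAnalysisCache`, `ExtraPassesMarker`, `toyVectorize` and `ToyExtraPassManager` are hypothetical names, and the IR is a plain integer. It only shows the shape of the trick, where a cached marker stands in for “the earlier pass made a change”.

```cpp
// Toy model only: none of these types or functions are LLVM's.
#include <cassert>
#include <functional>
#include <set>
#include <string>
#include <vector>

using IR = int; // stand-in for a function

// Stand-in for an analysis manager: remembers which analyses were requested.
struct ToyAnalysisCache {
  std::set<std::string> Cached;
  void request(const std::string &Name) { Cached.insert(Name); }
  bool isCached(const std::string &Name) const {
    return Cached.count(Name) != 0;
  }
};

using Pass = std::function<void(IR &, ToyAnalysisCache &)>;

// Hypothetical name of the marker ("fake") analysis.
const char *const ExtraPassesMarker = "extra-passes-marker";

// Stand-in vectorizer: "vectorizes" even values only. When it changes the
// IR, it requests the marker analysis so that the marker ends up cached.
inline void toyVectorize(IR &Unit, ToyAnalysisCache &AM) {
  if (Unit % 2 != 0)
    return;
  Unit *= 4;
  AM.request(ExtraPassesMarker);
}

// Stand-in for the special pass manager: runs its passes only if the marker
// is cached, and is a no-op otherwise.
struct ToyExtraPassManager {
  std::vector<Pass> Extra;
  void run(IR &Unit, ToyAnalysisCache &AM) const {
    if (!AM.isCached(ExtraPassesMarker))
      return;
    for (const Pass &P : Extra)
      P(Unit, AM);
  }
};

// Usage example: the extra pass runs only after a successful "vectorization".
inline void demoMarkerAnalysis() {
  ToyExtraPassManager Extra;
  Extra.Extra.push_back([](IR &Unit, ToyAnalysisCache &) { Unit += 1; });

  IR A = 2;
  ToyAnalysisCache CacheA;
  toyVectorize(A, CacheA);
  Extra.run(A, CacheA);
  assert(A == 9); // vectorized (2 * 4), then the extra pass added 1

  IR B = 3;
  ToyAnalysisCache CacheB;
  toyVectorize(B, CacheB);
  Extra.run(B, CacheB);
  assert(B == 3); // nothing vectorized, so the extra pass was skipped
}
```

The condition here is encoded indirectly in cache state rather than stated in the pipeline, which is the indirection a first-class conditional construct would remove.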

Looking forward to your feedback,


Has there been discussion on how to achieve this? For conditional running of a pass, I suppose you could have a generic IfChangeThen<IRUnitT> pass that takes two passes as arguments. Is that the direction you’re thinking in?
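For illustration, one possible shape of such a wrapper is sketched below. It is a toy under stated assumptions: passes here return a plain `bool` for “changed”, whereas real new pass manager passes return `PreservedAnalyses` and take an analysis manager, so this is not a drop-in LLVM pass. `ToyUnit`, `ClampNegative` and `CountRuns` are hypothetical helpers.

```cpp
// Toy sketch only: not an LLVM pass; names other than IfChangeThen are invented.
#include <cassert>
#include <utility>

// Runs Second only if First reported a change on the unit.
template <typename IRUnitT, typename FirstT, typename SecondT>
class IfChangeThen {
  FirstT First;
  SecondT Second;

public:
  IfChangeThen(FirstT F, SecondT S)
      : First(std::move(F)), Second(std::move(S)) {}

  // Returns true if the unit was changed by this composite pass.
  bool run(IRUnitT &Unit) {
    if (!First.run(Unit))
      return false;
    Second.run(Unit);
    return true;
  }
};

// Hypothetical IR unit and passes used to exercise the wrapper.
struct ToyUnit {
  int Value = 0;
  int SecondRuns = 0;
};

struct ClampNegative {
  bool run(ToyUnit &U) {
    if (U.Value >= 0)
      return false;
    U.Value = 0;
    return true;
  }
};

struct CountRuns {
  bool run(ToyUnit &U) {
    ++U.SecondRuns;
    return false;
  }
};

// Usage example.
inline void demoIfChangeThen() {
  IfChangeThen<ToyUnit, ClampNegative, CountRuns> P{ClampNegative{},
                                                    CountRuns{}};
  ToyUnit A;
  A.Value = -3;
  assert(P.run(A) && A.Value == 0 && A.SecondRuns == 1);

  ToyUnit B;
  B.Value = 5;
  assert(!P.run(B) && B.SecondRuns == 0);
}
```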

A more general approach would be that passes could return something.

From primitive results: I made no changes or I barely changed the IR.

The LV reported: it was a great day and I could build an almost perfect outer loop vectorisation.

We haven’t dived into the how yet. The first milestone is figuring out how important having this feature is.

Is there enough need to invest into building a more complex mechanism (like your suggestion, or also add a predicate argument and call it IfPredicateThen, or something else entirely), or do we continue with work-arounds (either through analyses, or internal to each pass and not visible to the pass manager), or do we expose something as simple as “Changed” as the pass return and nothing else?
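To compare the options, the predicate variant could look roughly like the toy sketch below, where a pass returns a coarse result instead of a plain Changed bit. All names (`ChangeLevel`, `ifPredicateThen`, the demo passes) are hypothetical and nothing here is LLVM API; the three-level result is just one assumed example of “passes could return something”.

```cpp
// Toy model only: none of these types or functions are LLVM's.
#include <cassert>
#include <functional>

using IR = int; // stand-in for a function

// Assumed richer pass result: how much the pass changed the IR.
enum class ChangeLevel { None, Minor, Major };

using Pass = std::function<ChangeLevel(IR &)>;
using Predicate = std::function<bool(ChangeLevel)>;

// Runs Then only if Cond accepts the result reported by First.
// Reports the larger of the two change levels.
inline Pass ifPredicateThen(Pass First, Predicate Cond, Pass Then) {
  return [=](IR &Unit) -> ChangeLevel {
    ChangeLevel Result = First(Unit);
    if (!Cond(Result))
      return Result;
    ChangeLevel ThenResult = Then(Unit);
    return ThenResult > Result ? ThenResult : Result;
  };
}

// Usage example: a cleanup runs only after a "major" change.
inline void demoIfPredicateThen() {
  // Rounds down to a multiple of 10; dropping 5 or more counts as major.
  Pass RoundDown = [](IR &X) -> ChangeLevel {
    int Delta = X % 10;
    X -= Delta;
    if (Delta == 0)
      return ChangeLevel::None;
    return Delta >= 5 ? ChangeLevel::Major : ChangeLevel::Minor;
  };
  Pass AddHundred = [](IR &X) -> ChangeLevel {
    X += 100;
    return ChangeLevel::Minor;
  };
  Predicate OnlyMajor = [](ChangeLevel L) { return L == ChangeLevel::Major; };

  Pass P = ifPredicateThen(RoundDown, OnlyMajor, AddHundred);

  IR A = 17;
  assert(P(A) == ChangeLevel::Major && A == 110); // 17 -> 10 -> 110

  IR B = 12;
  assert(P(B) == ChangeLevel::Minor && B == 10); // cleanup skipped
}
```

Passing `[](ChangeLevel L) { return L != ChangeLevel::None; }` as the predicate recovers the simpler “run if Changed” behaviour, so the plain Changed option is a special case of this one.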

If the consensus is that the generality is desired or needed, then we can go in depth into implementation options. But that is beyond the scope of the current RFC.

+1 on both documenting the current pipeline and the new features for the pass manager.