Status of the New Pass Manager

Hi all,

When writing new passes we currently have two pass manager interfaces to consider. In many cases we resort to implementing both interfaces because the NPM is not enabled by default while we want our pass to be enabled by default.
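
For readers unfamiliar with what "implementing both interfaces" tends to involve, here is a minimal sketch of the usual pattern: the transformation lives in one shared function, and each pass manager gets a thin wrapper around it. Every type below (`ToyFunction`, `ToyPreserved`, `LegacyStylePass`, `NewStylePass`) is a stand-in invented for illustration so the snippet compiles without LLVM. None of them are LLVM's real classes, although the wrapper shapes are loosely modelled on a legacy-style `runOnFunction` returning a bool and a new-PM-style `run` returning a "what is preserved" value.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Stand-in for an IR function (hypothetical, not llvm::Function):
// a name plus a flat list of instruction mnemonics.
struct ToyFunction {
  std::string Name;
  std::vector<std::string> Insts;
};

// Shared implementation used by both wrappers: drops every "nop".
// Returns true if the function was modified.
bool removeNops(ToyFunction &F) {
  std::size_t Before = F.Insts.size();
  std::vector<std::string> Kept;
  for (const std::string &I : F.Insts)
    if (I != "nop")
      Kept.push_back(I);
  F.Insts = Kept;
  return F.Insts.size() != Before;
}

// Legacy-style wrapper (hypothetical): reports "changed" as a bool.
struct LegacyStylePass {
  bool runOnFunction(ToyFunction &F) { return removeNops(F); }
};

// Stand-in for a "preserved analyses" result (hypothetical):
// All = nothing changed, None = assume everything was invalidated.
enum class ToyPreserved { All, None };

// New-style wrapper (hypothetical): reports what is still valid.
struct NewStylePass {
  ToyPreserved run(ToyFunction &F) {
    return removeNops(F) ? ToyPreserved::None : ToyPreserved::All;
  }
};
```

Keeping the logic in `removeNops` means the two wrappers cannot drift apart, and deleting one of them later is a small, mechanical change.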

It is my understanding that the new pass manager interface is going to replace the old one soon. However, it is not clear to me whether anyone is actively working on the switch and, if so, what the timeline is. Does anyone have more information about that?

I was also wondering whether any of the buildbots test with the NPM pipeline.

Regards,

Bardia Mahjour
Compiler Optimizations
IBM Toronto Software Lab
bmahjour@ca.ibm.com (905) 413-2336

Here at Fuchsia we would eventually like to switch to the new PM by default. We have slowly been porting relevant passes from the legacy PM. AFAIK there hasn’t been a dedicated timeline, but one of our goals is to get an upstream new PM buildbot running.

We were going to make a bot once all tests pass when the new PM is enabled by default. I’ve been addressing these tests over the past couple of weeks. Currently there are only 4 tests that fail under new PM and are being addressed in https://reviews.llvm.org/D63174. Ideally when this is committed (and assuming no other tests fail), we can get an upstream bot.

- Leonard

For our downstream usage, we’ve switched entirely to the new pass manager. We made the switch a couple of months ago. All of our testing is being done with the NPM, and we’re about to start deleting (downstream) code which was only needed by the legacy pass manager.

I believe several other major contributors are in the same state. We really need to get upstream switched over so that all of the community’s testing efforts are aligned again.

Philip

I hadn't realised it was so close to being ready. Do you see this as a
switch that could be made before 9.0, or after it?

Best,

Alex

The Android platform build (AOSP) has also switched to the new pass manager recently. We do have a few bugs that we are chasing (hence opt-outs), but it is working quite well otherwise.

Our current list of issues:

  1. Libsqlite still has a mysterious failure that we haven’t been able to reduce well.
  2. https://bugs.llvm.org/show_bug.cgi?id=42124 shows that inlining costs are a bit different under NPM. https://reviews.llvm.org/D63034 is one proposed patch for addressing this.
  3. libpdfium exposed a non-determinism issue with NPM where having the linux-libc-dev system package installed changes execution. We are still looking at why this happens.
  4. Sanitizer coverage information isn’t supported by the NPM yet (https://reviews.llvm.org/D62888).

Thanks,
Steve

FWIW, flags like -print-after and -print-after-all didn't work well
with the new pass manager the last time I checked.

They don't, but this isn't considered a blocker to removing the old
one as far as I know.

-eric

-print-after-all is very useful for debugging and learning about LLVM. I would hope that would be implemented for the new PM before removing the old PM. I'd personally consider it a blocker.

-Troy

Printing was implemented in r342896.
@Hiroshi: Are there specific issues or limitations you encountered with it?

Cheers,
Philip

I don't exactly remember when I last tried it and I didn't realize
there was r342896. I'll check it out. Thanks.

Update: Just landed the sancov port in rL365838. With regard to testing, there are currently 5 failing unit tests with the new PM enabled. Once we land fixes for those, we can switch unit tests to run with the new PM by default.

FWIW, print-*-all should work under NPM just fine, and the only problem with print-* is the absence of uniform pass name processing for cl::opt.
It is easy to introduce yet another option that takes NPM pass names (and that’s what we actually did downstream).
Any suggestions on how to resolve this nuisance are welcome.

regards,
Fedor.

On Thu, Jul 11, 2019 at 18:53, Hiroshi Yamauchi via llvm-dev <llvm-dev@lists.llvm.org> wrote:

I had a chance to try -print-after-all with NPM.

It seems like there’s still no output for the passes before objc-arc-contract (which is basically what I saw before.) Does anyone else see this?

Are we talking about the same thing?

*** IR Dump After ObjC ARC contraction ***
*** IR Dump After Pre-ISel Intrinsic Lowering ***
*** IR Dump After Expand Atomic instructions ***
*** IR Dump After Canonicalize natural loops ***
*** IR Dump After Loop Strength Reduction ***
*** IR Dump After Merge contiguous icmps into a memcmp ***
*** IR Dump After Expand memcmp() to load/stores ***
*** IR Dump After Lower Garbage Collection Instructions ***
*** IR Dump After Shadow Stack GC Lowering ***
*** IR Dump After Remove unreachable blocks from the CFG ***
*** IR Dump After Constant Hoisting ***
*** IR Dump After Partially inline calls to library functions ***
*** IR Dump After Instrument function entry/exit with calls to e.g. mcount() (post inlining) ***
*** IR Dump After Scalarize Masked Memory Intrinsics ***
*** IR Dump After Expand reduction intrinsics ***
*** IR Dump After Interleaved Access Pass ***
*** IR Dump After Expand indirectbr instructions ***
*** IR Dump After CodeGen Prepare ***

*** IR Dump After Rewrite Symbols ***
*** IR Dump After Exception handling preparation ***
*** IR Dump After Safe Stack instrumentation pass ***


(dump from the machine passes)

Apparently both yes and no. How do you run this? IR printing is implemented through pass instrumentation and is enabled in, e.g., opt through the use of StandardInstrumentations in llvm::runPassPipeline. If you set up the new PM yourself without using llvm::runPassPipeline, then most likely you just do not have StandardInstrumentations installed.

regards,
Fedor.
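
To make the point about instrumentation concrete, here is a toy model of the mechanism described above: the pass manager only notifies whatever callbacks were handed to it, so a pipeline built without them runs the same passes but prints nothing. All names here (`ToyInstrumentation`, `ToyPassManager`, `registerPrinter`) are hypothetical and written for this sketch. They are not LLVM's API, and the real classes differ in detail.

```cpp
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for a set of instrumentation callbacks (hypothetical).
struct ToyInstrumentation {
  std::vector<std::function<void(const std::string &)>> AfterPass;

  // Invoke every registered "after pass" callback with the pass name.
  void runAfterPass(const std::string &PassName) const {
    for (const auto &CB : AfterPass)
      CB(PassName);
  }
};

// Toy pass manager (hypothetical): runs named passes in order and
// notifies the instrumentation only if one was supplied.
class ToyPassManager {
public:
  explicit ToyPassManager(const ToyInstrumentation *PI = nullptr) : PI(PI) {}

  void addPass(std::string Name) { Passes.push_back(std::move(Name)); }

  void run() const {
    for (const std::string &P : Passes) {
      // A real pass would transform the IR here.
      if (PI)
        PI->runAfterPass(P);
    }
  }

private:
  const ToyInstrumentation *PI;
  std::vector<std::string> Passes;
};

// Registers a printing callback that appends a banner line to Out,
// playing the role that installing standard instrumentation plays.
void registerPrinter(ToyInstrumentation &PI, std::vector<std::string> &Out) {
  PI.AfterPass.push_back([&Out](const std::string &Name) {
    Out.push_back("*** IR Dump After " + Name + " ***");
  });
}
```

In this model, a `ToyPassManager` constructed without a `ToyInstrumentation` pointer behaves like a hand-built pipeline that never installed the printing callbacks: the passes still run, but no banners are produced.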

I reread your mail and output once more and, honestly, I do not understand what is happening there. Can you share the exact setup where you get this?

regards,
Fedor.

I basically run “clang -fexperimental-new-pass-manager -print-after-all …”

It’s conceivable that something is different in our setup or in clang (from opt)… I’ll see if I can reproduce it outside our setup.

Thanks.

Does it depend on machine architecture? I generally use x86…

regards,
Fedor.

I basically run “clang -fexperimental-new-pass-manager -print-after-all …”

It actually needs “-mllvm” in front of “-print-after-all”.

Does it depend on machine architecture?
I generally use x86…

I’m not sure, but this is on x86.

Taewook contributed this patch: https://reviews.llvm.org/D65975. It works for me. Thanks!

*** IR Dump After ForceFunctionAttrsPass ***
*** IR Dump After EntryExitInstrumenterPass ***
*** IR Dump After AddDiscriminatorsPass ***
*** IR Dump After InferFunctionAttrsPass ***
*** IR Dump After SimplifyCFGPass ***
*** IR Dump After SROA ***
*** IR Dump After EarlyCSEPass ***
*** IR Dump After LowerExpectIntrinsicPass ***
*** IR Dump After CallSiteSplittingPass ***
*** IR Dump After IPSCCPPass ***
*** IR Dump After CalledValuePropagationPass ***
*** IR Dump After GlobalOptPass ***
*** IR Dump After PromotePass ***
*** IR Dump After DeadArgumentEliminationPass ***
*** IR Dump After InstCombinePass ***
*** IR Dump After SimplifyCFGPass ***
*** IR Dump After RequireAnalysisPass<llvm::GlobalsAA, llvm::Module, llvm::AnalysisManager<llvm::Module> > ***
*** IR Dump After RequireAnalysisPass<llvm::ProfileSummaryAnalysis, llvm::Module, llvm::AnalysisManager<llvm::Module> > ***
*** IR Dump After InlinerPass *** (scc: (_Z3fooi))
*** IR Dump After PostOrderFunctionAttrsPass *** (scc: (_Z3fooi))
*** IR Dump After ArgumentPromotionPass *** (scc: (_Z3fooi))
*** IR Dump After SROA ***
*** IR Dump After EarlyCSEPass ***
*** IR Dump After SpeculativeExecutionPass ***
*** IR Dump After JumpThreadingPass ***
*** IR Dump After CorrelatedValuePropagationPass ***
*** IR Dump After SimplifyCFGPass ***
*** IR Dump After AggressiveInstCombinePass ***
*** IR Dump After InstCombinePass ***
*** IR Dump After LibCallsShrinkWrapPass ***
*** IR Dump After TailCallElimPass ***
*** IR Dump After SimplifyCFGPass ***
*** IR Dump After ReassociatePass ***
*** IR Dump After RequireAnalysisPass<llvm::OptimizationRemarkEmitterAnalysis, llvm::Function, llvm::AnalysisManager<llvm::Function> > ***
*** IR Dump After LoopSimplifyPass ***
*** IR Dump After LCSSAPass ***
*** IR Dump After SimplifyCFGPass ***
*** IR Dump After InstCombinePass ***
*** IR Dump After LoopSimplifyPass ***
*** IR Dump After LCSSAPass ***
*** IR Dump After MergedLoadStoreMotionPass ***
*** IR Dump After GVN ***
*** IR Dump After MemCpyOptPass ***
*** IR Dump After SCCPPass ***
*** IR Dump After BDCEPass ***
*** IR Dump After InstCombinePass ***
*** IR Dump After JumpThreadingPass ***
*** IR Dump After CorrelatedValuePropagationPass ***
*** IR Dump After DSEPass ***
*** IR Dump After LoopSimplifyPass ***
*** IR Dump After LCSSAPass ***
*** IR Dump After ADCEPass ***
*** IR Dump After SimplifyCFGPass ***
*** IR Dump After InstCombinePass ***
*** IR Dump After DevirtSCCRepeatedPass<llvm::PassManager<LazyCallGraph::SCC, llvm::CGSCCAnalysisManager, llvm::LazyCallGraph &, llvm::CGSCCUpdateResult &> > *** (scc: (_Z3fooi))
*** IR Dump After GlobalOptPass ***
*** IR Dump After GlobalDCEPass ***
*** IR Dump After EliminateAvailableExternallyPass ***
*** IR Dump After ReversePostOrderFunctionAttrsPass ***
*** IR Dump After RequireAnalysisPass<llvm::GlobalsAA, llvm::Module, llvm::AnalysisManager<llvm::Module> > ***
*** IR Dump After Float2IntPass ***
*** IR Dump After LoopSimplifyPass ***
*** IR Dump After LCSSAPass ***
*** IR Dump After LoopDistributePass ***
*** IR Dump After LoopVectorizePass ***
*** IR Dump After LoopLoadEliminationPass ***
*** IR Dump After InstCombinePass ***
*** IR Dump After SimplifyCFGPass ***
*** IR Dump After SLPVectorizerPass ***
*** IR Dump After InstCombinePass ***
*** IR Dump After LoopUnrollPass ***
*** IR Dump After WarnMissedTransformationsPass ***
*** IR Dump After InstCombinePass ***

*** IR Dump After RequireAnalysisPass<llvm::OptimizationRemarkEmitterAnalysis, llvm::Function, llvm::AnalysisManager<llvm::Function> > ***
*** IR Dump After LoopSimplifyPass ***
*** IR Dump After LCSSAPass ***
*** IR Dump After AlignmentFromAssumptionsPass ***
*** IR Dump After LoopSinkPass ***
*** IR Dump After InstSimplifyPass ***
*** IR Dump After DivRemPairsPass ***
*** IR Dump After SimplifyCFGPass ***
*** IR Dump After SpeculateAroundPHIsPass ***
*** IR Dump After CGProfilePass ***
*** IR Dump After GlobalDCEPass ***
*** IR Dump After ConstantMergePass ***
*** IR Dump After ObjC ARC contraction ***
*** IR Dump After Pre-ISel Intrinsic Lowering ***
*** IR Dump After Expand Atomic instructions ***
*** IR Dump After Module Verifier ***
*** IR Dump After Canonicalize natural loops ***
*** IR Dump After Merge contiguous icmps into a memcmp ***
*** IR Dump After Expand memcmp() to load/stores ***
*** IR Dump After Lower Garbage Collection Instructions ***
*** IR Dump After Shadow Stack GC Lowering ***
*** IR Dump After Remove unreachable blocks from the CFG ***
*** IR Dump After Constant Hoisting ***
*** IR Dump After Partially inline calls to library functions ***
*** IR Dump After Instrument function entry/exit with calls to e.g. mcount() (post inlining) ***
*** IR Dump After Scalarize Masked Memory Intrinsics ***
*** IR Dump After Expand reduction intrinsics ***
*** IR Dump After Interleaved Access Pass ***
*** IR Dump After Expand indirectbr instructions ***
*** IR Dump After CodeGen Prepare ***
*** IR Dump After Rewrite Symbols ***
*** IR Dump After Exception handling preparation ***
*** IR Dump After Safe Stack instrumentation pass ***
*** IR Dump After Module Verifier ***

*** IR Dump After Finalize ISel and expand pseudo-instructions ***:

*** IR Dump After X86 Domain Reassignment Pass ***:

*** IR Dump After Early Tail Duplication ***:

*** IR Dump After Optimize machine instruction PHIs ***:

*** IR Dump After Slot index numbering ***:

*** IR Dump After Merge disjoint stack slots ***:

*** IR Dump After Local Stack Slot Allocation ***:

*** IR Dump After Remove dead machine instructions ***:

*** IR Dump After Early If-Conversion ***:

*** IR Dump After Machine InstCombiner ***:

*** IR Dump After X86 cmov Conversion ***:

*** IR Dump After Early Machine Loop Invariant Code Motion ***:

*** IR Dump After Machine Common Subexpression Elimination ***:

*** IR Dump After Machine code sinking ***:

*** IR Dump After Peephole Optimizations ***:

*** IR Dump After Remove dead machine instructions ***:

*** IR Dump After Live Range Shrink ***:

*** IR Dump After X86 Optimize Call Frame ***:

*** IR Dump After X86 Avoid Store Forwarding Blocks ***:

*** IR Dump After X86 speculative load hardening ***:

*** IR Dump After X86 EFLAGS copy lowering ***:

*** IR Dump After Detect Dead Lanes ***:

*** IR Dump After Process Implicit Definitions ***:

*** IR Dump After Remove unreachable machine basic blocks ***:

*** IR Dump After Live Variable Analysis ***:

*** IR Dump After Eliminate PHI nodes for register allocation ***:

*** IR Dump After Two-Address instruction pass ***:

*** IR Dump After Slot index numbering ***:

*** IR Dump After Live Interval Analysis ***:

*** IR Dump After Simple Register Coalescing ***:

*** IR Dump After Rename Disconnected Subregister Components ***:

*** IR Dump After Machine Instruction Scheduler ***:

*** IR Dump After Debug Variable Analysis ***:

*** IR Dump After Live Stack Slot Analysis ***:

*** IR Dump After Virtual Register Map ***:

*** IR Dump After Live Register Matrix ***:

*** IR Dump After Greedy Register Allocator ***:

*** IR Dump After Virtual Register Rewriter ***:

*** IR Dump After Stack Slot Coloring ***:

*** IR Dump After Machine Copy Propagation Pass ***:

*** IR Dump After Machine Loop Invariant Code Motion ***:

*** IR Dump After X86 FP Stackifier ***:

*** IR Dump After PostRA Machine Sink ***:

*** IR Dump After Shrink Wrapping analysis ***:

*** IR Dump After Prologue/Epilogue Insertion & Frame Finalization ***:

*** IR Dump After Control Flow Optimizer ***:

*** IR Dump After Tail Duplication ***:

*** IR Dump After Machine Copy Propagation Pass ***:

*** IR Dump After Post-RA pseudo instruction expansion pass ***:

*** IR Dump After X86 pseudo instruction expansion pass ***:

*** IR Dump After Post RA top-down list latency scheduler ***:

*** IR Dump After Analyze Machine Code For Garbage Collection ***:

*** IR Dump After Branch Probability Basic Block Placement ***:

*** IR Dump After X86 Execution Dependency Fix ***:

*** IR Dump After BreakFalseDeps ***:

*** IR Dump After X86 Byte/Word Instruction Fixup ***:

*** IR Dump After X86 LEA Fixup ***:

*** IR Dump After Compressing EVEX instrs to VEX encoding when possible ***:

*** IR Dump After Contiguously Lay Out Funclets ***:

*** IR Dump After StackMap Liveness Analysis ***:

*** IR Dump After Live DEBUG_VALUE analysis ***:

*** IR Dump After Insert fentry calls ***:

*** IR Dump After Insert XRay ops ***:

*** IR Dump After Implement the ‘patchable-function’ attribute ***:

*** IR Dump After Check CFA info and insert CFI instructions if needed ***:

Yes, this is a proper fix for the problem. Apparently clang was not setting up PassInstrumentation.
As soon as this patch lands, -print-before/after-all should work fully in clang.
And -time-passes will need a bit of polish (to avoid duplicated reports).

Thanks for bringing this issue up!

Fedor.