[RFC] Removing bugpoint (as part of new pass manager migration)

As brought up a while back in "[RFC] Deprecating the legacy pass manager for the optimization pipeline", we’re currently trying to remove usage of the legacy pass manager for the optimization pipeline.

Bugpoint is built around hacking around an instance of a legacy pass manager, so as part of the legacy pass manager deprecation it needs to be removed or reworked. I doubt anybody is going to rework it, so I’d like to remove it.

Are there use cases that people use bugpoint for that other tools don’t cover?
llvm/utils/reduce_pipeline.py reduces a pipeline using the new pass manager given a reproducer script.
llvm-reduce reduces IR given a reproducer script. There’s also a mode for reducing MIR that I haven’t tried.
OptBisect is useful for finding miscompiles.
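
As a rough illustration of how OptBisect is typically driven, here is a minimal sketch of a binary search over `-opt-bisect-limit`. The helper names (`first_bad_limit`, `opt_command`) and the `is_bad` callback are hypothetical; the sketch also assumes the failure is monotonic in the limit (once the output goes bad, it stays bad for every larger limit), which is not guaranteed for every miscompile.

```python
from typing import Callable, List


def first_bad_limit(max_limit: int, is_bad: Callable[[int], bool]) -> int:
    """Return the smallest limit in [1, max_limit] for which is_bad(limit) holds.

    Assumption: limit 0 is good, max_limit is bad, and badness is monotonic.
    """
    lo, hi = 0, max_limit  # invariant: lo is known good, hi is known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(mid):
            hi = mid
        else:
            lo = mid
    return hi


def opt_command(opt: str, passes: str, ir_path: str, limit: int) -> List[str]:
    """Build an opt invocation that stops applying passes after `limit`."""
    return [opt, f"-passes={passes}", f"-opt-bisect-limit={limit}",
            "-S", ir_path, "-o", "-"]


if __name__ == "__main__":
    # Toy stand-in for "compile with this limit, run the test, check the result".
    print(first_bad_limit(100, lambda limit: limit >= 37))
```

In a real setup, `is_bad` would run the command from `opt_command`, build and run the test program, and compare its behaviour against a known-good result.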

I’ve been using all of the above to debug various issues and they’ve worked well for me.

I plan on updating the "How to submit an LLVM bug report" page (LLVM 15.0.0git documentation) to remove references to bugpoint and mention the alternatives, if this is OK.

I have used bugpoint multiple times in the past few years for complex miscompiles downstream. None of those alternatives matches the rich functionality bugpoint offers; I want to be able to bisect the pipeline and the IR all automatically. So I’d be strongly against its removal if there is no alternative that can match it; it’d make my life significantly harder when I have to debug miscompiles of complex C++ code yet again.


I am generally very happy with llvm-reduce when it comes to single-pass reductions. Its reduction algorithm tends to be much faster than bugpoint’s, and it produces better reductions.

However, I think our end-to-end reduction experience with reduce_pipeline.py isn’t great right now. The pipeline reduction itself isn’t very reliable and usually requires some manual pipeline cleanup afterwards. And then you need to manually take the pipeline reduction output and feed it into llvm-reduce to perform the IR reduction.

I think it would be worthwhile to invest some additional effort into improving reduce_pipeline.py, as well as having some kind of end-to-end reduction script that has the same ease of use that bugpoint -O3 test.ll used to have.

All that being said, I haven’t actually used bugpoint in a long time now, so dropping it wouldn’t affect me. I just don’t think our alternatives are quite as good for end-to-end reduction, where usability is concerned.

I’m still using both bugpoint and llvm-reduce. I typically run a new failure through both. llvm-reduce is still missing some reduction techniques and produces somewhat different-looking results.

I use both and sometimes need to run them alternately to get a nice result. That perhaps reinforces other folks’ points that llvm-reduce lacks some reduction techniques.
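
The "run both alternately" workflow can be sketched as a loop that applies each reducer in turn until neither makes progress. This is only an illustration: `alternate_until_fixpoint` is a hypothetical helper, size is measured in lines, and the two toy reducers stand in for real invocations of llvm-reduce and bugpoint.

```python
from typing import Callable, Sequence

Reducer = Callable[[str], str]


def line_count(text: str) -> int:
    return len(text.splitlines())


def alternate_until_fixpoint(ir: str, reducers: Sequence[Reducer],
                             max_rounds: int = 10) -> str:
    """Apply each reducer in turn; stop once a full round shrinks nothing."""
    for _ in range(max_rounds):
        before = line_count(ir)
        for reducer in reducers:
            candidate = reducer(ir)
            # Only accept a candidate that is strictly smaller.
            if line_count(candidate) < line_count(ir):
                ir = candidate
        if line_count(ir) == before:
            break
    return ir


# Toy reducers, used here only so the sketch runs without any LLVM tools.
def drop_comments(text: str) -> str:
    return "\n".join(l for l in text.splitlines() if not l.startswith(";"))


def drop_blank(text: str) -> str:
    return "\n".join(l for l in text.splitlines() if l.strip())


if __name__ == "__main__":
    print(alternate_until_fixpoint("; note\n\nret void", [drop_comments, drop_blank]))
```

A real reducer would write the IR to a temporary file, run the tool with the same interestingness check, and read the result back.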

The combo of llvm-reduce and reduce_pipeline.py should do that. Is it just the convenience of putting it all together, or is there something more fundamental missing? I can come up with a script that wraps the two.
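
One piece such a wrapper would need is the hand-off from the reduced pipeline to llvm-reduce. Below is a minimal sketch that generates an interestingness script embedding the reduced pipeline. The function name `make_interestingness_script` is hypothetical, and the sketch assumes a crash reproducer that can be recognized by a fixed message on opt’s output, with llvm-reduce passing the candidate file as the script’s first argument.

```python
import shlex


def make_interestingness_script(opt: str, passes: str, needle: str) -> str:
    """Return a POSIX shell script that exits 0 while the failure reproduces."""
    cmd = " ".join([
        shlex.quote(opt),
        shlex.quote(f"-passes={passes}"),
        "-disable-output",
        '"$1"',  # candidate IR file supplied by the reducer (assumption)
    ])
    return (
        "#!/bin/sh\n"
        "# Exit 0 (interesting) only if opt still prints the expected message.\n"
        f"{cmd} 2>&1 | grep -q -F -- {shlex.quote(needle)}\n"
    )


if __name__ == "__main__":
    print(make_interestingness_script("opt", "function(dse)", "Assertion failed"))
```

The generated text would be written to an executable file and used along the lines of `llvm-reduce --test=./interesting.sh input.ll -o reduced.ll`.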

Any specific examples where reduce_pipeline.py is lacking? I know I’ve seen some minor issues like -passes=function<eager-inv>(dse) rather than -passes=function(dse) or just -passes=dse.
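
For the wrapper-noise case mentioned above, a post-processing step could propose simpler spellings of the pipeline and re-test each one. The sketch below is an assumption about how that might look, not something reduce_pipeline.py does today; `simplification_candidates` is a hypothetical helper, and every candidate still has to be re-run against the reproducer, since dropping `<eager-inv>` or a wrapper can change behaviour.

```python
import re
from typing import List


def _is_balanced(text: str) -> bool:
    """True if parentheses in text are balanced and never close early."""
    depth = 0
    for ch in text:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                return False
    return depth == 0


def simplification_candidates(pipeline: str) -> List[str]:
    """Suggest simpler pipeline strings; each must be re-verified by the caller."""
    candidates = []
    stripped = pipeline.replace("<eager-inv>", "")
    if stripped != pipeline:
        candidates.append(stripped)
    # Unwrap a single outer function(...) adaptor covering the whole pipeline.
    match = re.fullmatch(r"function\((.*)\)", stripped)
    if match and _is_balanced(match.group(1)):
        candidates.append(match.group(1))
    return candidates


if __name__ == "__main__":
    print(simplification_candidates("function<eager-inv>(dse)"))
```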

Any examples/suggestions? I’m happy to add some more things to llvm-reduce.

Does that also bisect which function is being miscompiled and even do basic block bisecting/extraction (never got that to work in bugpoint, but in theory it does that)?

One issue I had in mind is discussed in D117184 ("Fix failing assertion in SimplifyCFG.cpp as icmp gep fold to constant expression") and the comments that follow. But yeah, unnecessary wrapper passes being retained is another one (devirt being common as well).

I can’t tell you specifically what is missing, but llvm-reduce output tends to leave behind more control flow that I’m able to delete later.

Another small complaint I have about llvm-reduce is that its progress output is worse than bugpoint’s. Bugpoint prints a status including the number of instructions left to reduce. By comparison, llvm-reduce prints a lot of noise about invalid IR it produced during the reduction process. I also think it would be better if it tried a little harder to avoid the invalid cases in the first place. Most of these are simple-to-avoid violations.

For a real-world example, today I ran into a case where bugpoint managed to take a testcase further. Running the original reproducer through llvm-reduce gave me a 565-line IR output. Feeding that back into bugpoint brought it down to 324 lines. I’d file a bug, but the reproducer requires a specific build of clang and I’m not sure it’s easily sharable.

Filed "Example where bugpoint does a better job than llvm-reduce" (Issue #54882, llvm/llvm-project on GitHub), though it may be of limited use without the reproducer.

With both bugpoint and llvm-reduce, I’ve found that running -instsimplify afterwards helps too. We could probably use a reducer pass that just calls SimplifyInstruction.

Bugpoint is built around hacking around an instance of a legacy pass manager, so as part of the legacy pass manager deprecation it needs to be removed or reworked. I doubt anybody is going to rework it, so I’d like to remove it.

This seems built on a premise that it would be hard to rework/refactor bugpoint to build on the new pass manager. Is this true? Can you explain why?


Having used (and extended) bugpoint many times in the past, I would also like to understand what’s preventing us from porting bugpoint to the new pass manager.

// Hack to capture a pass list.
namespace {
class AddToDriver : public legacy::FunctionPassManager {
  BugDriver &D;

public:
  AddToDriver(BugDriver &_D) : FunctionPassManager(nullptr), D(_D) {}

  // Intercepts every pass the legacy builder tries to add.
  void add(Pass *P) override {
    const void *ID = P->getPassID();
    const PassInfo *PI = PassRegistry::getPassRegistry()->getPassInfo(ID);
    // ... (the excerpt ends here in the original post)
  }
};
} // namespace

bugpoint reduces the list of passes by creating a subclass of FunctionPassManager and capturing whatever passes the legacy PassManagerBuilder adds for something like -O2. It does this via C++ inheritance, which the new PM explicitly avoids. It also goes through a global registry of passes, which the new pass manager likewise avoids using.

llvm/utils/reduce_pipeline.py instead works with the textual representation of the list of passes that we pass to the new pass manager builder. Currently it assumes that we’re working with a crash, but it could be extended to work with miscompiles.
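
To make the "textual representation" idea concrete, here is a minimal sketch of reducing a pipeline string with a pluggable predicate. It is not the algorithm reduce_pipeline.py uses; `split_top_level` and `reduce_passes` are hypothetical helpers, only top-level elements are removed, and the `still_fails` callback is where a crash check or a miscompile check (for example, comparing program output against a known-good run) would go.

```python
from typing import Callable, List


def split_top_level(pipeline: str) -> List[str]:
    """Split a pipeline string on commas that are not nested in () or <>."""
    parts, current, depth = [], [], 0
    for ch in pipeline:
        if ch in "(<":
            depth += 1
        elif ch in ")>":
            depth -= 1
        if ch == "," and depth == 0:
            parts.append("".join(current))
            current = []
        else:
            current.append(ch)
    if current:
        parts.append("".join(current))
    return parts


def reduce_passes(passes: List[str],
                  still_fails: Callable[[str], bool]) -> List[str]:
    """Greedily drop one top-level element at a time while the failure persists."""
    kept = list(passes)
    i = 0
    while i < len(kept):
        trial = kept[:i] + kept[i + 1:]
        if trial and still_fails(",".join(trial)):
            kept = trial  # element i was not needed; retry the same index
        else:
            i += 1
    return kept


if __name__ == "__main__":
    passes = split_top_level("function(instcombine,dse),cgscc(inline),globaldce")
    # Toy predicate: pretend the bug needs cgscc(inline) to reproduce.
    print(reduce_passes(passes, lambda p: "cgscc(inline)" in p))
```

Keeping the predicate separate from the reduction loop is what would let the same loop serve both crashes and miscompiles.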

I suppose we could replace the existing pass list reduction by essentially reimplementing reduce_pipeline.py within bugpoint.

This makes sense to me. From what you’ve described, it sounds pretty straightforward compared to replicating bugpoint’s reduction functionality (e.g., debug info IR reduction) in Python.