[RFC][NewPM] Migrating tests from `opt -foo` to `opt -passes=foo`

Now that the new PM has been the default for the optimization pipeline for a while, I’d like to start thinking about next steps. Not quite deprecating the legacy PM for the optimization pipeline yet, since we should wait for the next LLVM release. But one thing we can start doing is cleaning up lit tests that use opt.

There’s been confusion over exactly how the -passes= parsing works, so I’ve updated https://llvm.org/docs/NewPassManager.html (or rather just the sources, seems like the webpage hasn’t updated yet) with some details.

For background, opt -foo is currently already translated to opt -passes=foo when the new PM is on (which is true by default).

I imagine this would be a short script that has a list of passes from PassRegistry.def and the IR unit they operate on, and looks through RUN lines with opt, deleting existing pass arguments and replacing them with a -passes= argument. If we have more than one pass, each pass would be wrapped in the proper adaptor. For example, opt -instcombine -globaldce becomes opt -passes='function(instcombine),globaldce'.

We need to make sure that we don’t end up causing too many duplicate RUN lines if we have tests that specify both opt -foo and opt -passes=foo.

We should wait until the next LLVM release before changing all opt tests since we do want any bots using the legacy PM to still run opt tests against the legacy PM. But until then we can pick a couple lesser-used passes/test directories to touch up.

Note that this does not include any opt tests that test passes on available under the legacy PM, which would be IR passes in the codegen pipeline.

Can I ask a basic question? Given we’ve separated the interface from the choice of pass manager, why are we removing the “-pass” interface at all? Is there some non-trivial amount of maintenance needed to keep that working?

Personally, I find the -pass-name interface much easier to work with for simple adhoc testing than the -passes version.

I can definitely see the case for dropping testing of old-pm + pass combinations, but at this point, that’s less about the command line and more about removing -enable-new-pm=0 tests. There’s also maybe an argument that we should spell the command line only one way in testing, but that’s different than removing the command line parsing.

Philip

From: llvm-dev <llvm-dev-bounces@lists.llvm.org> On Behalf Of Arthur
Eubanks via llvm-dev
Sent: den 29 juni 2021 03:34
To: llvm-dev <llvm-dev@lists.llvm.org>
Subject: [llvm-dev] [RFC][NewPM] Migrating tests from `opt -foo` to `opt -
passes=foo`

Now that the new PM has been the default for the optimization pipeline for
a while, I'd like to start thinking about next steps. Not quite deprecating
the legacy PM for the optimization pipeline yet, since we should wait for
the next LLVM release. But one thing we can start doing is cleaning up lit
tests that use opt.

There's been confusion over exactly how the -passes= parsing works, so I've
updated Using the New Pass Manager — LLVM 18.0.0git documentation (or rather just the
sources, seems like the webpage hasn't updated yet) with some details.

For background, `opt -foo` is currently already translated to `opt -
passes=foo` when the new PM is on (which is true by default).

I imagine this would be a short script that has a list of passes from
PassRegistry.def and the IR unit they operate on, and looks through RUN
lines with opt, deleting existing pass arguments and replacing them with a
-passes= argument. If we have more than one pass, each pass would be
wrapped in the proper adaptor. For example, `opt -instcombine -globaldce`
becomes `opt -passes='function(instcombine),globaldce'`.

For single pass tests I guess this will be quite simple. But isn't it a
bit more complicated for tests running multiple passes?

If I understand correctly these are kind of equivalent to each other:

   opt -enable-new-pm=0 -function-pass-1 -function-pass-2
   opt -passes='function-pass-1,function-pass-2'
   opt -passes='function(function-pass-1,function-pass-2)'

and these are also equivalent to each other

   opt -enable-new-pm=1 -function-pass-1 -function-pass-2
   opt -passes='function(function-pass-1),function(function-pass-2)'

However, there are at least some subtle differences between the two
groups. Although I'm not sure exactly if that is a big problem. I have
some vague memories of some test case where it made a difference, but
I don't remember exactly what test it was.

Perhaps it might impact preservation/caching of analyses passes if going
in/out of pass managers or changing the order in which passes are applied
like that ("pass-1 on all functions and then pass-2 on all functions"
vs "all passes on function-1 and then all passes on function-2")?

I got a feeling that most tests involving more than one pass, and that
are regression testing some bugfix, are using multiple passes for a reason.
And then it might be a bit more difficult than using a script to rewrite
RUN lines if one wants to make sure that the test case still is valid
in the sense of reproducing/validating the original problem.

With that said. It might be that the current flip of making -enable-new-pm=1
the default messed up some tests (given that the examples involving
-enable-new-pm=0 and -enable-new-pm=1 above are in different groups). In
some sense we might have been "saved" by still having bots running with
the legacy PM. I do think it often would be more correct to replace
  opt -function-pass-1 -function-pass-2
with
  opt -passes='function-pass-1,function-pass-2'
rather than
  opt -passes='function(function-pass-1),function(function-pass-2)'

Yes you’re correct, -passes=func1,func2 is more correct than -passes=function(func1),function(func2) if we want to preserve the exact legacy PM interleaving order.

For just function passes it shouldn’t matter too much. For interleaved cgscc and function passes it makes more of a difference.

Perhaps we manually update RUN lines with multiple passes that don’t run on the same IR unit as long as there aren’t too many. And for the rest automatically use -passes=pass1,pass2.

When I was making all the tests work with the new PM, there were a couple cases where I had to add -enable-new-pm=0 to the existing RUN line and add a new RUN line with the proper nesting.

opt -pass relies on the global registry of legacy passes, which we should really get rid of at some point. It might be possible to do something similar with a cl::opt that uses the new PM PassBuilder parsing, but that would end up being super hacky. opt -pass doesn’t really fit the design of the new PM.

IMO opt -passes=foo is much clearer than opt -foo and is not much longer to type (it’s what I use now, even for simple things). I remember when I started working on LLVM, the opt -foo syntax was confusing, especially when there were multiple passes and they were interleaved on the command line with other flags.

I’ve created a patch to show-case the intended migration suggested by this RFC: https://reviews.llvm.org/D108472

Although, I think the timeline that has been discussed was that we should send out some kind of deprecation announcement before doing this. And maybe also remove the ENABLE_EXPERIMENTAL_NEW_PASS_MANAGER cmake option first.

/Björn