As efforts to remove the legacy pass manager for the optimization pipeline progress, there’s the question of what to do with the “legacy” opt pass syntax.
Historically we’ve used something like opt -instcombine to run instcombine. Currently this relies on the global registry of legacy passes. We then roughly translate this to opt -passes=instcombine.
At some point we’re going to remove legacy IR optimization pass entries from the global registry because we’ll delete the legacy pass. Without any other changes, this will mean that opt -instcombine won’t work. However, some people have expressed that for simple cases this syntax is shorter and nicer.
When I started working on LLVM, something like opt -instcombine -instcombine-foo-flag was confusing to me because it was unclear that -instcombine added a pass to the pipeline, and it’s hard at a quick glance to see which command line options are passes and which are flags. So my first instinct with the new pass manager transition was to deprecate and remove this sort of syntax in favor of always specifying opt -passes=instcombine. I haven’t really minded typing an extra -passes=. But as previously mentioned, some people have objected saying they like the legacy syntax.
There are a couple of options I can see:
1) Force people to use the -passes syntax after removing legacy passes.
2) Same as 1), but we add a -p alias to -passes so it’s fewer keystrokes (I like this option the most; opt -instcombine -gvn becomes opt -p instcombine,gvn).
3) Collect unknown flags via a cl::opt(cl::Sink) (cl::Sink appears to be unused within the llvm-project codebase, but a quick prototype suggests it works) and do the current legacy syntax → -passes translation that we already do.
   - This can sometimes lead to unexpected results with pass ordering, e.g. -instcombine -gvn won’t run instcombine and then gvn on one function before moving on to the next function; rather, it’ll run instcombine on every function, then gvn on every function. This is more noticeable with CGSCC/function pass interleaving. But this is already an issue with the existing syntax translation.
   - This can also make error reporting for misspelled flags more confusing.
4) Same as 3), but restrict the number of passes to at most one, alleviating the first concern in 3).
5) Same as 3), but restrict the passes to be of the same type (e.g. all passes must be function passes) and put them in a proper pipeline.
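To make the ordering concern and the same-type restriction concrete, here is a rough Python sketch of what a wrapper around opt could do. It is not how opt itself is implemented. The PASS_KINDS table and all three function names are hypothetical, and the textual form of the naive translation is an assumption based on the behavior described above.

```python
# Hypothetical table mapping a few pass names to the kind of IR unit they run on.
# Illustration only; opt's real pass registry is not consulted.
PASS_KINDS = {
    "instcombine": "function",
    "gvn": "function",
    "inline": "cgscc",
    "globalopt": "module",
}

def split_legacy_flags(args):
    """Separate legacy-style pass flags (e.g. -instcombine) from other arguments."""
    passes, rest = [], []
    for arg in args:
        name = arg.lstrip("-")
        if arg.startswith("-") and name in PASS_KINDS:
            passes.append(name)
        else:
            rest.append(arg)
    return passes, rest

def naive_pipeline(passes):
    """One wrapper per pass: each pass visits every unit before the next pass starts."""
    return ",".join(f"{PASS_KINDS[p]}({p})" for p in passes)

def same_kind_pipeline(passes):
    """Require a single pass kind and nest all passes under one wrapper,
    so they run back to back on each unit."""
    if not passes:
        raise ValueError("no passes given")
    kinds = {PASS_KINDS[p] for p in passes}
    if len(kinds) != 1:
        raise ValueError(f"mixed pass kinds: {sorted(kinds)}")
    return f"{kinds.pop()}({','.join(passes)})"
```

With -instcombine -gvn, naive_pipeline gives function(instcombine),function(gvn), while same_kind_pipeline gives function(instcombine,gvn). Mixing, say, -inline with -instcombine is rejected by same_kind_pipeline rather than silently reordered.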
I’d like to hear people’s thoughts on what they’d like to see.
I feel option 2 is the most friendly option. And it is indeed confusing to see opt -pass-name if you don’t know pass-name is a pass. It requires some background knowledge and loses some readability. Also, I feel like we (the llvm community) don’t pursue backward compatibility when we have a better choice.
Option 1 or 2 sound most reasonable to me. Anything else probably just adds an extra layer of logic to make it possible to do the same thing in multiple ways, at some cost of having to support multiple syntaxes.
Especially for the passes that nowadays, with the new PM, take arguments etc. They kind of need the -option 'string' syntax to allow for using ‘(’, ‘)’, ‘<’ and ‘>’ as part of the pass name. Or else we need to find some new syntax to specify those things.
Btw, I was a bit skeptical at first, but I’ve realized that the -passes= way of specifying the pipeline/passes to run is quite nice. Now I can find which passes are used by looking at a single command line option. With the old pass manager the list of passes being executed could be scattered all over the command line, and you never really knew which options referred to a pass and which to something else (and you never knew exactly in which order the passes would be executed). IMO life got much simpler with -passes.
One of the key selling points of the new PM has been the simplicity of defining pass pipelines. And that’s what the --passes=<> option manifests. It does mean typing a few more characters when compared to using e.g. -instcombine, but IMHO that’s a low price to pay for the explicitness.
Option 2 and option 1 (in that order) are my votes.
Also, I feel like I’m missing something, but are new passes able to use arguments from cl::opt now? I was under the impression this was still an issue. It’s possible I’m thinking strictly of pass plugins, however…
My thoughts, for what they are worth: option 1/2, but allow 4 or 5 as a deprecated option, generating a warning showing a “fix” with the translated options for a couple of versions. That would make it easier for users of the old syntax to migrate.
I have some test passes for my compiler that run multiple versions of llvm against roughly the same set of byte code. I ended up doing something similar to #4 in a wrapper around opt. The error message when using options like -O2 with legacy pass options was a bit misleading and almost had me mishandle the first concern listed in option 3.
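As a rough illustration of the warning-with-a-fix idea, a wrapper could print something like the following. This is only a Python sketch; the function name and the message wording are hypothetical, and it assumes the caller has already worked out which flags are passes.

```python
def deprecation_warning(legacy_flags):
    """Build a warning that shows the -passes spelling for legacy pass flags.

    `legacy_flags` is assumed to contain only pass flags such as "-instcombine";
    telling passes apart from other flags is out of scope here.
    """
    names = [flag.lstrip("-") for flag in legacy_flags]
    old = " ".join(legacy_flags)
    new = "-passes=" + ",".join(names)
    return f"warning: '{old}' is deprecated; use '{new}' instead"
```

For -instcombine -gvn this suggests -passes=instcombine,gvn, so the user can copy the replacement straight from the message.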
When I talked about arguments I was thinking about the pass parameters. With new PM some passes can take parameters. For example simplifycfg can be configured using several parameters already when creating the pass:
Another example is loop-unroll. An -O3 pipeline uses loop-unroll<O3>, while an -O2 pipeline uses loop-unroll<O2>. With the legacy syntax one would need to run “opt -loop-unroll” plus possibly some extra arguments (?) to hopefully get the full pipeline configuration of the pass when bugpointing/reducing a failure in loop-unroll. With the new PM, parameters like these that are set by the default pipeline can also be set when running a custom pipeline.
With new PM you can actually run a test using a pipeline such as
opt -passes='loop-unroll<O2>,loop-unroll<O3>'
i.e. using different settings for the same pass at two different places in the pipeline. Afaict that wasn’t possible with the legacy PM unless registering the pass with different names for the different specializations.
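Since ‘<’ and ‘>’ are special to the shell, scripts that generate such command lines need to quote the pipeline string. A minimal Python sketch, assuming a hypothetical helper opt_command and a made-up input file name test.ll:

```python
import shlex

def opt_command(passes, input_file):
    """Return a shell-safe opt command line for the given pass pipeline."""
    pipeline = ",".join(passes)
    # shlex.join quotes any argument containing shell metacharacters such as < and >.
    return shlex.join(["opt", f"-passes={pipeline}", "-S", input_file])

print(opt_command(["loop-unroll<O2>", "loop-unroll<O3>"], "test.ll"))
# prints: opt '-passes=loop-unroll<O2>,loop-unroll<O3>' -S test.ll
```

The whole -passes=... argument ends up inside single quotes, which is equivalent to the hand-written form above.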