Multiple verifier runs before (or in the beginning of) opt pipeline

When running opt the module verifier is running up to three times before any other pass, when just doing opt foo.bc. And for opt -passes=verify foo.bc it could end up running the verifier four times.

Not sure if this is a big problem. But IMO we seem to waste some cycles doing redundant verifications.

Here are the more implicit runs of the verifier (partially controlled by -disable-verify):

  1. When parsing the ll/bc file and it contains debug info the call to llvm::UpgradeDebugInfo (for AutoUpdate) may result in a call to verifyModule.
    This is only done if there is debug info with correct version. But it is done even if using -disable-verify.
    If verification fails here, then we get LLVM ERROR and a stack trace.

  2. Unless -disable-verify is used an explicit verifyModule call is made before starting any pass manager pipeline. According to code comments this was done for legacy PM to find failures before doInitialization().
    This invocation is currently done for new PM as well as legacy PM.
    If the verification fails here, we get a nice error message about broken input module.

  3. For new PM, when not using -disable-verify, the verify pass is injected at the beginning of the pipeline.
    If the verification fails during pipeline execution we get an LLVM ERROR and a stack trace.

Afaict the verification in new PM, from (3) above, is totally redundant. If the input module is broken we would end with a verification error already from (2) or (1).

If we want to skip (2) for new PM, then I think we would need to update some lit tests. Since I think “RUN: not opt -verify” kind of looks for exit status 1 and does not expect the exit status indicating a crash.

We can probably remove (3) as it is redundant. A difference would be that we do not run the verifier as a first pass. So for example “opt -print-after-all -O2 foo.ll” would no longer dump the IR after that initial verify pass. I.e. the log would not include the input IR.

Moreover, a test such as test/Verifier/opaque-ptr.ll is having this RUN line:

; RUN: opt -passes=verify -opaque-pointers -S < %s | FileCheck %s

Since not using -disable-verify, we first get the two automatically added verifier runs from (1) and (2) above. And then we explicitly ask for a third run of the verifier. So we end up running the verifier three times, and that is all we do in the test.
I’m not sure if we should assume that verifier is run implicitly and remove the explicit -passes=verify, or if we should add -disable-verify in such tests?
One idea is to add -verify as a opt option, and when that is used no pipeline is allow. But we should only run the special verifyModule from (2) above.

One set of ideas on how to improve things, based on keeping (2):

  • Stop adding the redundant verifier runs in the beginning of new PM pipeline.
  • Add a -verify option to opt. To be able to explicitly request just running the verifier without running any pipeline.
  • Convert tests using opt -passes=verify to use that new -verify option.

PS. Not sure if we need to deal with -verify clashing with the legacy PM syntax of running the verifier pass as part of the pipeline. Maybe we could name the new option -only-verify instead.

PS2. I haven’t thought about (1) from above or if we want to do something about that one. But for the setup we have downstream when we run clang+opt+llc as separate stepa in the compiler driver we get that extra verifier run (even with opt -disable-verify) since the frontend emits debuginfo metadata.

I’ve sent out ⚙ D139899 [opt] Do not add verify pass at beginning of pipeline to remove (3).

IMO the redundant verifier runs for opt -passes=verify not ideal but doesn’t really matter much (unless you can show that doing this would reduce test times noticeably). If we’re at the point where we really care about reducing the number of redundant verifier runs, we should create a new tool that just runs the verifier and doesn’t do anything else. opt is for running passes, but llvm/test/Verifier tests don’t care about running passes, just that the IR passes the verifier (which happens to be accessible via a pass).

As for UpgradeDebugInfo, I’m not sure whether or not checking if the debug info is valid could crash if the module doesn’t pass the verifier. If not, then we could separate out the debug info validation from the rest of the verifier and only run that part when parsing bc/ll.