RFC: Pass Execution Instrumentation interface

TL;DR

Not sure how I missed that, but this interface definitely lacks IRUnit.
Should be:

     bool BeforePass(PassID, IRUnit&, PassExecutionCounter);
     void AfterPass(PassID, IRUnit&, PassExecutionCounter);
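
To make the corrected signatures concrete, here is a minimal standalone sketch of how a pass manager might drive them. Everything except the two callback names is an assumption for illustration: PassID, IRUnit and PassExecutionCounter are simple stand-ins, and the rule that a false return from BeforePass skips the pass is my reading of the bool return type, not something the RFC text above spells out.

```cpp
#include <functional>
#include <string>
#include <vector>

// Hypothetical stand-ins for the types named in the signatures above.
using PassID = std::string;
using PassExecutionCounter = unsigned;
struct IRUnit { std::string Name; };

// Callbacks with the corrected shape: every hook also receives the IR unit.
struct PassInstrumentationSketch {
  std::function<bool(const PassID &, IRUnit &, PassExecutionCounter)> BeforePass;
  std::function<void(const PassID &, IRUnit &, PassExecutionCounter)> AfterPass;
};

// Toy pass-manager loop. Assumption: BeforePass returning false skips the pass.
inline std::vector<PassID> runPipeline(const std::vector<PassID> &Passes,
                                       IRUnit &IR,
                                       PassInstrumentationSketch &PI) {
  std::vector<PassID> Executed;
  PassExecutionCounter Counter = 0;
  for (const PassID &P : Passes) {
    ++Counter;
    if (PI.BeforePass && !PI.BeforePass(P, IR, Counter))
      continue; // skipped: neither the pass nor AfterPass runs
    Executed.push_back(P); // a real pass would run on IR here
    if (PI.AfterPass)
      PI.AfterPass(P, IR, Counter);
  }
  return Executed;
}
```

With the IR unit available, a callback can filter on both the pass and the unit it is about to run on, which the earlier signatures could not express.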

regards,
Fedor.

We had already talked about this, so unsurprisingly I’m generally in favor of the direction. Some comments below.

  • access through LLVM Context (allows controlling lifetime and scope
    in multi-context execution)
  • wrap it into an analysis for easier access from pass managers

Why not simply make it an analysis, and leave LLVM context out?

Because this is very pass specific, I think it would be substantially cleaner for it to be more specifically based in the pass infrastructure.

I also think that this can be more cleanly designed by focusing on the new PM. The legacy PM has reasonable solutions for these problems already, and I think the design can be made somewhat simpler if we don’t have to support both in some way.

My hope would be that there are two basic “layers” to this. Alongside a particular PassManager, we would have an analysis that instruments the running passes. This would just expose the basic API to track and control pass behavior and none of the “business logic”.

Then I would hope that the Passes library can build an instance of this analysis with callbacks (or a type parameter that gets type erased internally) which handles all the business logic.

I think this will also address the layering issues around IR units because I think that the generic code can use templates to generically lower the IR unit down to something that can be cleanly handled by the Passes library. I think it is generally fine for this layer to rapidly lose strong typing or only have limited typed facilities because this is about instrumenting things and shouldn’t be having interesting (non-debug) behavioral effects.
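
A rough sketch of that layering, under stated assumptions: the generic layer is a template that lowers any IR unit to a type-erased view, and the "business logic" arrives as callbacks that only see that view. The names ErasedIRUnit, PassInstrumentationSketch and registerBeforePass are hypothetical, and the Module/Function structs are toy stand-ins rather than the LLVM classes.

```cpp
#include <functional>
#include <string>
#include <typeinfo>
#include <utility>
#include <vector>

// Toy IR unit types; the real ones would be Module, Function, Loop, ...
struct Module { std::string Name; };
struct Function { std::string Name; };

// What the business-logic layer sees: a weakly typed view of the IR unit.
struct ErasedIRUnit {
  const void *Ptr;
  const std::type_info *Type;
  std::string Name;
};

// Generic layer: it knows nothing about what the callbacks actually do.
class PassInstrumentationSketch {
  using BeforeCB = std::function<bool(const std::string &, const ErasedIRUnit &)>;
  std::vector<BeforeCB> BeforeCallbacks;

public:
  void registerBeforePass(BeforeCB CB) { BeforeCallbacks.push_back(std::move(CB)); }

  // Templated entry point a pass manager could call for any IR unit type.
  template <typename IRUnitT>
  bool runBeforePass(const std::string &PassName, const IRUnitT &IR) const {
    ErasedIRUnit Erased{&IR, &typeid(IRUnitT), IR.Name};
    bool ShouldRun = true;
    for (const auto &CB : BeforeCallbacks)
      ShouldRun &= CB(PassName, Erased); // every callback is invoked
    return ShouldRun;
  }
};
```

The strong typing stops at runBeforePass; the callbacks only get a pointer, a type tag and a name, which should be enough for printing, counting or bisecting.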

I consider LLVM context to be a good reference point for “compilation-local singleton stuff”. My task is to provide a way to handle callbacks per-compilation-context, and preferably have a single copy of those (possibly stateful) callbacks per compilation.

In my implementation (linked at the end of RFC) I’m using PassInstrumentationImpl to have a single copy of the object. What entity should own the PassInstrumentationImpl object to make it unique per-compilation? Again, in my implementation with Analysis-managed PassInstrumentation I put Impl into PassBuilder, which registers Analyses with a reference to its Impl. However, that makes Impl unique per-Builder, which is not the same as per-compilation.

That I kind of agree with. And I do not plan to implement both at once. So in a good case we just switch to the new PM and go forward. And in a bad case of postponing the switch we can use the experience and implementation details of the new PM to solve problems with the legacy PM (but that is definitely a much lower priority for me).

Yes. PassInstrumentation seems to provide that.

As an idea I do agree with this. But practically I don’t have a clear picture of how to manage the instance(s).

regards,
Fedor.

Hi Fedor,


Conceptually I have always considered PassBuilder to be responsible only for construction of the pipeline.
Say, in our downstream usage we apply PassBuilder to construct a pipeline and get rid of it before even starting
the pipeline run. It appears to be a valid use as of right now.

If we enhance PassBuilder with bookkeeping capabilities then we will introduce a new requirement
for PassBuilder to stay alive till the end of compilation.
I’m not saying that it is a problem, it just breaks my view of the current design.

Also, keep in mind that technically you can create a pipeline without PassBuilder at all.

On the other hand, using PassBuilder to own InstrumentationImpl makes the implementation rather simple since it can “seed” all the analyses
with the instance that it owns.
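
As a toy illustration of that “seeding” (not the code from my patch; PassBuilderSketch, AnalysisManagerSketch and the note() helper are made-up names), the builder owns the single Impl and hands a plain pointer to every analysis manager it registers with:

```cpp
#include <memory>
#include <string>
#include <vector>

// Hypothetical shared state: one copy of the (possibly stateful) callbacks.
struct InstrumentationImpl {
  std::vector<std::string> Log;
};

// Analysis result handed out by each manager; it only refers to Impl.
struct PassInstrumentation {
  InstrumentationImpl *Impl;
  void note(const std::string &Event) { Impl->Log.push_back(Event); }
};

// Toy analysis manager holding its own registered PassInstrumentation.
struct AnalysisManagerSketch {
  PassInstrumentation PI{nullptr};
};

// The builder owns Impl and seeds every manager with a pointer to it.
class PassBuilderSketch {
  std::unique_ptr<InstrumentationImpl> Impl =
      std::make_unique<InstrumentationImpl>();

public:
  void registerAnalyses(AnalysisManagerSketch &AM) { AM.PI.Impl = Impl.get(); }
  const InstrumentationImpl &impl() const { return *Impl; }
};
```

All managers end up sharing one Impl, but the pointers dangle as soon as the builder is destroyed, which is exactly the "must stay alive till the end of compilation" requirement described above.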

AnalysisManager owning the InstrumentationImpl instance seems conceptually cleaner to me, however to make this analysis unique
we need a way to make a single analysis manager responsible for that and to teach it how to feed other analyses (without transferring
the ownership). And that requires nontrivial implementation effort which I can’t estimate right now.

Is it reasonable to enforce a requirement that ModulePassManager and ModuleAnalysisManager are always created?
Then we can put all this bookkeeping activity into ModuleAnalysisManager.

I’m kinda torn on this…

regards,
Fedor.

Hi Fedor,

Thanks for replying to my questions about porting opt-bisect to the new PM.

This thread looks like a great improvement on what we currently have.
Though we are also trying to make opt-bisect more granular.

In particular, we think it would be helpful if we could have opt-bisect work on a per-optimization level rather than per-pass level.
I believe this will be a more invasive change and we would like to do this as a follow-up to this thread.

How difficult do you think it would be to support this use-case with your design?

Thank you!
Zhizhou

Care to expand a bit on what you mean by per-optimization level? Preferably with a use case.

To me opt-bisect is a low-level developer tool and it doesn’t cope well with a crude user-level hammer like the optimization level.

F.

I think that “level” was referring to what level of granularity opt-bisect should control; it wasn’t meant to be read as “optimization level”. I think Zhizhou was saying that it should be able to disable individual optimization steps within a pass. For example, if a particular run of InstCombine made 20 changes, opt-bisect should be able to skip each of those changes. I think this is the idea of combining opt-bisect with debug counters that has been mentioned previously.

Thanks Craig, that’s exactly what I mean, stopping at particular changes inside a pass.

Could you please point me to the discussion about combining opt-bisect with debug counters? Is it already being implemented?


PassInstrumentation in its current base design only instruments Pass execution from the caller (PM) side.
Stopping at a particular change inside a pass definitely requires cooperation from the pass and is out of scope
for this RFC.

Yet it appears to be rather orthogonal, and I don’t see anything that precludes enhancing PassInstrumentationAnalysis
to also cover points other than those in the PMs. For sure, PassInstrumentationAnalysis should readily be available
inside the pass through the AnalysisManager.

regards,
Fedor.

I was going to write something up about fine-grained opt-bisect but
didn't get to it last week.

We've had a -pass-max option here for some time and have hand-added
instrumentation to various passes to honor it. It's saved us man-years
of debug time. I was planning on sending it upstream but saw this
effort with pass execution instrumentation and thought it might fit
there.

Initially I think some very simple APIs in PassInstrumentationAnalysis
would be fine, something like:

// PIA - PassInstrumentationAnalysis
if (PIA->skipTransformation()) {
  return;
}
// Do it.
PIA->didTransformation();

This kind of interface also encourages good pass design like doing all
the analysis for a transformation before actually doing the
transformation. Some passes mix analysis with transformation and those
are much harder to instrument to support -pass-max operation.

In our implementation we can set a -pass-max per pass, like
-pass-max=instcombine=524. A global index might be even more useful.
If it interacted with opt-bisect, even better. It seems like APIs that
cover both the opt-bisect pass-level operation and the finer-grained
operation could be quite powerful. As passes opt in to the
finer-grained control, the opt-bisect limit would become more powerful
automatically.
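
A minimal sketch of what could sit behind those two calls, assuming a per-pass budget in the spirit of -pass-max=instcombine=524. TransformLimiter, setMax and the pass-name keys are hypothetical; only skipTransformation and didTransformation come from the snippet above, and here they take the pass name explicitly to keep the example standalone.

```cpp
#include <map>
#include <string>

// Hypothetical limiter behind skipTransformation()/didTransformation().
class TransformLimiter {
  std::map<std::string, unsigned> Max;  // e.g. {"instcombine", 524}
  std::map<std::string, unsigned> Done; // transformations applied so far

  unsigned done(const std::string &Pass) const {
    auto D = Done.find(Pass);
    return D == Done.end() ? 0u : D->second;
  }

public:
  void setMax(const std::string &Pass, unsigned N) { Max[Pass] = N; }

  // True once the pass has used up its budget; unlimited passes never skip.
  bool skipTransformation(const std::string &Pass) const {
    auto M = Max.find(Pass);
    return M != Max.end() && done(Pass) >= M->second;
  }

  void didTransformation(const std::string &Pass) { ++Done[Pass]; }
};

// A toy pass that would like to apply `Candidates` rewrites.
inline unsigned runToyPass(TransformLimiter &L, const std::string &Pass,
                           unsigned Candidates) {
  unsigned Applied = 0;
  for (unsigned I = 0; I < Candidates; ++I) {
    if (L.skipTransformation(Pass))
      break;
    ++Applied; // "Do it."
    L.didTransformation(Pass);
  }
  return Applied;
}
```

The count persists across runs of the same pass, so the limit indexes transformations over the whole compilation for that pass rather than per invocation.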

I've always wanted a bugpoint that could point not just to a pass but to
a specific transformation within a pass.

                          -David

Fedor Sergeev via llvm-dev <llvm-dev@lists.llvm.org> writes:


Initially I think some very simple APIs in PassInstrumentationAnalysis
would be fine, something like:

// PIA - PassInstrumentationAnalysis
if (PIA->skipTransformation()) {
   return;
}
// Do it.
PIA->didTransformation();

That should be easily doable (though the interface would be part of PassInstrumentation
rather than PassInstrumentationAnalysis).

This kind of interface also encourages good pass design like doing all
the analysis for a transformation before actually doing the
transformation. Some passes mix analysis with transformation and those
are much harder to instrument to support -pass-max operation.

I'm not sure everybody would agree on this definition of a good pass design :)
Ability to mix analysis with transformation might appear to be rather useful
when heavy analysis is only needed in a very special corner case of an overall
transformation.

regards,
Fedor.

Fedor Sergeev <fedor.sergeev@azul.com> writes:

// PIA - PassInstrumentationAnalysis
if (PIA->skipTransformation()) {
   return;
}
// Do it.
PIA->didTransformation();

That should be easily doable (though the interface would be part of
PassInstrumentation rather than PassInstrumentationAnalysis).

Ok. The way I envision this working from a user standpoint is
-opt-bisect-limit <n> would mean "n applications of code
transformation." where "code transformation" could mean an entire pass
run or individual transforms within a pass. Each pass would decide what
it supports.

This kind of interface also encourages good pass design like doing all
the analysis for a transformation before actually doing the
transformation. Some passes mix analysis with transformation and those
are much harder to instrument to support -pass-max operation.

I'm not sure everybody would agree on this definition of a good pass design :)
Ability to mix analysis with transformation might appear to be rather useful
when heavy analysis is only needed in a very special corner case of an overall
transformation.

Yes, I'm sure there are exceptions. I'm not referring to things like
instcombine that have individual rules that guard transformations and
the pass iterates applying transformations when rules are matched.
That's straightforward to instrument.

The harder cases are where the analysis phase itself does some
transformation (possibly to facilitate analysis) and then decides the
larger-goal transformation is not viable. If the pass then tries to
undo the first transformation, it's possible that -pass-max will result
in code that never would have been generated, because it could do the
first transformation but then not undo it because it hit the max number
of transforms. Sometimes it's difficult to find where things are undone
and update the transformation index (basically allow the undo and
decrement the index to reflect the undo).

In code:

if (not hit max)
  do analysis transform
  ++index

return

<some other function>

if (transform legal)
  if (not hit max)
    do big transform
    ++index

return

<some third function>
if (need to undo analysis transform)
  if (not hit max)
    undo it
    ++index

Sometimes it is not obvious that these three places are logically
connected. Ideally we wouldn't increment the index for the analysis
transform or we would allow the undo and decrement the index, but it's
not always clear from the code that that is what should happen.
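
Here is a small standalone model of that hazard, written as a sketch rather than code from any real pass. Budget, ToyIR and runToyPass are invented names; the three guarded steps correspond to the three places in the pseudocode above.

```cpp
// Shared transformation budget: each guarded change consumes one unit.
struct Budget {
  unsigned Max;
  unsigned Index = 0;
  explicit Budget(unsigned M) : Max(M) {}
  bool tryConsume() {
    if (Index >= Max)
      return false; // hit max: the guarded change is not performed
    ++Index;
    return true;
  }
};

struct ToyIR {
  bool PrepApplied = false; // the analysis-time helper transform
  bool BigApplied = false;  // the transform the pass actually wants
};

inline ToyIR runToyPass(Budget &B, bool BigTransformLegal) {
  ToyIR IR;
  // First function: helper transform done to facilitate analysis.
  if (B.tryConsume())
    IR.PrepApplied = true;
  // Second function: the larger-goal transform, if legal.
  if (BigTransformLegal && B.tryConsume())
    IR.BigApplied = true;
  // Third function: undo the helper transform when the big one did not happen.
  if (IR.PrepApplied && !IR.BigApplied && B.tryConsume())
    IR.PrepApplied = false;
  return IR;
}
```

With a budget of 1 and an illegal big transform, the helper change is applied but its undo is blocked, leaving IR that an unlimited run never produces. Treating the undo as exempt from the limit (and decrementing the index) would avoid that, but only if someone notices the three sites belong together.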

                            -David

[…]

The harder cases are where the analysis phase itself does some
transformation (possibly to facilitate analysis) and then decides the
larger-goal transformation is not viable. If the pass then tries to
undo the first transformation, it’s possible that -pass-max will result
in code that never would have been generated, because it could do the
first transformation but then not undo it because it hit the max number
of transforms. Sometimes it’s difficult to find where things are undone
and update the transformation index (basically allow the undo and
decrement the index to reflect the undo).

It should be pointed out that analyses don’t transform the IR. At least not in the new PassManager, which I think we should focus on in this proposal.

Cheers,
Philip

Fedor Sergeev <fedor.sergeev@azul.com> writes:

// PIA - PassInstrumentationAnalysis
if (PIA->skipTransformation()) {
    return;
}
// Do it.
PIA->didTransformation();

That should be easily doable (though the interface would be part of
PassInstrumentation rather than PassInstrumentationAnalysis).

Ok. The way I envision this working from a user standpoint is
-opt-bisect-limit <n> would mean "n applications of code
transformation." where "code transformation" could mean an entire pass
run or individual transforms within a pass. Each pass would decide what
it supports.

I would rather not merge pass-execution and in-pass-transformation numbers into a single number.
It will only confuse users on what is being controlled.
Especially as in-pass control is going to be opt-in only.

This kind of interface also encourages good pass design like doing all
the analysis for a transformation before actually doing the
transformation. Some passes mix analysis with transformation and those
are much harder to instrument to support -pass-max operation.

I'm not sure everybody would agree on this definition of a good pass design :)
Ability to mix analysis with transformation might appear to be rather useful
when heavy analysis is only needed in a very special corner case of an overall
transformation.

Yes, I'm sure there are exceptions. I'm not referring to things like
instcombine that have individual rules that guard transformations and
the pass iterates applying transformations when rules are matched.
That's straightforward to instrument.

The harder cases are where the analysis phase itself does some
transformation (possibly to facilitate analysis) and then decides the […]

As Philip has already pointed out, analyses by design are expected to be non-mutating.

regards,
Fedor.

Philip Pfaffe <philip.pfaffe@gmail.com> writes:

    The harder cases are where the analysis phase itself does some
    transformation (possibly to facilitate analysis) and then decides
    the larger-goal transformation is not viable. If the pass then tries
    to undo the first transformation, it's possible that -pass-max will
    result in code that never would have been generated, because it
    could do the first transformation but then not undo it because it
    hit the max number of transforms. Sometimes it's difficult to find
    where things are undone and update the transformation index
    (basically allow the undo and decrement the index to reflect the
    undo).

It should be pointed out that analyses don't transform the IR. At
least not in the new PassManager, which I think we should focus on in
this proposal.

I'm not talking about analysis passes as such. I'm talking about
transformation passes that check various conditions before doing
transformations. They have to check legality, profitability, etc. Most
of the time this is well-separated but sometimes things can get pretty
convoluted and it's not always clear where the "logical changes" are, as
opposed to component changes that make up a single logical change.

                            -David

Fedor Sergeev <fedor.sergeev@azul.com> writes:

Ok. The way I envision this working from a user standpoint is
-opt-bisect-limit <n> would mean "n applications of code
transformation." where "code transformation" could mean an entire pass
run or individual transforms within a pass. Each pass would decide what
it supports.

I would rather not merge pass-execution and in-pass-transformation
numbers into a single number.
It will only confuse users on what is being controlled.
Especially as in-pass control is going to be opt-in only.

Oh, ok. I'm fine with that too. Do we want this finer-grained control
on a global basis, or a per-pass basis? For example, should something
like -transform-max=<n> apply over the whole compilation run, so that
every pass checks the limit, or should it work like
-transform-max=<pass>=<n>, where only pass <pass> checks the limit? If
the latter, then -opt-bisect-limit (or bugpoint) can identify the pass
and another run with -transform-max can identify the specific transform
within the pass.

The latter is how we have things set up here and it seems to work well,
but I can also see utility in a global limit because then you don't need
two separate runs to isolate the problem.
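
To show that both forms could share one option, here is a sketch of parsing a -transform-max value as either "<n>" (global) or "<pass>=<n>" (per-pass). The option itself is only a suggestion in this thread, and TransformMax, parseTransformMax and appliesTo are hypothetical names. The sketch needs C++17 for std::optional.

```cpp
#include <optional>
#include <string>

// Hypothetical parsed form of a -transform-max value.
struct TransformMax {
  std::optional<std::string> Pass; // empty means the limit is global
  unsigned Limit = 0;
};

// Accepts "<n>" (global) or "<pass>=<n>" (per-pass); nullopt on bad input.
inline std::optional<TransformMax> parseTransformMax(const std::string &Value) {
  TransformMax Result;
  std::string Number = Value;
  auto Eq = Value.find('=');
  if (Eq != std::string::npos) {
    if (Eq == 0)
      return std::nullopt; // "=5": missing pass name
    Result.Pass = Value.substr(0, Eq);
    Number = Value.substr(Eq + 1);
  }
  if (Number.empty() ||
      Number.find_first_not_of("0123456789") != std::string::npos)
    return std::nullopt;
  Result.Limit = static_cast<unsigned>(std::stoul(Number));
  return Result;
}

// Does the limit apply to the given pass?
inline bool appliesTo(const TransformMax &TM, const std::string &Pass) {
  return !TM.Pass || *TM.Pass == Pass;
}
```

A global limit makes every pass consult the same counter, while the per-pass form leaves all other passes unaffected, which matches the two-run workflow described above.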

I'd like to start building this off the pass instrumentation stuff as
soon as it gets integrated. Could you copy me on Phabricator when they
land there? Thanks!

The harder cases are where the analysis phase itself does some
transformation (possibly to facilitate analysis) and then decides the […]

As Philip has already pointed out, analyses by design are expected to
be non-mutating.

See my reply to Philip. I'm talking about various analyses that happen
within transformation passes.

                               -David

"David A. Greene via llvm-dev" <llvm-dev@lists.llvm.org> writes:

I'd like to start building this off the pass instrumentation stuff as
soon as it gets integrated. Could you copy me on Phabricator when they
land there? Thanks!

BTW, I am "greened" on Phabricator.

                             -David

Fedor Sergeev <fedor.sergeev@azul.com> writes:

Ok. The way I envision this working from a user standpoint is
-opt-bisect-limit <n> would mean “n applications of code
transformation.” where “code transformation” could mean an entire pass
run or individual transforms within a pass. Each pass would decide what
it supports.

I would rather not merge pass-execution and in-pass-transformation
numbers into a single number.
It will only confuse users on what is being controlled.
Especially as in-pass control is going to be opt-in only.

Oh, ok. I’m fine with that too. Do we want this finer-grained control
on a global basis, or a per-pass basis? For example, should something
like -transform-max=<n> apply over the whole compilation run, so that
every pass checks the limit, or should it work like
-transform-max=<pass>=<n>, where only pass <pass> checks the limit? If
the latter, then -opt-bisect-limit (or bugpoint) can identify the pass
and another run with -transform-max can identify the specific transform
within the pass.

This seems to be pretty much orthogonal to the pass manager instrumentation. In fact, there is nothing keeping you from implementing this for your pass right now using debug counters. That’s mostly their job, and they are independent of the pass manager implementation.

The latter is how we have things set up here and it seems to work well,
but I can also see utility in a global limit because then you don’t need
two separate runs to isolate the problem.

I’d like to start building this off the pass instrumentation stuff as
soon as it gets integrated. Could you copy me on Phabricator when they
land there? Thanks!

The harder cases are where the analysis phase itself does some
transformation (possibly to facilitate analysis) and then decides the […]

As Philip has already pointed out, analyses by design are expected to
be non-mutating.

See my reply to Philip. I’m talking about various analyses that happen
within transformation passes.

I see, then I just misunderstood what you meant by analysis. I believe what you were going for here can just as well be implemented on top of debug counters.
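
For readers who have not used debug counters: the real facility lives in llvm/Support/DebugCounter.h, and a pass guards each transformation with a shouldExecute check. The following is a simplified standalone stand-in, not that API; CounterSketch and its skip/count semantics are assumptions chosen to show the idea of skipping the first few hits and then allowing a fixed number.

```cpp
#include <cstdint>

// Simplified stand-in for a debug counter: ignore the first `Skip` hits,
// then allow `Count` hits (a negative Count means no upper bound).
class CounterSketch {
  std::int64_t Skip;
  std::int64_t Count;
  std::int64_t Seen = 0;

public:
  CounterSketch(std::int64_t Skip, std::int64_t Count)
      : Skip(Skip), Count(Count) {}

  bool shouldExecute() {
    std::int64_t Hit = Seen++; // zero-based index of this query
    if (Hit < Skip)
      return false;
    if (Count >= 0 && Hit >= Skip + Count)
      return false;
    return true;
  }
};

// A toy combiner that guards each candidate rewrite with the counter.
inline unsigned runToyCombiner(CounterSketch &C, unsigned Candidates) {
  unsigned Applied = 0;
  for (unsigned I = 0; I < Candidates; ++I)
    if (C.shouldExecute())
      ++Applied; // the rewrite would happen here
  return Applied;
}
```

Because the counter is owned by the pass and consulted at each candidate change, it works the same under either pass manager, which is why it is orthogonal to the instrumentation proposed here.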

Cheers,
Philip