Checks for IR modification outside of pass root op?

I just spent 2 hours debugging a situation where I had a pass incorrectly nested. As an example:




(presuming input IR that was nested as builtin.module/func.fuc)

The pass was not restricted to operate on builtin.module because it is generic to top-levels and there is not a good way to do that.

The pass in question was a pretty vanilla dialect conversion based pass that does the usual function/type-conversion dance (i.e. clone the func without regions, fixup, inline regions into the new func, apply type conversion to the blocks, erase the original func). It failed both spectaularly and subtly in ways that… shall we just say that I understand how dialect conversion internals operates a lot more now, but none of that was relevant to the issue at hand.

This isn’t the first time I’ve gotten this wrong, but it was definitely the most time spent scratching my head (the specific way this was presenting sent me down a long trail of red herrings).

So that got me thinking: are there any kinds of debug checks we could do that would assert if explicitly modifying operations outside of the current “parallelism scope” (do we have a name for this)? I don’t know how to do this generally and not restrict valid functionality, but it seems like it would be possible to restrict asking the rewriter to modify operations without the right ancestor. In a call like this, is it ever valid to rewrite outside of the operation passed to applyPartialConversion (in my situation, I was inadvertently manipulating the node list of the parent):

    if (failed(applyPartialConversion(getOperation(), conversionTarget,
                                      std::move(patterns)))) {
      getOperation()->emitError() << "conversion to util failed";
      return signalPassFailure();

Or, if it is, is there a way to otherwise specify the allowable, mutable root scope?

I don’t have a more concrete suggestion but was hoping those who know the pass/conversion infra better might have an idea to make this a bit more safe. The current situation gets very ugly very fast (asan issues, dangling ops that should have been deleted triggering red herring verification issues, corrupting things so that print-ir-after-all/debug crash, etc).

I couldn’t understand from this post what issue you encountered and how it’s connected to introducing “checks for IR modification outside of pass root op” – the latter would anyway be useful if feasible.

Did you mean func.func instead of builtin.module?

What kind of pass was my-foobared-pass? Was it just a Pass when in reality your intention was to have it as a pass to run on ModuleOp? In such a case, you’d have had a cast failing if you nested it on func.func.

It is a pass that was mutating functions at the module level but which I incorrectly (in a lit test – not in the actual integration) nested at the func level. It was a dumb mistake and one that I usually would have spotted had it not manifested in a sequence of red herrings – and it took a lot of time.

my-foobared-pass was just a Pass that did an applyPartialConversion on the pass’s root operation. Its patterns included conversion patterns for FuncOp which did a type conversion on each function (effectively replacing the function at the top level). We happen to have a few passes that are not restricted to ModuleOp because their actual restriction is that they must run at the “parent of functions” level and we have a couple of top level module-op-like things (and, unlike functions, we do not have a ModuleOpInterface or equivalent to restrict things like this to).

Leaving aside why it was like this, it is very easy to include patterns in dialect conversion that can operate at a level that is not allowed: you have to know what each is doing and then target the pass accordingly. That is, of course, part of the job of putting things together – I just wish that the cost of getting it wrong (at least in debug mode) was not random memory corruption and crashes. Since there is no part of the API contract where this behavior is explicit, it is just left to documentation and memory. It can be even trickier to spot sometimes because it is legal to mutate the root op (i.e. change attributes, etc) but not to do anything that changes its parent (like cloning/replacing it/etc). I’ve run into this a number of times over the years and am just asking whether there is a way to build in better guardrails. All of the times were dumb mistakes. In each, an assertion would have been far preferable to corruption.

We thought of things to do, but it is tricky because violations are UB in general ( TSAN should flag them).
We could catch them when running in single thread mode though.

A strategy I tried to look into is for the pass manager to “detach” an operation from the parent block before running a pass on it. It would make it structurally safe (the API makes it so that we can’t easily misuse…).
But that is quite a big change from the current behavior.