[RFC] CanonicalizerPass convergence / error handling

Coming a bit late to the party, thanks for the great discussion.

Orthogonal to the canonicalizer pass or canonicalization patterns per se, I would like we focus a bit more on patterns.

I find it distressing that the main pattern driver in MLIR promotes just giving up on pattern hygiene. IMO, we shouldn’t just accept that “patterns are broken” is a fact of life and everything is best effort within k iterations.

I would welcome general cleanups that would significantly harden the APIs and make it much much harder to misuse, some examples:

  1. force all IR creation and mutation to go through some builder/rewriter (i.e. make all Operation and Block APIs private with a few friends) and have builders/rewriters properly listen to all IR mutation/creation. Patterns are then never responsible for returning failure or success and the rule simply becomes “return the listener state”. This removes a huge footgun that we currently allow: return failure when IR has changed (unknown true full implications as things kinda seem to workish … maybe) or return success when IR has not changed (infinite loops).
  2. Alternatively, we could use the mechanism in 1. as a check with clear error messages: i.e. pattern X returned failure but IR was modified.
  3. Simpler Builder/Rewriter/Listener class hierarchy, I don’t think having 5-deep class hierarchy to build stuff, with incomplete Listener capabilities amongst the various classes, pays for itself.
  4. Add a mechanism to check/enforce pattern orthogonality and compositionality within sets of patterns; e.g. given a set of patterns with the same tag, at most one can apply to a given op at any point in time (can also filter by “same benefit”). This would greatly reduce potential surprises from order of application of patterns and generally reduce the pattern interference problem. Note this would probably require reverting back the match / rewrite APIs so we can run the match ahead of time and verify the count is <= 1.

Optional stretch goal: passes that rely on repetitive pattern applications should define a notion of “energy potential” of the IR. The energy potential must decrease with each application (expensive check). Monotonicity guarantees termination and makes it easy to pinpoint offending patterns.

I suspect in the current state, with all the sharp edges we have, better documentation without making the APIs hard to misuse, will not be enough. We should still improve documentation everywhere it makes sense.

4 Likes