The round table follows the recent topics:
And is related to this lightning talk at EuroLLVM:
We began by making a widely accepted statement that “canonicalization is not required for correctness” of either passes of lowering, however, we also made the strong point that it is only useful if it simplifies matching against commonly expected IR forms for optimization transformations.
So, while a non-canonical form should always be lowered to base dialects (like llvm) in the end, they won’t always be recognized and optimized before that. Since we’re building optimizing compilers, the whole point of such process is to ease matching and increase optimization. Since we’re building multiple upstream and downstream compilers, an upstream canonical form needs to cater to all of them in the best possible way and not be “best effort”.
We started with the discussion about @ftynse example of the canonicalize pass increasing compile time from minutes to hours. We couldn’t come up with a reason that’d have such an impact in compile time, since the matchers fail early. Since Alex wasn’t at EuroLLVM, we decided to skip this discussion and investigate later.
We speculated there are certain adversarial patterns where two or more rewrites keep adding work items to the queue in a multi-fixed-point scenario, but more data is needed to know for sure. But the key aspect of not wanting the canonicalization to run everything every time is still there.
The reasons I collected were:
- Not wanting to run irrelevant passes at every level when a subset would work equally, and have less chance of the scenario above. Less scope, less interference.
- There is no guarantee the canonicalization will finish, let alone finish at a particular point, let alone at a canonical representation. In a nutshell, the
canonicalizepass does not canonicalize the IR. - Since the rewrites are defined at an operation level, there’s no expectation that they’ll consider (or know about) other operations’ own patterns. Unintentional interference is likely and unpredictable.
- Since operation rewrites are allowed to create any arbitrary graph, there’s no control on the final state of the IR, and the generated new operations cannot control in which state they were created by other rewrites.
- There were other variations on those themes…
There was a consensus that canonicalization should have a strong mandate to not only terminate, but also to generate an expected, canonical form. While generally accepted that canonical forms depend on the use (transforms / destinations), there was agreement that such a canonical form upstream would provide value to users, even if other normal forms were also accepted / allowed.
We also discussed “canonicalization by construction”, where transforms should (best effort) generate and/or maintain canonical (or even normal) forms as much as possible.
A discussion in linalg was used as a proxy:
- Tiling a named op (
linalg.add) should produce a loop of named ops ({ scf.for { linalg.add } }). Currently, most (all?) tiling produceslinalg.generic. - Named linalg ops (one of its normal forms) are easier for loop fusion, while generic ops (another normal form) are easier for linalg fusion. We can, and want to, have both at different points.
- Canonical operation syntax is not the same as canonical dialect usage:
- A linalg with affine maps inside the op or outside the op are the same op represented differently. We should allow only one of them as the canonical form.
- A sequence of linalg ops (e.g.
transpose+matmul) is the same as agenericwith transpose affine map and both forms should be allowed.
- An operation that has more than one normal form does not have a canonical form, and perhaps should not have a canonicalization pattern (but perhaps multiple normalization patterns).
A proposal to describe canonical forms as requirements was followed from @ftynse’s PR. His work allows the transform dialect to encode verifiers for a canonical form on the matcher (returns null if not canonical). A question was raised: can we do that in a more generic way?
This is, in spirit, similar to my own RFC above, but it goes further. Can we describe what I want from previous passes? For example, which forms need to exists, which dialects must not exist? One concrete example was: “Give me linalg in category form where the scf loops are parallel and not forall”.
Of course, a combinatorial explosion of rewrite patterns that convert forms would do the trick, with each pass/rewrite that needs a particular form mandating those specific A-to-B patterns to run before, but that doesn’t scale. However, it was accepted that the idea is good, but implementation is invasive and fuzzy.
Action items:
- Understand the high cost of the current canonicalize pass
- Continue working on transforms and canonical matchers
- Inspect interference between canonicalization rewrites and propose solutions
- Work towards a model where canonicalization not only terminates but also in a guaranteed canonical form
Further discussions:
- How to split canonicalization patterns to make it easier to compose
- Create a prototype where such composition is beneficial (linalg based would be easier)
- How to enquire those rewrites for forms, guarantees and invariants and reach them from outside an “all-or-nothing” greedy canonicalize pass.
- Reduce the number of opinionated best effort canonical rewrites and re-brand them as normal form conversions (like we have in linalg), at least until we can agree on forms that are canonical