If TOSA lowers to a sequence of generics, then our matcher can help the first stage of detecting which kind of generics they are. You can either convert them to named ops and then match the def-use chain or do that directly in the compiler.
I’d also like to extract some higher level matcher that can see across multiple ops and match a graph with a simple declarative form. The way, we could bundle the underlying ops into GeLU, etc. Again, this can be either as a grouped op or solely inside the compiler.
The complexity comes when we start getting casts. Broadcast change the shape of the inputs, and need generic descriptions of “constants” (for ex. max(x, zero)
or splat bias).
It’ll also need to understand multi-dimensional arguments. A 2D matmul can be tiled into an ND block-matmul with fused element-wise, forming patterns that you can potentially only uncover after tiling and fusing. Or type packing
(VNNI, BFMMLA) where some arguments have an extra dimension created by a type transpose.
Generic descriptions for all these shapes isn’t easy, and creating a long list of match-patterns (like we did) does not scale. We need something that composes well (can reuse patterns) but also something that does not have square or cubic complexity.
We discussed a tree-matching system where the descriptive language builds a map of all matchers, which gets flattened into a decision tree (possibly a dag), and this tree is the pattern-matcher. The initial stage is finding the right “entry point”, and from there, re-use matchers until it either finds a pattern or reaches the end of the branch, signaling failure.
But we never got to implement this internally, as such complexity is better designed and implemented upstream.