[RFC] Op explosion in Linalg

Source of truth for Linalg

We should aim for compatibility across the frameworks we plan to support. Are TOSA, PyTorch (torch-mlir), StableHLO and ONNX sufficient?

Here are relevant docs and ops I’ve found so far:

This list is meant to be representative but non-exhaustive (e.g., 3D pooling in PyTorch is omitted).

  • Please let me know if I missed an important group of Ops or a framework we should examine.
  • Should we try to scope this effort and limit it to the frameworks listed above? This would help avoid scope creep now and in the future.

1 Op vs N Ops

To build on my earlier argument in favor of multiple Ops (rather than a single convolution Op): this approach would also align with TOSA and PyTorch.

Having said that, it seems like an implementation detail - I would just prioritize whatever allows quicker progress.
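
For illustration, here are two of the named 2D convolution ops that Linalg already has today; the tensor layout is baked into the op name, while strides and dilations are attributes (the shapes below are just an example):

// Same computation, two named ops - the layout lives in the op name.
%0 = linalg.conv_2d_nhwc_hwcf
    {dilations = dense<1> : tensor<2xi64>, strides = dense<1> : tensor<2xi64>}
    ins(%input, %filter : tensor<1x32x32x8xf32>, tensor<3x3x8x16xf32>)
    outs(%init : tensor<1x30x30x16xf32>) -> tensor<1x30x30x16xf32>

%1 = linalg.conv_2d_nchw_fchw
    {dilations = dense<1> : tensor<2xi64>, strides = dense<1> : tensor<2xi64>}
    ins(%input_nchw, %filter_fchw : tensor<1x8x32x32xf32>, tensor<16x8x3x3xf32>)
    outs(%init_nchw : tensor<1x16x30x30xf32>) -> tensor<1x16x30x30xf32>

Multiplying this pattern across layouts, ranks and quantized variants is exactly the explosion we are discussing.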


Quantization

It seems that the current “zero point offset” in Linalg is basically the “weight zero-point” from TOSA. So it’s not really quantization as such, which aligns with what others have already pointed out.

Based on feedback, I think renaming this “field” would be sufficient for now (_q_wzp/_zpo?), and we can park this discussion. While quantization is certainly important, it could lead to scope creep here.

And yes, I’d include the variant with the offset in the new Op.
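
For concreteness, this is roughly what the existing quantized named variant looks like today; it takes both an input and a filter/weight zero point as extra scalar operands (shapes are illustrative only):

// i8 operands accumulated into i32; %izp and %fzp are the zero points.
%0 = linalg.conv_2d_nhwc_hwcf_q
    {dilations = dense<1> : tensor<2xi64>, strides = dense<1> : tensor<2xi64>}
    ins(%input, %filter, %izp, %fzp : tensor<1x32x32x8xi8>, tensor<3x3x8x16xi8>, i32, i32)
    outs(%acc : tensor<1x30x30x16xi32>) -> tensor<1x30x30x16xi32>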


Op Format

For reference, and for the benefit of others, here is an example extracted from test/Dialect/Rock/ops.mlir:

rock.conv(%filter, %input, %output) features = none {
  arch = "amdgcn-amd-amdhsa:gfx906",
  filter_layout = ["g", "k", "c", "0", "1"],
  input_layout = ["n", "gi", "c", "0i", "1i"],
  output_layout = ["n", "go", "k", "0o", "1o"],
  dilations = [1 : index, 1 : index],
  strides = [1 : index, 1 : index],
  padding = [0 : index, 0 : index, 0 : index, 0 : index]
} : memref<?x?x?x?x?xf16>, memref<?x?x?x?x?xf16>, memref<?x?x?x?x?xf16>

It includes all the necessary info, which is a plus! But no, it's not pretty :sweat_smile:
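
Purely as a strawman - the op name and attribute names below are made up, not an existing op - a Linalg-side op carrying the same information could look something like:

// Hypothetical sketch: layouts, strides, dilations and padding as attributes
// on a single op, instead of being encoded in the op name.
%0 = linalg.conv %input, %filter, %init {
  input_layout = ["n", "h", "w", "c"],
  filter_layout = ["h", "w", "c", "f"],
  output_layout = ["n", "h", "w", "f"],
  strides = [1, 1],
  dilations = [1, 1],
  padding = [0, 0, 0, 0]
} : tensor<1x32x32x8xf32>, tensor<3x3x8x16xf32>, tensor<1x30x30x16xf32> -> tensor<1x30x30x16xf32>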


Linalg Refactor

I believe we all agree on this point (i.e. your proposal to keep “conv” as a structured Op).

Controversial question—should we discuss this here, or leave it as Step 2 (for the future) after addressing the current issues? :sweat_smile:

Padding is essential and we need a solution, but addressing too many things at once could be counterproductive. That said, I agree that whatever we design should enable (rather than block) future enhancements. Any specific thoughts on how to incorporate padding into convs?
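
For context on the status quo: padding is currently materialized explicitly before the conv via tensor.pad (this is what e.g. the TOSA lowering emits, IIRC), so the design question is whether the new op should absorb it as an attribute instead. A minimal example with illustrative shapes:

%cst = arith.constant 0.0 : f32
// Pad the two spatial dims by 1 on each side, then run the pad-free conv.
%padded = tensor.pad %input low[0, 1, 1, 0] high[0, 1, 1, 0] {
^bb0(%i: index, %j: index, %k: index, %l: index):
  tensor.yield %cst : f32
} : tensor<1x30x30x8xf32> to tensor<1x32x32x8xf32>
%0 = linalg.conv_2d_nhwc_hwcf
    {dilations = dense<1> : tensor<2xi64>, strides = dense<1> : tensor<2xi64>}
    ins(%padded, %filter : tensor<1x32x32x8xf32>, tensor<3x3x8x16xf32>)
    outs(%init : tensor<1x30x30x16xf32>) -> tensor<1x30x30x16xf32>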

Would changing the interface help reduce the explosion of Ops? This isn’t clear to me.

Yes, but we should work toward unifying that and making the vectorizer work for permutations beyond projections. To my knowledge, this hasn’t been prioritized.


CFA

We should try to prioritize this effort. I am available to help.

Thanks,
-Andrzej

EDIT (3/11): Added ONNX examples from @rengolin.