[RFC] Op explosion in Linalg

Source of truth for Linalg

We should aim for compatibility across the frameworks we plan to support. Are TOSA, PyTorch (torch-mlir), StableHLO and ONNX sufficient?

Here are relevant docs and ops I’ve found so far:

This list is meant to be representative but non-exhaustive (e.g., 3D pooling in PyTorch is omitted).

  • Please let me know if I missed an important group of Ops or a framework we should examine.
  • Should we try to scope this effort and limit it to the frameworks listed above? This would help avoid scope creep now and in the future.

1 Op vs N Ops

To build on my earlier argument in favor of multiple Ops (rather than a single convolution Op): this approach would also align with TOSA and PyTorch.

Having said that, it seems like an implementation detail - I would just prioritize whatever allows quicker progress.
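
For illustration, here are two of the named 2D convolution ops that Linalg already has today; the tensor layout is baked into the op name, while strides and dilations are attributes (the shapes below are just an example):

// Same computation, two named ops - the layout lives in the op name.
%0 = linalg.conv_2d_nhwc_hwcf
    {dilations = dense<1> : tensor<2xi64>, strides = dense<1> : tensor<2xi64>}
    ins(%input, %filter : tensor<1x32x32x8xf32>, tensor<3x3x8x16xf32>)
    outs(%init : tensor<1x30x30x16xf32>) -> tensor<1x30x30x16xf32>

%1 = linalg.conv_2d_nchw_fchw
    {dilations = dense<1> : tensor<2xi64>, strides = dense<1> : tensor<2xi64>}
    ins(%input_nchw, %filter_fchw : tensor<1x8x32x32xf32>, tensor<16x8x3x3xf32>)
    outs(%init_nchw : tensor<1x16x30x30xf32>) -> tensor<1x16x30x30xf32>

Multiplying this pattern across layouts, ranks and quantized variants is exactly the explosion we are discussing.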


Quantization

It seems that the current “zero point offset” in Linalg is basically the “weight zero-point” from TOSA. So it’s not really quantization as such, which aligns with what others have already pointed out.

Based on feedback, I think renaming this “field” would be sufficient for now (_q_wzp/_zpo?), and we can park this discussion. While quantization is certainly important, it could lead to scope creep here.

And yes, I’d include the variant with the offset in the new Op.
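
For concreteness, this is roughly what the existing quantized named variant looks like today; it takes both an input and a filter/weight zero point as extra scalar operands (shapes are illustrative only):

// i8 operands accumulated into i32; %izp and %fzp are the zero points.
%0 = linalg.conv_2d_nhwc_hwcf_q
    {dilations = dense<1> : tensor<2xi64>, strides = dense<1> : tensor<2xi64>}
    ins(%input, %filter, %izp, %fzp : tensor<1x32x32x8xi8>, tensor<3x3x8x16xi8>, i32, i32)
    outs(%acc : tensor<1x30x30x16xi32>) -> tensor<1x30x30x16xi32>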


Op Format

For reference, and for the benefit of others, here is an example extracted from test/Dialect/Rock/ops.mlir:

rock.conv(%filter, %input, %output) features = none {
  arch = "amdgcn-amd-amdhsa:gfx906",
  filter_layout = ["g", "k", "c", "0", "1"],
  input_layout = ["n", "gi", "c", "0i", "1i"],
  output_layout = ["n", "go", "k", "0o", "1o"],
  dilations = [1 : index, 1 : index],
  strides = [1 : index, 1 : index],
  padding = [0 : index, 0 : index, 0 : index, 0 : index]
} : memref<?x?x?x?x?xf16>, memref<?x?x?x?x?xf16>, memref<?x?x?x?x?xf16>

It includes all the necessary info, which is a plus! But no, it's not pretty :sweat_smile:
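
Purely as a strawman - the op name and attribute names below are made up, not an existing op - a Linalg-side op carrying the same information could look something like:

// Hypothetical sketch: layouts, strides, dilations and padding as attributes
// on a single op, instead of being encoded in the op name.
%0 = linalg.conv %input, %filter, %init {
  input_layout = ["n", "h", "w", "c"],
  filter_layout = ["h", "w", "c", "f"],
  output_layout = ["n", "h", "w", "f"],
  strides = [1, 1],
  dilations = [1, 1],
  padding = [0, 0, 0, 0]
} : tensor<1x32x32x8xf32>, tensor<3x3x8x16xf32>, tensor<1x30x30x16xf32> -> tensor<1x30x30x16xf32>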


Linalg Refactor

I believe we all agree on this point (i.e. your proposal to keep “conv” as a structured Op).

Controversial question—should we discuss this here, or leave it as Step 2 (for the future) after addressing the current issues? :sweat_smile:

Padding is essential and we need a solution, but addressing too many things at once could be counterproductive. That said, I agree that whatever we design should enable (rather than block) future enhancements. Any specific thoughts on how to incorporate padding into convs?
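
For context on the status quo: padding is currently materialized explicitly before the conv via tensor.pad (this is what e.g. the TOSA lowering emits, IIRC), so the design question is whether the new op should absorb it as an attribute instead. A minimal example with illustrative shapes:

%cst = arith.constant 0.0 : f32
// Pad the two spatial dims by 1 on each side, then run the pad-free conv.
%padded = tensor.pad %input low[0, 1, 1, 0] high[0, 1, 1, 0] {
^bb0(%i: index, %j: index, %k: index, %l: index):
  tensor.yield %cst : f32
} : tensor<1x30x30x8xf32> to tensor<1x32x32x8xf32>
%0 = linalg.conv_2d_nhwc_hwcf
    {dilations = dense<1> : tensor<2xi64>, strides = dense<1> : tensor<2xi64>}
    ins(%padded, %filter : tensor<1x32x32x8xf32>, tensor<3x3x8x16xf32>)
    outs(%init : tensor<1x30x30x16xf32>) -> tensor<1x30x30x16xf32>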

Would changing the interface help reduce the explosion of Ops? This isn’t clear to me.

Yes, but we should work toward unifying that and making the vectorizer work for permutations beyond projections. To my knowledge, this hasn’t been prioritized.


CFA

We should try to prioritize this effort. I am available to help.

Thanks,
-Andrzej

EDIT (3/11): Added ONNX examples from @rengolin.