Vectorization of convolution op

messadki · February 23, 2022, 10:20am

Hi Everyone,
I was trying to vectorize linalg.conv_2d but I got the following error:

mlir-opt: /stck/moessadk/Workspace/MLIR/new_llvm/llvm-project/mlir/lib/IR/AffineMap.cpp:683: mlir::AffineMap mlir::inverseAndBroadcastProjectedPermuation(mlir::AffineMap): Assertion `map.isProjectedPermutation( true)' failed.
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace.
Stack dump:...

could you explain please what does mean Projected permuation map? is it a mapping which reduces the rank (for example (i,j)->i) ?

The code that I used is

func @compute_rhs(%arg0: memref<50x50xf64>, %arg1: memref<48x48xf64>, %arg2: memref<3x3xf64>, %arg3: f64, %arg4: f64) -> memref<48x48xf64> {
   linalg.conv_2d ins(%arg0, %arg2 : memref<50x50xf64>, memref<3x3xf64>) outs(%arg1 : memref<48x48xf64>)
    return %arg1 : memref<48x48xf64>
  }

and I used the following codegen strategy
mlir-opt -test-linalg-codegen-strategy=“anchor-func=compute_rhs3 anchor-op=linalg.conv_2d tile-sizes=1,4 vectorize”

@nicolasvasilache @ftynse

nicolasvasilache · February 23, 2022, 10:25am

Can you please file a bug on github?
This should fail graciously and not crash.

Generally to vectorize a 2-d conv, you tile on certain sizes by 1 (i.e. the H and KH dimensions) and then use linalg-strategy-decompose-pass to go to a 1-d conv (normal or depthwise).

The 1-d conv is the one that vectorizes directly.

messadki · February 23, 2022, 1:23pm

I reported the issue in vectorization of linalg.conv_2d crash · Issue #54020 · llvm/llvm-project · GitHub

Could you elaborate further how to tile correctly. I try the following options
mlir-opt -test-linalg-codegen-strategy=“anchor-func=compute_rhs3 anchor-op=linalg.conv_2d tile-sizes=1,1 decompose”
to convert to 1-d convolution. I tested other tilings (1,m) and (m,1) before “decompose” but I did not succeed to convert conv_2d to conv_1d.

antiagainst · February 23, 2022, 2:10pm

Convolution differs from matmul in the sense that it has more complicated indexing map for accessing the Input:

github.com

llvm/llvm-project/blob/d50571a/mlir/include/mlir/Dialect/Linalg/IR/LinalgNamedStructuredOps.yaml#L1045-L1046


      
          - affine_map<(d0, d1, d2, d3, d4, d5, d6)[s0, s1, s2, s3, s4, s5, s6, s7, s8,
            s9, s10] -> (d0, d1 * s2 + d4 * s4, d2 * s6 + d5 * s8, d6)>

You can see in the above there are multiplications and additions among various dimensions and symbols.

While for matmul, we have (i, j, k) -> (i, k) for A, (i, j, k) -> (k, j) for B, and (i, j, k) -> (i, j) for C.

The indexing map for matmul in the above is projected permutations, but not for the convolution Input access.

The way we generate code in MLIR for convolution is trying to decompose it progressively to a reduction like matmul. On observation we have regarding convolution is that if it’s a convolution with filter window sizes all equal to one, then it’s a matmul. So that gives us hints as for how to decompose the problem: driving the window dimensions to ones—

First you’d want to tile the convolution along N, OH, OW, OC to a smaller scale and make sure at least one of OH and OW is one. Then you’d want to tile along KH and KW to make sure at least one of them is one. Then you can use the decompose pattern

github.com

llvm/llvm-project/blob/d50571ab07e1fc1761cf2a884459fe4892ec75f1/mlir/lib/Dialect/Linalg/Transforms/Transforms.cpp#L1118-L1124


      
          void linalg::populateDecomposeConvolutionPatterns(
              RewritePatternSet &patterns, const LinalgTransformationFilter &filter,
              PatternBenefit benefit) {
            patterns.add<DownscaleSizeOneWindowed2DConvolution,
                         DownscaleDepthwiseConv2DNhwcHwcOp>(patterns.getContext(), filter,
                                                            benefit);
          }

to convert the 2-D convolution into a 1-D one and then vectorize.

For example, for the following convolution:

func @conv(%input: tensor<1x225x225x3xf32>, %filter: tensor<3x3x3x32xf32>, %init: tensor<1x112x112x32xf32>) -> tensor<1x112x112x32xf32> {
  %0 = linalg.conv_2d_nhwc_hwcf
    {dilations = dense<1> : tensor<2xi64>, strides = dense<2> : tensor<2xi64>}
    ins(%input, %filter : tensor<1x225x225x3xf32>, tensor<3x3x3x32xf32>)
    outs(%init : tensor<1x112x112x32xf32>)
  -> tensor<1x112x112x32xf32>
  return %0: tensor<1x112x112x32xf32>
}

If you try

bin/mlir-opt -test-linalg-codegen-strategy="anchor-op=linalg.conv_2d_nhwc_hwcf tile-sizes=0,1,8,8,1 decompose vectorize" conv.mlir

It’ll properly tile and decompose and vectorize.

Note in the above the tile sizes 0,1,8,8,1 follows iterator definition order for linalg.conv_2d_nhwc_hwcf:

github.com

llvm/llvm-project/blob/d50571ab07e1fc1761cf2a884459fe4892ec75f1/mlir/python/mlir/dialects/linalg/opdsl/ops/core_named_ops.py#L258-L259


      
          implements(ConvolutionOpInterface)
          domain(D.n, D.oh, D.ow, D.f, D.kh, D.kw, D.c)

which is tiling along OH and KH as one.

messadki · February 23, 2022, 3:23pm

thank you @antiagainst for your clarification. I am able now to correctly vectorize linalg.conv_2d_nhwc_hwcf but not linalg.conv_2d (convolution version without channels). I looked little bit to the code, it seems that there is no pattern rewrite to down-scale conv_2d op to conv_1d op!!

antiagainst · February 23, 2022, 3:37pm

Yup. Keep in mind that lots of the features in MLIR are developed on concrete needs. We haven’t run into a case where we need the normal linalg.conv_2d thus far. If you need it, please feel free to send patches to wire it up! It shouldn’t be too hard with the fully working example of linalg. conv_2d_nhwc_hwcf. You can reuse the code there to a large extent I think.

Topic		Replies	Views
How to optimize linalg.conv2d MLIR	5	303	February 8, 2024
Understanding Vectorization Failure with transform.structured.vectorize MLIR mlir	4	117	June 25, 2024
[RFC] Linalg op flattening MLIR	3	188	February 14, 2024
[MLIR][Linalg] Vectorization Fail with reduction MLIR	3	114	June 29, 2024
After Linalg tiling can't lower to llvm MLIR	4	383	November 1, 2023

Vectorization of convolution op

Related topics