This is a followup on this question http://discourse.llvm.org/t/window-iterator-type-for-linalg-genericop/3967
I’m attempting to represent a horizontal blur operation using linalg.generic. The IR I am currently generating is this:
#map0 = affine_map<(d0, d1) -> (d0, d1 - 1)>
#map1 = affine_map<(d0, d1) -> (d0, d1)>
#map2 = affine_map<(d0, d1) -> (d0, d1 + 1)>
module @global_scope {
  func @sumKernel(%arg0: tensor<1x3xi16>) -> i16 {
    %c0 = constant 0 : index
    %c1 = constant 1 : index
    %c2 = constant 2 : index
    %0 = tensor.extract %arg0[%c0, %c0] : tensor<1x3xi16>
    %1 = tensor.extract %arg0[%c0, %c1] : tensor<1x3xi16>
    %2 = addi %0, %1 : i16
    %3 = tensor.extract %arg0[%c0, %c2] : tensor<1x3xi16>
    %4 = addi %2, %3 : i16
    return %4 : i16
  }
  func @main(%arg0: tensor<480x640xi16>) -> tensor<480x640xi16> {
    %c0_i16 = constant 0 : i16
    %0 = linalg.pad_tensor %arg0 low[0, 1] high[0, 1] {
    ^bb0(%arg1: index, %arg2: index): // no predecessors
      linalg.yield %c0_i16 : i16
    } : tensor<480x640xi16> to tensor<480x642xi16>
    %1 = splat %c0_i16 : tensor<480x640xi16>
    %2 = linalg.generic {indexing_maps = [#map0, #map1, #map2, #map1], iterator_types = ["parallel", "window"]} ins(%0, %0, %0 : tensor<480x642xi16>, tensor<480x642xi16>, tensor<480x642xi16>) outs(%1 : tensor<480x640xi16>) {
    ^bb0(%arg1: i16, %arg2: i16, %arg3: i16, %arg4: i16): // no predecessors
      %3 = tensor.from_elements %arg1, %arg2, %arg3 : tensor<3xi16>
      %c1 = constant 1 : index
      %c3 = constant 3 : index
      %4 = tensor.from_elements %c1, %c3 : tensor<2xindex>
      %5 = tensor.reshape %3(%4) : (tensor<3xi16>, tensor<2xindex>) -> tensor<1x3xi16>
      %6 = call @sumKernel(%5) : (tensor<1x3xi16>) -> i16
      linalg.yield %6 : i16
    } -> tensor<480x640xi16>
    return %2 : tensor<480x640xi16>
  }
}
This seems more or less in keeping with the answer from my previous post, except that it uses statically sized tensors. The intent of the above example is to produce a new tensor in which each element is the sum of a 1x3 window in the input tensor. It first pads the input tensor with zeros to handle the boundary conditions.
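To make the intent concrete, here is a minimal reference sketch of the computation in plain Python. It is only an illustration of the semantics I am after, not generated code: the function name `horizontal_blur_sum` is hypothetical, and it uses unbounded Python integers, so i16 overflow is ignored.

```python
def horizontal_blur_sum(image):
    """Sum each 1x3 horizontal window of `image`, zero-padding left and right."""
    out = []
    for row in image:
        # One zero on each side, mirroring pad_tensor low[0, 1] high[0, 1].
        padded = [0] + list(row) + [0]
        # Output column j reads padded columns j, j + 1 and j + 2.
        out.append([padded[j] + padded[j + 1] + padded[j + 2]
                    for j in range(len(row))])
    return out


if __name__ == "__main__":
    print(horizontal_blur_sum([[1, 2, 3, 4]]))  # [[3, 6, 9, 7]]
```

The output has the same shape as the unpadded input, which is why the padded tensor is 480x642 while the result is 480x640.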
The problem I run into is that the linalg.generic operation above seems to fail to legalize. The problem seems related to the supplied affine maps. This is the error given with the example above:
error: 'linalg.generic' op unexpected result less than 0 at expression #1 in (d0, d1) -> (d0, d1 - 1)
The error seems related to the process of inferring iteration space dimensions. Suppose I change the indexing maps to:
#map0 = affine_map<(d0, d1) -> (d0, d1)>
#map1 = affine_map<(d0, d1) -> (d0, d1 + 1)>
#map2 = affine_map<(d0, d1) -> (d0, d1 + 2)>
This results in a different error:
error: 'linalg.generic' op inferred input/output operand #1 has shape's dimension #1 to be greater than or equal to 643, but found 642
I’ve tried other variants on this with different indexing maps given in different orders, but the end result is one of these two errors.
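Both messages are consistent with a simple bounds check, sketched below in Python. This is a rough model inferred only from the error text, not the actual verifier logic. It assumes that the extent of d1 is taken from the first operand whose map uses plain d1, and that the smallest and largest value of each map result must then fit inside the corresponding operand dimension. The function name `first_violation` and the `(offset, size)` encoding are hypothetical.

```python
def first_violation(operands):
    """Return a message for the first out-of-bounds operand, or None.

    `operands` is a list of (offset, size) pairs, one per operand: the map
    result `d1 + offset` indexes a dimension of length `size`.
    """
    # Assumption: the loop extent of d1 comes from the first operand
    # that is indexed by plain d1 (offset 0).
    extent = next(size for offset, size in operands if offset == 0)
    for i, (offset, size) in enumerate(operands):
        lowest = 0 + offset              # map result at d1 = 0
        highest = (extent - 1) + offset  # map result at d1 = extent - 1
        if lowest < 0:
            return f"operand #{i}: index {lowest} is less than 0"
        if highest >= size:
            return f"operand #{i}: dimension must be >= {highest + 1}, but found {size}"
    return None


if __name__ == "__main__":
    # Maps d1 - 1, d1, d1 + 1 on the 642-wide input, d1 on the 640-wide output.
    print(first_violation([(-1, 642), (0, 642), (1, 642), (0, 640)]))
    # Maps d1, d1 + 1, d1 + 2 on the 642-wide input, d1 on the 640-wide output.
    print(first_violation([(0, 642), (1, 642), (2, 642), (0, 640)]))
```

Under this model the first set of maps fails because d1 - 1 evaluates to -1 at d1 = 0, and the second fails because d1 + 1 reaches 642 when d1 runs over all 642 padded columns. The offsets 0, 1, 2 would only fit if the extent of d1 were 640 rather than 642.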
Is there anything obviously wrong with what I’m doing in this example? Or am I running into some other limitation?