I found that the output of mlir/test/Dialect/Linalg/drop-unit-extent-dims.mlir contains a linalg.generic that performs a summation, but its iterator type is still parallel:
This test case is a bit confusing in the sense that the remaining input dimension has size “?” while the output dimension has size 1. However, since both the input and the output tensor are one-dimensional, we cannot perform a reduction here. Instead, it is a normal parallel iteration, which implies the size of the input tensor is actually 1 (so “?” == 1).
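To illustrate (a hedged sketch of the kind of op being discussed, not the exact test output; all SSA names and shapes here are made up): both indexing maps are the identity, so the single iteration dimension is tied to both the dynamic input size and the static output size 1, which forces “?” == 1.

```mlir
// Sketch only: a 1-D "summation" body whose iterator is nevertheless
// parallel. Because both operands are indexed by the identity map, the
// iteration domain is shared, so the tensor<?xf32> must have size 1.
#map = affine_map<(d0) -> (d0)>
%res = linalg.generic
    {indexing_maps = [#map, #map], iterator_types = ["parallel"]}
    ins(%in : tensor<?xf32>) outs(%out : tensor<1xf32>) {
  ^bb0(%a: f32, %b: f32):
    %sum = arith.addf %a, %b : f32
    linalg.yield %sum : f32
} -> tensor<1xf32>
```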
A reduction is only possible if the output tensor has fewer dimensions than the input tensor. In particular, we cannot index the output tensor with a reduction dimension.
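For contrast, a genuine reduction looks roughly like this (again a sketch with made-up names): the reduction dimension d0 appears in the input's indexing map but is absent from the output's map, so the output has fewer dimensions than the input.

```mlir
// Sketch only: a real 1-D -> 0-D sum reduction. The output map drops
// the reduction dimension d0 entirely, so %b accumulates across
// iterations instead of being indexed in parallel.
#id   = affine_map<(d0) -> (d0)>
#none = affine_map<(d0) -> ()>
%res = linalg.generic
    {indexing_maps = [#id, #none], iterator_types = ["reduction"]}
    ins(%in : tensor<?xf32>) outs(%acc : tensor<f32>) {
  ^bb0(%a: f32, %b: f32):
    %sum = arith.addf %a, %b : f32
    linalg.yield %sum : f32
} -> tensor<f32>
```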
I hope this helps clarify things?
PS: Also note that in the test input there are two reduction dimensions that both have size one and as a result are dropped…
> A reduction is only possible if the output tensor has fewer dimensions than the input tensor. In particular, we cannot index the output tensor with a reduction dimension.
It makes sense to me, but I am still not convinced, because the loop body adds %arg1 and %arg2, where the latter is the result of the previous iteration.
Or, is this a special case that is allowed when the input and output tensors have dimensions of size one?
You can read from an output tensor if it contains values (otherwise it is undefined behavior). In the absence of a reduction iterator, you will read the initial value stored in the output tensor. Otherwise, you read the initial value in the first iteration of the reduction and the accumulated value in later iterations. In the example, the output tensor is filled with ones, meaning the operation adds one to the value in the input tensor.
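A sketch of that scenario (names and shapes are assumptions, not the actual test IR): the output is filled with 1.0 before the generic runs, and with a parallel iterator the body reads that initial value, so the op effectively computes %in + 1.0 elementwise.

```mlir
// Sketch only: initialize the destination with 1.0, then run a
// parallel generic. %b in the body is the initial value from %init,
// not an accumulator, because no iterator is a reduction.
%c1   = arith.constant 1.0 : f32
%init = linalg.fill ins(%c1 : f32) outs(%out : tensor<1xf32>) -> tensor<1xf32>
#map = affine_map<(d0) -> (d0)>
%res = linalg.generic
    {indexing_maps = [#map, #map], iterator_types = ["parallel"]}
    ins(%in : tensor<1xf32>) outs(%init : tensor<1xf32>) {
  ^bb0(%a: f32, %b: f32):          // %b reads the initial value 1.0
    %sum = arith.addf %a, %b : f32
    linalg.yield %sum : f32
} -> tensor<1xf32>
```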
However, the example is a bit contrived, since the original operation was actually performing a reduction of size one. A parallel operation would normally read from the input tensors only.