Semantics of linalg MatmulOp

Consider a linalg MatmulOp like this:
```mlir
module {
  func.func @test_matmul(%arg0: tensor<1x8192x8192xf32>, %arg1: tensor<1x8192x8192xf32>) -> tensor<1x8192x8192xf32> {
    %cst = arith.constant 0.000000e+00 : f32
    %0 = tensor.empty() : tensor<1x8192x8192xf32>
    %1 = linalg.fill ins(%cst : f32) outs(%0 : tensor<1x8192x8192xf32>) -> tensor<1x8192x8192xf32>
    %2 = linalg.batch_matmul ins(%arg0, %arg1 : tensor<1x8192x8192xf32>, tensor<1x8192x8192xf32>) outs(%1 : tensor<1x8192x8192xf32>) -> tensor<1x8192x8192xf32>
    return %2 : tensor<1x8192x8192xf32>
  }
}
```
If %1 were instead filled with a tensor<1x8192x8192xf32> holding non-zero initial values, does linalg.batch_matmul just compute the matmul of %arg0 and %arg1 to produce %2, or does it compute the matmul and also add %1 to produce %2?

The semantics here are accumulation, not a separate addition. Basically C += A x B.

If the init tensor (%1) is initialized to all zeros (as in your case above), then the accumulation happens on zero-initialized memory and the result is just %2 = %arg0 x %arg1, or C = A x B (technically C += A x B with C = 0).

If the init is non-zero, then the accumulation happens on non-zero memory, which “adds” (by accumulating) onto the existing values, essentially computing %2 = %1 + %arg0 x %arg1, or C += A x B.

Note that there is no separate add op after the matmul; it’s just accumulation into a pre-existing tensor.
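The two cases above can be sketched numerically. This is a minimal NumPy model of the accumulation semantics, not the actual linalg implementation; the function name is made up for illustration:

```python
import numpy as np

def batch_matmul_accumulate(A, B, C_init):
    """Model of linalg.batch_matmul: the outs operand is read and
    accumulated into, i.e. C[b] += A[b] @ B[b] for each batch b."""
    return C_init + np.einsum("bij,bjk->bik", A, B)

rng = np.random.default_rng(0)
A = rng.standard_normal((1, 4, 4)).astype(np.float32)
B = rng.standard_normal((1, 4, 4)).astype(np.float32)

# Zero init (like linalg.fill with 0.0): result is plain C = A x B.
C_zero = batch_matmul_accumulate(A, B, np.zeros((1, 4, 4), np.float32))
assert np.allclose(C_zero, A @ B)

# Non-zero init: result is init + A x B, i.e. C += A x B.
init = np.ones((1, 4, 4), np.float32)
C_acc = batch_matmul_accumulate(A, B, init)
assert np.allclose(C_acc, init + A @ B)
```

So the only difference between the two cases is the contents of the outs tensor; the op itself always accumulates.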

Thanks for your reply!