linalg.generic: issue understanding indexing map semantics

I’ve been trying to understand this code snippet with a linalg.generic operation for quite a while now:

%1 = arith.constant dense<
        [[[10, 20]]]> : tensor<1x1x2xi64>
%2 = arith.constant dense<-1> : tensor<1xi64>
%3 = arith.constant dense<
        [[[0], 
          [-2]], 
         [[-1], 
          [0]]]> : tensor<2x2x1xi64>
    
%out = linalg.generic {indexing_maps = [
            affine_map<(d0, d1) -> (d1, d1, d0)>, // accesses %1[0,0,0], %1[0,0,1]
            affine_map<(d0, d1) -> (d1)>,         // accesses %2[0],     %2[0]
            affine_map<(d0, d1) -> (d0, d0, d1)>  // accesses %3[0,0,0], %3[1,1,0] 
            ], iterator_types = ["reduction", "reduction"]} 
        ins(%1, %2 : tensor<1x1x2xi64>, tensor<1xi64>) 
        outs(%3 : tensor<2x2x1xi64>) {
    ^bb0(%in: i64, %in_5: i64, %out: i64):
      linalg.yield %in : i64
    } -> tensor<2x2x1xi64>
    // expected:
    // [[[10], 
    //   [-2]],
    //  [[-1], 
    //   [20]]]
    // ...right?

I’ve traced through the steps by hand; the comments next to each indexing map show which elements I expect it to access. The operation is essentially meant to overwrite the diagonal of one tensor with the values of another.
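The hand trace can be reproduced with a small Python model. This is only a sketch of the mental model described here, not of what MLIR guarantees. It assumes loop bounds of d0 < 2 and d1 < 1 (read off the operand shapes), and it assumes that elements of %3 that are never written keep their initial values. The names `bounds`, `in1`, `out` and `written` are made up for illustration.

```python
from itertools import product

# Assumed loop bounds, read off the operand shapes: d0 in [0, 2), d1 in [0, 1).
bounds = (2, 1)

in1 = [[[10, 20]]]                  # %1 : tensor<1x1x2xi64>
out = [[[0], [-2]], [[-1], [0]]]    # %3 : tensor<2x2x1xi64>, initial values

written = []
for d0, d1 in product(*(range(b) for b in bounds)):
    value = in1[d1][d1][d0]         # input map:  (d0, d1) -> (d1, d1, d0)
    out[d0][d0][d1] = value         # output map: (d0, d1) -> (d0, d0, d1)
    written.append((d0, d0, d1))

print(written)  # [(0, 0, 0), (1, 1, 0)]
print(out)      # [[[10], [-2]], [[-1], [20]]]
```

Under those assumptions the model yields exactly the expected tensor from the comment in the snippet, so the trace of the indexing maps itself looks consistent.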

However, when I execute the code and print the result (%out in the snippet above), I get the following instead:

[[[10], 
  [-4702111234474983746]], 
 [[-4702111234474983746], 
  [20]]]

Those huge negative numbers appeared out of nowhere! That is not what I expected, so where did my understanding go wrong?

The full code for printing the outputs is here: Compiler Explorer
It can be run with:
mlir-opt file.mlir -one-shot-bufferize -func-bufferize -cse -canonicalize -convert-vector-to-scf -test-lower-to-llvm | mlir-cpu-runner -e main --entry-point-result void --shared-libs="lib/libmlir_c_runner_utils.so,lib/libmlir_runner_utils.so"

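I believe there is an implicit convention that linalg always defines the full output and does not preserve the original values. That is, the range of the result indexing map must cover the entire coordinate space of the result. This convention is violated here: the result map (d0, d1) -> (d0, d0, d1) only reaches the two diagonal elements [0,0,0] and [1,1,0], so the off-diagonal elements of the result are never defined, which would explain the garbage values at exactly those positions.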
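As a rough way to check this coverage condition by hand, the following Python sketch enumerates the iteration space and lists the result coordinates that the result map never produces. The helper `uncovered` is hypothetical and not part of MLIR, and the loop bounds (d0 < 2, d1 < 1) are assumed from the operand shapes in the question.

```python
from itertools import product

def uncovered(shape, index_map, bounds):
    """Return the coordinates of `shape` that `index_map` never produces."""
    # Every coordinate the map reaches while iterating over the loop bounds.
    hit = {index_map(*point) for point in product(*(range(b) for b in bounds))}
    # Every coordinate that exists in a tensor of the given shape.
    every = set(product(*(range(n) for n in shape)))
    return sorted(every - hit)

# Result map from the question: (d0, d1) -> (d0, d0, d1)
missing = uncovered((2, 2, 1), lambda d0, d1: (d0, d0, d1), (2, 1))
print(missing)  # [(0, 1, 0), (1, 0, 0)]
```

The two coordinates reported as missing are the same two positions that hold the huge negative numbers in the printed result.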

Interesting - that’s something I didn’t know about. Thank you very much!