I’d like to understand the semantics of `memref.reinterpret_cast`, and currently I’m struggling with the following example, which I’ve formatted as a lit test. In summary, what I’d like to express is:
- Define a `4xi32` alloc.
- Fill it with data so that it looks like `%alloc_0 = [0, 1, 2, 3]`.
- Reinterpret-cast it into a new memref with an offset of 1 (effectively, it should be a view of `%alloc_0` like `%alloc_1 = [1, 2, 3]`).
- Load index 1 of `%alloc_1`.
- Confirm the result is 2.
```mlir
// RUN: mlir-opt %s -pass-pipeline=" \
// RUN: builtin.module(lower-affine, \
// RUN: normalize-memrefs, \
// RUN: finalize-memref-to-llvm, \
// RUN: func.func(convert-scf-to-cf, convert-arith-to-llvm), \
// RUN: convert-func-to-llvm, \
// RUN: reconcile-unrealized-casts)" \
// RUN: | mlir-cpu-runner -e main -entry-point-result=i32 > %t
// RUN: FileCheck %s < %t

// CHECK: 2

func.func @main() -> i32 {
  %c0 = arith.constant 0 : index
  %c1 = arith.constant 1 : index
  %c2 = arith.constant 2 : index
  %c3 = arith.constant 3 : index
  %c0_i32 = arith.constant 0 : i32
  %c1_i32 = arith.constant 1 : i32
  %c2_i32 = arith.constant 2 : i32
  %c3_i32 = arith.constant 3 : i32
  %alloc_0 = memref.alloc() : memref<4xi32>
  affine.store %c0_i32, %alloc_0[%c0] : memref<4xi32>
  affine.store %c1_i32, %alloc_0[%c1] : memref<4xi32>
  affine.store %c2_i32, %alloc_0[%c2] : memref<4xi32>
  affine.store %c3_i32, %alloc_0[%c3] : memref<4xi32>
  %alloc_1 = memref.reinterpret_cast %alloc_0 to
      offset: [1], sizes: [4], strides: [1]
      : memref<4xi32> to memref<4xi32, strided<[1], offset: 1>>
  %arg0 = arith.constant 1 : index
  %1 = affine.load %alloc_1[%arg0] : memref<4xi32, strided<[1], offset: 1>>
  return %1 : i32
}
```
When I run this, I get the following error, which I assume comes from the verifier:

```
error: expected result type with size = 4 instead of 5 in dim = 0
%alloc_1 = memref.reinterpret_cast
%alloc_0 to offset: [1], sizes: [4], strides: [1]
: memref<4xi32> to memref<4xi32, strided<[1], offset: 1>>
```
If I change the offset to zero, the above compiles and runs without a problem, outputting `1`, as expected, because the reinterpret_cast is then a no-op. I have also tried a variety of other settings for the sizes and result memref type, and they all produce similar errors.
For example, I thought maybe the output memref type should be `3xi32`, since the offset means there are only three i32’s in its view. That produces this error:

```
error: expected result type with size = 3 instead of 4 in dim = 0
%alloc_1 = memref.reinterpret_cast
%alloc_0 to offset: [1], sizes: [3], strides: [1]
: memref<4xi32> to memref<3xi32, strided<[1], offset: 1>>
```
Similar problems occur when setting `sizes: [4]` while the output type is `3xi32`, and vice versa.
So now I’m at a loss. It seems like I completely misunderstand the semantics of this operation, or else there is a bug. As evidence of the latter: if I leave `sizes: [4]` and change the output memref size to something weird, like `9xi32`, it errors with `expected result type with size = 4 instead of 9 in dim = 0`, and likewise for `3xi32` it says `instead of 3`. But for BOTH `4xi32` and `5xi32` it thinks the actual dimension is 5 (`instead of 5`).
Could someone help me understand this situation?