In LLVM IR, the array type [a... x <b x T>] carries an alignment requirement on the inner vector type. By default, <b x T> is aligned to its size rounded up to the next power of 2 (e.g., <3 x T> is aligned to 4 * sizeof(T)). So [a... x b x T] generally does not have the same memory layout as [a... x <b x T>] unless b is a power of 2.
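To make the arithmetic concrete, here is a minimal Python sketch of the two strides. It is only a model of the rule described above (vector size rounded up to the next power of 2), not a query of a real LLVM data layout; the helper names are made up for illustration.

```python
def next_pow2(n: int) -> int:
    """Smallest power of two that is >= n (for n >= 1)."""
    return 1 << (n - 1).bit_length()


def packed_stride_bytes(b: int, elem_size: int) -> int:
    """Stride of one row in [a... x b x T]: elements are tightly packed."""
    return b * elem_size


def vector_stride_bytes(b: int, elem_size: int) -> int:
    """Stride of one <b x T> in [a... x <b x T>].

    Assumption: the vector is padded to its size rounded up to the next
    power of two, as described in the post.
    """
    return next_pow2(b) * elem_size


if __name__ == "__main__":
    # i32 elements (4 bytes each); the strides differ only when b is not a power of 2.
    for b in (2, 3, 4, 5, 8):
        print(b, packed_stride_bytes(b, 4), vector_stride_bytes(b, 4))
```

For b = 3 and i32 this gives 12 bytes for the packed row versus 16 bytes for the vector, which is the mismatch discussed in the rest of this post.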
This is a problem for the vector.type_cast operation. It casts a memref<a... x b x T> into a memref<vector<a... x b x T>> without touching the underlying data. In MLIR, vector<a... x b x T> is currently lowered to [a... x <b x T>] in LLVM. So if b is NOT a power of 2, the cast reinterprets tightly packed data using a padded layout, which causes very strange behavior.
Consider the following example:
memref.global "private" @gv0 : memref<2x3xi32> = dense<[[1, 2, 3], [4, 5, 6]]>
func.func @main() {
%mem0 = memref.get_global @gv0 : memref<2x3xi32>
%mem1 = vector.type_cast %mem0 : memref<2x3xi32> to memref<vector<2x3xi32>>
%v1 = memref.load %mem1[] : memref<vector<2x3xi32>>
%p12 = vector.extract %v1[1, 2] : vector<2x3xi32>
vector.print %p12 : i32
return
}
It prints 0 instead of 6. When this is translated into LLVM IR, LLVM assumes that each <3 x i32> inside a [2 x <3 x i32>] (%v1 here) is aligned to 16 bytes (4 * sizeof(i32)), so the vector.extract operation in the example is lowered to an access into a row that starts at @gv0 + 16 instead of @gv0 + 12, which gives the wrong output. I put the translated LLVM IR into godbolt to inspect the generated assembly and confirmed this.
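The offset mismatch can be reproduced with a small Python model of the global's bytes. This is a sketch under two assumptions that are not taken from real compiler output: the <3 x i32> rows are padded to 16 bytes as described above, and the bytes that follow the 24-byte global happen to be zero.

```python
import struct

COLS, ELEM = 3, 4  # memref<2x3xi32>: 3 columns, 4-byte elements

# The global as it sits in memory: six tightly packed little-endian i32 values.
data = struct.pack("<6i", 1, 2, 3, 4, 5, 6)
# Assumption: whatever follows the 24-byte global is zero-filled.
memory = data + bytes(8)


def read_i32(offset: int) -> int:
    """Read one little-endian i32 at the given byte offset."""
    return struct.unpack_from("<i", memory, offset)[0]


def packed_offset(i: int, j: int) -> int:
    """Byte offset of element [i, j] in the packed memref layout."""
    return (i * COLS + j) * ELEM


def padded_offset(i: int, j: int) -> int:
    """Byte offset of element [i, j] if each row is a 16-byte-aligned <3 x i32>."""
    row_stride = 4 * ELEM  # assumed: 3 elements padded to 4
    return i * row_stride + j * ELEM


if __name__ == "__main__":
    for i, j in [(1, 0), (1, 2)]:
        print((i, j),
              "packed:", packed_offset(i, j), read_i32(packed_offset(i, j)),
              "padded:", padded_offset(i, j), read_i32(padded_offset(i, j)))
```

In this model, row 1 starts at byte 12 in the packed layout but at byte 16 in the padded one, so every element of row 1 is read 4 bytes too far, and element [1, 2] is read from byte 24, just past the end of the global.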
IMO, vector.type_cast should restrict its argument type to memref<a... x b x T> where b is a power of 2, instead of accepting arbitrary memrefs. But since I am not very familiar with the design and expected usage of vector.type_cast, I decided to post here for more opinions on this strange behavior before I start adding this check. Once we reach a consensus, I am happy to write and submit a patch to fix it.
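As a rough illustration of the check I have in mind, here is a Python model of the condition. It is not MLIR verifier code, and the function names are hypothetical; an actual patch would live in the op's C++ verifier.

```python
from typing import Sequence


def is_power_of_two(n: int) -> bool:
    """True for 1, 2, 4, 8, ...; False for zero, negatives, and everything else."""
    return n > 0 and (n & (n - 1)) == 0


def type_cast_shape_is_safe(shape: Sequence[int]) -> bool:
    """Model of the proposed restriction on the source memref shape.

    Hypothetical helper: accepts the shape only when the innermost
    dimension b is a power of 2, so the padded vector layout and the
    packed memref layout coincide.
    """
    if not shape:
        # Assumption: a rank-0 memref has no inner vector dimension to pad.
        return True
    return is_power_of_two(shape[-1])


if __name__ == "__main__":
    print(type_cast_shape_is_safe([2, 3]))  # the example above: rejected
    print(type_cast_shape_is_safe([2, 4]))  # accepted
```

Only the innermost dimension matters here, because the outer dimensions are lowered to plain LLVM arrays, which add no padding of their own.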