In LLVM IR, the type [a... x <b x T>] carries an alignment requirement for the inner vector type. By default, <b x T> is aligned to the next power of 2 (e.g., <3 x T> is aligned to 4 * sizeof(T)). So [a... x b x T] generally does not have the same memory layout as [a... x <b x T>] unless b is a power of 2.
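The size difference can be sketched in a few lines of Python. This is only a model of the default rule described above: the helper names are made up for illustration and are not LLVM APIs, and a target's data layout may specify vector alignment differently.

```python
def next_pow2(n: int) -> int:
    """Smallest power of 2 that is >= n (for n >= 1)."""
    return 1 << (n - 1).bit_length()


def flat_array_bytes(a: int, b: int, elem_bytes: int) -> int:
    # [a x [b x T]]: elements are tightly packed, no padding.
    return a * b * elem_bytes


def array_of_vectors_bytes(a: int, b: int, elem_bytes: int) -> int:
    # [a x <b x T>]: ASSUMPTION: each inner vector is padded to
    # next_pow2(b) elements, following the default rule in the text.
    return a * next_pow2(b) * elem_bytes


# b = 3 (not a power of 2): the two layouts differ.
print(flat_array_bytes(2, 3, 4), array_of_vectors_bytes(2, 3, 4))  # 24 32
# b = 4 (a power of 2): the two layouts have the same size.
print(flat_array_bytes(2, 4, 4), array_of_vectors_bytes(2, 4, 4))  # 32 32
```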
This causes a problem in the vector.type_cast operation. It casts a memref<a... x b x T> into a memref<vector<a... x b x T>> without touching the underlying data. In MLIR, vector<a... x b x T> is currently translated into [a... x <b x T>] in LLVM. So if b is NOT a power of 2, this cast causes very strange behavior.
Consider the following example:
memref.global "private" @gv0 : memref<2x3xi32> = dense<[[1, 2, 3], [4, 5, 6]]>
func.func @main() {
%mem0 = memref.get_global @gv0 : memref<2x3xi32>
%mem1 = vector.type_cast %mem0 : memref<2x3xi32> to memref<vector<2x3xi32>>
%v1 = memref.load %mem1[] : memref<vector<2x3xi32>>
%p12 = vector.extract %v1[1, 2] : vector<2x3xi32>
vector.print %p12 : i32
return
}
It prints 0 instead of 6 because, when translated into LLVM IR, LLVM assumes that each <3 x i32> inside a [2 x <3 x i32>] (%v1 here) is aligned to 16 bytes (4 * sizeof(i32)). The second <3 x i32> is then read starting at address @gv0 + 16 instead of @gv0 + 12, so the vector.extract operation in the example gives the wrong output. I put the translated LLVM IR into godbolt to inspect the generated assembly code and confirmed this.
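To make the address arithmetic concrete, here is a small Python sketch of the byte offsets for the 2x3 i32 example. It assumes the padded row stride of 16 bytes described above; the function names are hypothetical and only model the two layouts.

```python
ELEM_BYTES = 4        # sizeof(i32)
ROWS, COLS = 2, 3     # shape of memref<2x3xi32>


def packed_offset(i: int, j: int) -> int:
    # Layout of the memref data: rows are 3 * 4 = 12 bytes apart.
    return (i * COLS + j) * ELEM_BYTES


def padded_offset(i: int, j: int, padded_cols: int = 4) -> int:
    # ASSUMED layout of [2 x <3 x i32>]: each row is padded to
    # 4 elements, so rows are 4 * 4 = 16 bytes apart.
    return (i * padded_cols + j) * ELEM_BYTES


global_bytes = ROWS * COLS * ELEM_BYTES  # the global holds 24 bytes

# Start of the second row: 12 in the memref, 16 in the padded layout.
print(packed_offset(1, 0), padded_offset(1, 0))    # 12 16
# Last element [1, 2]: 20 in the memref, 24 in the padded layout.
print(packed_offset(1, 2), padded_offset(1, 2))    # 20 24
# The padded offset of [1, 2] lies past the end of the 24-byte global.
print(padded_offset(1, 2) >= global_bytes)         # True
```

Under this model, every element of the second row is read from an address 4 bytes too high, and the last one falls outside the global entirely.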
IMO, vector.type_cast should restrict its argument type to memref<a... x b x T> where b is a power of 2, instead of accepting an arbitrary memref. But since I am not very familiar with the design and expected usage of vector.type_cast, I decided to post here for more opinions on this strange behavior before adding this check. Once we reach a consensus, I am happy to write and submit a patch to fix it.
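For discussion, the proposed restriction could look roughly like the following Python sketch. It is not the actual verifier code and does not use any MLIR API; the function names are hypothetical, and it only models the rule as stated above (the innermost static dimension must be a power of 2).

```python
def is_power_of_two(n: int) -> bool:
    # A positive integer is a power of 2 iff it has exactly one bit set.
    return n > 0 and (n & (n - 1)) == 0


def type_cast_shape_is_accepted(shape: list[int]) -> bool:
    # Hypothetical check: accept memref<a... x b x T> only when the
    # innermost dimension b is a power of 2. Strides, dynamic
    # dimensions, and element types are deliberately ignored here.
    return len(shape) > 0 and is_power_of_two(shape[-1])


print(type_cast_shape_is_accepted([2, 3]))  # False: b = 3
print(type_cast_shape_is_accepted([2, 4]))  # True:  b = 4
```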