To set up the example, I have two allocations, memref<3xi8> and memref<2x3xf32>. To reduce allocation cost, I allocated the two buffers together as a single memref<27xi8>, hoping to slice it up later.
How exactly do I get the second slice? My observation is that there is no way to express “skip the first 3 bytes and take the remaining 24 bytes as a slice” with view/subview.
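(The packing arithmetic: 3 × 1 byte for the i8 buffer plus 2 × 3 × 4 bytes for the f32 buffer gives 3 + 24 = 27 bytes.)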
// RUN: mlir-opt %s -convert-std-to-llvm
func @views() {
  %0 = alloc() : memref<27 x i8>
  %1 = view %0[][] : memref<27 x i8> to memref<3xi8>
  %2 = view %0[][] : memref<27 x i8> to memref<2x3xf32, offset: 3, strides: [3, 1]>
  std.return
}
However, you are right that there is a bug here: the offset 3 is going to be applied in multiples of f32, but what we want is the view pointer shifted by 3 bytes.
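Concretely, with an f32 element type, offset: 3 shifts the view pointer by 3 × sizeof(f32) = 12 bytes, landing at byte 12 instead of the intended byte 3.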
This calls for some fixes and should probably look like this instead:
func @views() {
  // index constants used as the byte offset and the view sizes
  %c2 = constant 2 : index
  %c3 = constant 3 : index
  %0 = alloc() : memref<27 x i8>
  %1 = view %0[][] : memref<27 x i8> to memref<3xi8>
  %2 = view %0[%c3][%c2, %c3] : memref<27 x i8> to memref<2x3xf32>
  std.return
}
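Here the offset operand %c3 would be interpreted in units of the source memref’s element type (i8, i.e. bytes), so the f32 view starts at byte 3 and spans the remaining 24 bytes, which is exactly 2 × 3 f32 elements.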
The lowering of a view op should apply the shift to the underlying buffer, and can only produce a contiguous memref without an offset in the type (i.e. the offset within the type should not refer to something outside the type).
A consequence would be that the canonicalization of offsets would not be possible (cc @andydavis).
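To illustrate (a sketch, assuming the canonicalization in question folds constant offset operands into the result type, as in the first snippet above): rewriting

%2 = view %0[%c3][%c2, %c3] : memref<27 x i8> to memref<2x3xf32>

into

%2 = view %0[][] : memref<27 x i8> to memref<2x3xf32, offset: 3, strides: [3, 1]>

would no longer be valid, both because the result type has to stay offset-free and because the operand offset (bytes) and the type offset (f32 elements) disagree in units.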
It intends to, but it fails the ViewOp verifier. This is the code I’m currently investigating. Notice that ViewOp doesn’t accept inputs with non-identity affine maps at [1].
There is a possibility of incorporating strides as well; I imagine the same conclusion holds for them, which calls for offsets, sizes, and strides all to be completely decoupled from affine maps. I’ll prototype a fix along these lines since it’s blocking my work.