[RFC] Semantics of `vector.gather` Indices with Strided MemRefs

I am coming around to the logical indices interpretation a bit more (it makes handling subviews a bit cleaner), but I do have one other point to consider: is `vector.load` physically indexed?

That is, what do we expect a vector.load from a non-unit-strided memref to do? Is

`vector.load %base[%offset] : memref<4xf32, strided<[2]>>, vector<2xf32>`

going to return the elements at physical indices <%offset, %offset + 1> or <%offset, %offset + 2>?
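To make the two readings concrete, here is a rough Python model of the index arithmetic only. It is not MLIR semantics and says nothing about which answer is right; `buffer`, `start`, `contiguous_read`, and `strided_read` are names I made up for illustration, and `start` is assumed to already be the physical position of the first element (how `%offset` maps to it is deliberately left out).

```python
def contiguous_read(buffer, start, length):
    """Reading 1: take consecutive physical elements, ignoring the memref stride."""
    return [buffer[start + i] for i in range(length)]


def strided_read(buffer, start, length, stride):
    """Reading 2: step through the underlying buffer by the memref's stride."""
    return [buffer[start + i * stride] for i in range(length)]


# Hypothetical backing storage for a memref<4xf32, strided<[2]>>:
# its four elements live at physical positions 0, 2, 4, 6.
buffer = [float(i) for i in range(8)]
start = 2  # assumed physical position of the first element read

physical = contiguous_read(buffer, start, 2)          # positions 2 and 3
logical = strided_read(buffer, start, 2, stride=2)    # positions 2 and 4

print(physical)
print(logical)
```

Under the first reading the second lane comes from position 3, which is not an element of the strided memref at all; under the second reading both lanes are elements of the memref.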

I had the sense that, given how low-level `vector.load` is, the answer was the former, and that if you want the logical behavior you want `vector.transfer_read`, which’ll handle this sort of thing when lowered to low-level operations.

(The lack of an analogous operation for gather is why I think we should land @Groverkss’s [RFC] Improving gather codegen for Vector Dialect upstream.)

That is to say, my main argument for the physical interpretation is that it matches how (as I understand it) the other low-level operations (load, store, maskedload, etc.) work, and that the logical interpretation should go through a higher-level, more abstract operation.