Dim in presence of maps

I am trying to get maps to work with mapped memory, and I have a question relating to std.dim.

    %A = alloc() : memref<5x10xi64>
    %c0 = constant 0 : index
    %c1 = constant 1 : index
    %res = constant 0 : i64
    %d0 = dim %A, %c0 : memref<5x10xi64>
    %d1 = dim %A, %c1 : memref<5x10xi64>
    affine.for %i = 0 to %d0 {
        affine.for %j = 0 to %d1 {
            affine.store %res, %A[%i, %j] : memref<5x10xi64>
        }
    }

The code will deliver the right values for the ranges of i and j, respectively 5 and 10. Now consider the same code with a map that transposes the data layout (for example, to column-major).

    #map = affine_map<(d0, d1) -> (d1, d0)>

    %A = alloc() : memref<5x10xi64, #map>
    %c0 = constant 0 : index
    %c1 = constant 1 : index
    %res = constant 0 : i64
    %d0 = dim %A, %c0 : memref<5x10xi64, #map>
    %d1 = dim %A, %c1 : memref<5x10xi64, #map>
    affine.for %i = 0 to %d0 {
        affine.for %j = 0 to %d1 {
            affine.store %res, %A[%i, %j] : memref<5x10xi64, #map>
        }
    }

This code does not lower to LLVM at this time, as --normalize-memrefs does not handle dim. This raises the question, though, of what dim means in the presence of maps.

When used in the context of determining loop iterations, which are in the logical domains, the map is irrelevant as maps change the data layout and not the iteration space.

When used in the context of determining data sizes, which deals with the physical data layout of the values, then the map is relevant and should be taken into account.

So intuitively, one would expect to have two dim operators, one for each context. Apparently, dim lowers to LLVM as an access to the memref data structure, which only stores the physical data layout. A possibly naive view is that dim should not be a property of the data, but of the type.
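To make the two meanings concrete, here is a minimal sketch in Python rather than MLIR. It models a layout map as a plain function on index tuples; the helper `physical_shape` is a hypothetical name for illustration and is not part of any MLIR API. It derives the physical extents as the bounding box of the mapped logical index space.

```python
from itertools import product

def physical_shape(logical_shape, layout_map):
    """Bounding-box extents of the logical index space after applying layout_map.

    Hypothetical helper: it enumerates every logical index, so it only
    suits small static shapes.
    """
    mapped = [layout_map(*idx)
              for idx in product(*(range(n) for n in logical_shape))]
    rank = len(mapped[0])
    return tuple(max(m[k] for m in mapped) + 1 for k in range(rank))

# The transposing map from the example: (d0, d1) -> (d1, d0).
transpose = lambda d0, d1: (d1, d0)

logical = (5, 10)                                # what the loops iterate over
physical = physical_shape(logical, transpose)    # what the buffer looks like
trip_counts = logical                            # loop bounds ignore the map
```

In this model the loop trip counts stay at 5 and 10 while the buffer extents swap, which is the distinction between the two contexts described above.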

Is it something for the std dialect? Or something for the shape dialect?

store -> affine.store. It should all then work. The unrestricted store is an issue. The dim should disappear on -canonicalize, and -normalize-memrefs shouldn’t have to worry about it here.

In spite of this, normalizeMemRefs/replaceAllMemRefUsesWith should actually be updated to replace uses in dim. I think this should be an easy (single line) fix, and will help with dynamically shaped memrefs.

store -> affine.store. It should all then work.

Fixed the original message, thanks.

In spite of this, normalizeMemRefs/replaceAllMemRefUsesWith should actually be updated to replace uses in dim.

If you apply the map to the memref in the dim, then you get the physical data dimensions of the array.

    #map = affine_map<(d) -> (d floordiv 4, d mod 4)>
    %A = alloc() : memref<12xf32, #map>
    %d = dim %A, 0 : memref<12xf32, #map>

When using %d in the iteration domains, we need 12. Applying the map to the dim operations, this would transform it to %d = dim %A, 0 : memref<3x4xf32> == 3. This would result in a loop iterating over 3 elements instead of the 12 logical elements.
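The mismatch can be checked with a small Python model (a sketch, not MLIR; `tile_map` is a hypothetical stand-in for the layout map above). It compares which physical elements a loop touches when its bound is the logical extent versus the first physical extent.

```python
def tile_map(d):
    """Python stand-in for (d) -> (d floordiv 4, d mod 4)."""
    return (d // 4, d % 4)

logical_extent = 12

# Physical extents: bounding box of the mapped logical indices.
physical = (
    max(tile_map(d)[0] for d in range(logical_extent)) + 1,
    max(tile_map(d)[1] for d in range(logical_extent)) + 1,
)

# Elements reached by a loop "for i = 0 to bound" that accesses A[i].
touched_with_logical_bound = {tile_map(i) for i in range(logical_extent)}
touched_with_physical_bound = {tile_map(i) for i in range(physical[0])}
```

With the logical bound every one of the 12 elements is visited; with the physical extent of dimension 0 as the bound, only the first 3 are.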

My point is that we probably need both the logical (unmapped) and the physical (mapped) dimensions. Supporting the mapped dimensions appears to be easy; for the logical, unmapped dimensions, I am not sure.

Hi Alex,

The replacement of %d = dim %A, 0 : memref<12xf32, #map> with %d = dim %A, 0 : memref<3x4xf32> would be incorrect, as you point out. (I take back the earlier statement that the replacement in dim ops is a one-line fix!) However, you don’t need dim to be changed. The dim has to be replaced here by looking at the inverse mapping / remapping. As long as your layout maps are one-to-one (which they are for behavior to be defined), this is possible. In this case, on the rewrite/replacement of dim, the operation will just fold to the constant 12 (even post replacement).

    0 <= d0 <= 11
    (t1, t2) = (d0 floordiv 4, d0 mod 4)
    Inverse map: d0 = 4 * t1 + t2

Replacement code (pseudocode, where %A' is the normalized memref):

    %t1 = dim %A', 0 : memref<3x4xf32>  // This will yield 3.
    %t2 = dim %A', 1 : memref<3x4xf32>  // This will yield 4.
    %d = 4*(%t1 - 1) + (%t2 - 1) + 1  // This will yield 12.

All of this will only be needed for dynamic shapes. For static ones, the dim will just fold to a constant. The replacement itself can call the folder on it and on folding would just replace %d by 12 for a static shape.
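The arithmetic above can be sketched in Python (a model, not MLIR; `layout`, `inverse`, and `logical_extent` are hypothetical names). The idea is to apply the inverse map to the last valid physical index and add one.

```python
TILE = 4

def layout(d):
    """Forward map: (d) -> (d floordiv 4, d mod 4)."""
    return (d // TILE, d % TILE)

def inverse(t1, t2):
    """Inverse map: d = 4 * t1 + t2."""
    return TILE * t1 + t2

def logical_extent(t1_extent, t2_extent):
    """Logical extent recovered from the physical extents:
    inverse of the last physical index, plus one."""
    return inverse(t1_extent - 1, t2_extent - 1) + 1

recovered = logical_extent(3, 4)   # physical shape 3x4

# Sanity check that the two maps really are inverses on the logical range.
round_trip_ok = all(inverse(*layout(d)) == d for d in range(recovered))
```

This only recovers the original extent when the logical size is a multiple of the tile size, as it is for 12 here.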

Hi Uday,

When we map with ceil/floor, we lose information. If I choose a size of 14 instead, the code below:

    #map = affine_map<(d) -> (d floordiv 4, d mod 4)>
    %A = alloc() : memref<14xf32, #map>
    %d = dim %A, 0 : memref<14xf32, #map>
    affine.for %i = 0 to %d { ... }

becomes this after memref lowering (using your suggestion):

    #map = affine_map<(d) -> (d floordiv 4, d mod 4)>
    %A = alloc() : memref<4x4xf32>
    %d0 = dim %A, 0 : memref<4x4xf32>
    %d1 = dim %A, 1 : memref<4x4xf32>
    %d = 4 * (%d0 - 1) + (%d1 - 1) + 1
    affine.for %i = 0 to %d { ... }

The bound is now 16, because of the ceil operation needed to determine the number of 4xf32 tiles.
The only way to do it, in my opinion, is to “strip” the map from the dim. Maybe this can be done by simply modifying the “verification” of the type linkage between the memref passed as reference to dim, and the type provided to the dim.
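The loss of information can be illustrated with a short Python model (a sketch, not MLIR; the function names are hypothetical). Several logical sizes collapse onto the same physical shape, so the inverse-map formula cannot tell them apart.

```python
TILE = 4

def physical_shape(n):
    """Bounding-box shape of n logical elements under
    (d) -> (d floordiv 4, d mod 4). The row count is a ceiling division."""
    return (-(-n // TILE), min(n, TILE))

def recovered_extent(shape):
    """Inverse-map formula applied to the physical extents."""
    t1, t2 = shape
    return TILE * (t1 - 1) + (t2 - 1) + 1

sizes = [13, 14, 15, 16]
shapes = {n: physical_shape(n) for n in sizes}
recovered = {n: recovered_extent(shapes[n]) for n in sizes}
```

In this model, sizes 13 through 16 all produce a 4x4 buffer, and the recovered bound is 16 in every case, including the size of 14 used above.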

I see - thanks. To make it clearer, let’s just look at a dynamic memref, memref<?xf32> (since the replacement isn’t needed for a static memref). The ? isn’t known, so there’s no way to recover the information just from the new memref + the map. Thinking about it, you actually don’t need any replacement at all for dim. For a dim on a dynamic memref, you would just pass the symbol from the alloc site, and the dim still gets folded away. If the memref isn’t locally defined through an alloc, then arguments or return values would have to be rewritten to pass the symbol bound to the dynamic dimension. So, in all cases, the dim will have to be, and can be, folded away, except when the memref is received from an external function. In the latter case, there is no choice anyway other than to enforce it in the ABI.