[MLIR][Vectorize] Division with remainder

I want to know what happens when dividing a vector dimension size by the vectorization factor leaves a remainder.

For example:

  affine.for %arg2 = 0 to 64 step 5 {
    affine.for %arg3 = 0 to 64 step 4 {
      %cst = arith.constant 0.000000e+00 : f32
      %0 = vector.transfer_read %arg0[%arg2, %arg3], %cst : memref<64x64xf32>, vector<5x4xf32>
      %cst_0 = arith.constant 0.000000e+00 : f32
      %1 = vector.transfer_read %arg1[%arg2, %arg3], %cst_0 : memref<64x64xf32>, vector<5x4xf32>
      %2 = arith.addf %0, %1 : vector<5x4xf32>
      vector.transfer_write %2, %alloc[%arg2, %arg3] : vector<5x4xf32>, memref<64x64xf32>
    }
  }

If there is a remainder after division by the VF, you should see masked vectorization take effect.
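To make the remainder case concrete, here is a minimal sketch in plain Python (not MLIR, and not the pass's actual code) of which lanes would be active in each vector iteration. The helper names `tail_lanes` and `lane_masks` are made up for illustration; the numbers 64 and 5 come from the loop in the question.

```python
def tail_lanes(extent, vf):
    """Return (number of full vector iterations, valid lanes in the partial one)."""
    return divmod(extent, vf)


def lane_masks(extent, vf):
    """Build one boolean lane mask per vector iteration of a loop over `extent`."""
    masks = []
    for start in range(0, extent, vf):
        remaining = extent - start      # scalar iterations left at this point
        active = min(remaining, vf)     # lanes that map to real iterations
        masks.append([lane < active for lane in range(vf)])
    return masks


if __name__ == "__main__":
    full, rem = tail_lanes(64, 5)
    print(f"full iterations: {full}, lanes valid in the tail: {rem}")
    print("tail mask:", lane_masks(64, 5)[-1])
```

With an extent of 64 and a VF of 5 there are 12 full iterations plus one partial iteration in which only 4 of the 5 lanes are valid. With a VF of 4 every mask is all-true, so no masking would be needed along that dimension.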

But I didn’t find any implementation of masked vectorization in Affine’s SuperVectorize.

Some masking support is there, as can be seen in the example provided in the SuperVectorize.cpp file, but maybe it doesn’t cover all cases?

  #map = affine_map<(d0) -> (-d0 + 500)>
  func @vecred(%arg0: memref<512xf32>) -> f32 {
    %cst = arith.constant 0.000000e+00 : f32
    %cst_0 = arith.constant dense<0.000000e+00> : vector<128xf32>
    %0 = affine.for %arg1 = 0 to 500 step 128 iter_args(%arg2 = %cst_0)
      -> (vector<128xf32>) {
      // %2 is the number of iterations left in the original loop.
      %2 = affine.apply #map(%arg1)
      %3 = vector.create_mask %2 : vector<128xi1>
      %cst_1 = arith.constant 0.000000e+00 : f32
      %4 = vector.transfer_read %arg0[%arg1], %cst_1 :
        memref<512xf32>, vector<128xf32>
      %5 = math.cos %4 : vector<128xf32>
      %6 = arith.addf %arg2, %5 : vector<128xf32>
      // We filter out the effect of last 12 elements using the mask.
      %7 = select %3, %6, %arg2 : vector<128xi1>, vector<128xf32>
      affine.yield %7 : vector<128xf32>
    }
    %1 = vector.reduction <add>, %0 : vector<128xf32> into f32
    return %1 : f32
  }
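For anyone who wants to check the arithmetic of that example, below is a rough Python model of the IR above. It is only a sketch: the function name `masked_sum_cos` and the `use_mask` switch are hypothetical, and Python lists stand in for the vectors. It assumes that out-of-bounds lanes of the transfer read take the padding value and that `vector.create_mask` treats a bound larger than the vector size as all-true.

```python
import math


def masked_sum_cos(data, n=500, vf=128, use_mask=True):
    """Model of the vecred loop: sum cos(data[i]) for i < n, vf lanes at a time."""
    acc = [0.0] * vf                                  # iter_args init (dense 0.0)
    for start in range(0, n, vf):
        remaining = n - start                         # affine.apply #map(%arg1)
        mask = [lane < remaining for lane in range(vf)]  # vector.create_mask
        # vector.transfer_read: lanes past the end of the buffer take the padding 0.0
        vec = [data[start + lane] if start + lane < len(data) else 0.0
               for lane in range(vf)]
        summed = [a + math.cos(x) for a, x in zip(acc, vec)]
        if use_mask:
            # select: keep the old accumulator value in masked-off lanes
            acc = [s if m else a for m, s, a in zip(mask, summed, acc)]
        else:
            acc = summed
    return sum(acc)                                   # final vector.reduction (add)


if __name__ == "__main__":
    data = [0.01 * i for i in range(512)]
    reference = sum(math.cos(x) for x in data[:500])
    print(math.isclose(masked_sum_cos(data), reference, rel_tol=1e-9))
```

In the last iteration `start` is 384, so 116 lanes are active and the remaining 12 lanes (elements 500 to 511 of the 512-element buffer) are dropped by the select. Passing `use_mask=False` shows how those 12 extra elements would otherwise leak into the result.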

Got it, thank you. Also, vectorizing reductions is supported only for 1-D vectors.