As in the discussion here [Adding integer ceil and floor in Std Dialect - #4 by AlexEichenberger], when we deal with dynamic shapes we cannot eliminate the first select, because its test is not known at compile time:
‘The first selects could be eliminated by having their test being a compile time constant “true” or “false”.’
For example:
#map = affine_map<()[s0] -> (s0 ceildiv 64)>
^bb0(%arg0: index, %arg1: index, %arg2: index): // no predecessors
%c1 = constant 1 : index
%0 = affine.apply #map()[%arg0]
%1 = affine.apply #map()[%arg1]
hal.return %0, %1, %c1 : index, index, index
}
will lower to:
^bb0(%arg0: index, %arg1: index, %arg2: index): // no predecessors
%c1 = constant 1 : index
%c0 = constant 0 : index
%c64 = constant 64 : index
%0 = cmpi sle, %arg0, %c0 : index
%1 = subi %c0, %arg0 : index
%2 = subi %arg0, %c1 : index
%3 = select %0, %1, %2 : index
%4 = divi_signed %3, %c64 : index
%5 = subi %c0, %4 : index
%6 = addi %4, %c1 : index
%7 = select %0, %5, %6 : index
%8 = cmpi sle, %arg1, %c0 : index
%9 = subi %c0, %arg1 : index
%10 = subi %arg1, %c1 : index
%11 = select %8, %9, %10 : index
%12 = divi_signed %11, %c64 : index
%13 = subi %c0, %12 : index
%14 = addi %12, %c1 : index
%15 = select %8, %13, %14 : index
hal.return %7, %15, %c1 : index, index, index
}
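To make the arithmetic of the expanded form easier to follow, here is a rough Python mirror of the ops above. The helper names `divi_signed` and `ceildiv_lowered` are only illustrative, and `divi_signed` is assumed to round toward zero, which Python's `//` does not do, hence the explicit helper.

```python
def divi_signed(a: int, b: int) -> int:
    """Truncating signed division (rounds toward zero); b must be nonzero."""
    q = abs(a) // abs(b)
    return q if (a >= 0) == (b >= 0) else -q


def ceildiv_lowered(x: int, d: int = 64) -> int:
    """Mirror of the expanded ops: both selects share the test `x <= 0`."""
    nonpositive = x <= 0                  # cmpi sle, x, 0
    t = -x if nonpositive else x - 1      # first select
    q = divi_signed(t, d)                 # divi_signed t, d
    return -q if nonpositive else q + 1   # second select


if __name__ == "__main__":
    for x in (-65, -64, -1, 0, 1, 64, 65):
        print(x, ceildiv_lowered(x))
```

Both selects depend on the same runtime comparison, which is why neither folds away unless the sign of the operand is known.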
expected (when %arg0 and %arg1 are known to be positive):
^bb0(%arg0: index, %arg1: index, %arg2: index): // no predecessors
%c1 = constant 1 : index
%c0 = constant 0 : index
%c64 = constant 64 : index
%3 = subi %arg0, %c1 : index
%4 = divi_signed %3, %c64 : index
%7 = addi %4, %c1 : index
%11 = subi %arg1, %c1 : index
%12 = divi_signed %11, %c64 : index
%15 = addi %12, %c1 : index
hal.return %7, %15, %c1 : index, index, index
}
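As a sanity check on the shorter form, the sketch below mirrors it in Python and compares it against a reference ceiling division. The names are illustrative, and truncating signed division is assumed again. Under these assumptions the short form agrees with ceildiv for every operand that is at least 1, but not for 0, so it relies on the operand being strictly positive.

```python
def divi_signed(a: int, b: int) -> int:
    """Truncating signed division (rounds toward zero); b must be nonzero."""
    q = abs(a) // abs(b)
    return q if (a >= 0) == (b >= 0) else -q


def ceildiv_simplified(x: int, d: int = 64) -> int:
    """Mirror of the select-free form: (x - 1) / d + 1."""
    return divi_signed(x - 1, d) + 1


def ceildiv_reference(x: int, d: int = 64) -> int:
    """Reference ceiling division for positive d, using floor division."""
    return -((-x) // d)


if __name__ == "__main__":
    # Compare the two forms on a range of positive operands.
    mismatches = [x for x in range(1, 1000)
                  if ceildiv_simplified(x) != ceildiv_reference(x)]
    print("mismatches for x >= 1:", mismatches)
    print("x = 0:", ceildiv_simplified(0), "vs", ceildiv_reference(0))
```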
Adding sindex/uindex types may also be an option (to match the signed/unsigned integers of various bit widths): an affine.apply on a uindex could then insert the unsigned ops, and we could add more unsigned op folders as needed. We could also add an unsigned_ceildiv operation. I prefer the latter, since it may require fewer changes. I am very interested in working on this.
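For concreteness, one possible semantics for such an unsigned_ceildiv could look like the sketch below. This is only an assumption for illustration, not an existing op definition: the function name and the choice to special-case zero are mine. With unsigned operands the sign test disappears, and only a check against zero remains.

```python
def unsigned_ceildiv(x: int, d: int) -> int:
    """Hypothetical unsigned ceiling division: x >= 0 and d > 0 are assumed."""
    if x < 0 or d <= 0:
        raise ValueError("operands must satisfy x >= 0 and d > 0")
    # No sign test is needed; zero is handled separately so that x - 1
    # never wraps around in a fixed-width unsigned type.
    return 0 if x == 0 else (x - 1) // d + 1


if __name__ == "__main__":
    for x in (0, 1, 63, 64, 65, 128):
        print(x, unsigned_ceildiv(x, 64))
```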