Adding unsigned integer ceil and floor in Std Dialect

Like the discussion (here)[Adding integer ceil and floor in Std Dialect - #4 by AlexEichenberger], when we deal with dynamic shapes, we cannot eliminate the first select.

‘The first selects could be eliminated by having their test being a compile time constant “true” or “false”.’

#map = affine_map<()[s0] -> (s0 ceildiv 64)>
^bb0(%arg0: index, %arg1: index, %arg2: index):  // no predecessors
  %c1 = constant 1 : index
  %0 = affine.apply #map()[%arg0]
  %1 = affine.apply #map()[%arg1]
  hal.return %0, %1, %c1 : index, index, index
}

will lower to:

^bb0(%arg0: index, %arg1: index, %arg2: index):  // no predecessors
    %c1 = constant 1 : index
    %c0 = constant 0 : index
    %c64 = constant 64 : index
    %0 = cmpi sle, %arg0, %c0 : index
    %1 = subi %c0, %arg0 : index
    %2 = subi %arg0, %c1 : index
    %3 = select %0, %1, %2 : index
    %4 = divi_signed %3, %c64 : index
    %5 = subi %c0, %4 : index
    %6 = addi %4, %c1 : index
    %7 = select %0, %5, %6 : index
    %8 = cmpi sle, %arg1, %c0 : index
    %9 = subi %c0, %arg1 : index
    %10 = subi %arg1, %c1 : index
    %11 = select %8, %9, %10 : index
    %12 = divi_signed %11, %c64 : index
    %13 = subi %c0, %12 : index
    %14 = addi %12, %c1 : index
    %15 = select %8, %13, %14 : index
    hal.return %7, %15, %c1 : index, index, index
  }

expected:

^bb0(%arg0: index, %arg1: index, %arg2: index):  // no predecessors
  %c1 = constant 1 : index
  %c0 = constant 0 : index
  %c64 = constant 64 : index
  %3 = subi %arg0, %c1 : index
  %4 = divi_signed %3, %c64 : index
  %7 = addi %4, %c1 : index
  %11 = subi %arg1, %c1 : index
  %12 = divi_signed %11, %c64 : index
  %15 = addi %12, %c1 : index
  hal.return %7, %15, %c1 : index, index, index
}

Adding a sindex/uindex may also be an option (to match signed/unsigned ints of various bit widths) as then an affine.apply on a uindex could insert the unsigned ops and we could just add more unsigned op folders as needed.We can also add unsigned_ceildiv operation. I prefer the latter, which may have fewer changes. I am very interested in doing this.

Same observation here as well. I have quite a few cases that can’t avoid using signed div/mod which makes further optimization unavailable.
I suppose you meant to add unsigned operation in the affine expression, not in std dialect.
More specifically, ‘uceildiv, ufloordiv and umod’ are needed.

and we may not want to have uindex/sindex

1 Like

Thank you, I very much agree with your point of view. I don’t think we need to add new types. I think we need to add some operations like SignedFloorDivIOp. We can know what types of data are processed and avoid unnecessary instructions.

I am not sure what is the concrete proposal here. Standard dialect no longer contains arithmetic operations. I won’t be opposed to adding unsigned ceildiv and floordiv to the arithmetic dialect, at least for symmetry with the signed versions it already contains. The remainder operation is already available in the arithmetic dialect in both signed and unsigned form.

1 Like

imo, we need ‘uceildiv, ufloordiv and umod’ in affine expression.

@lipracer do you think this can address your issue?

As much as adding operations to the arithmetic dialect is fine with me, extending affine expressions sounds quite problematic. How do you plan to support these new expressions in FlatAffineConstraints? What is the semantics of the PresburgerSet defined with such expressions?

cc: @bondhugula

I can try to provide a PR to add the CeilDivSIOp and FloorDivUIOp operation in the arithmetic dialect.

Currently we lack of ability to express unsigned operation in the affine expression itself.
For example, ceildiv in affine expression is only treated as signed op and simply lowered using sdiv currently. How can we make it unsigned using unsigned ceildiv in arithmetic dialect?

I didn’t expect a big issue since this isn’t adding totally new operations but only adding unsigned version of the existing operations but I’m not sure.

+1 on this, especially since you have a good motivating example.

Yes. We already have arith.remui and arith.remsi.

Thanks for reminding.

1 Like

Why do you need these in AffineExpr specifically?

AffineExpr has evolved from and still supports a significant amount of polyhedral optimization analyses of passes. None of them are designed for unsigned operations and I cannot predict in which ways they will break. I wouldn’t mind adding these operations as long as there is a reasonable argument on why it doesn’t break anything.

The original example in this thread shows the reason why unsigned operation is needed in the affine expression, it makes the index signed even when user doesn’t want it to be. Further potential compiler optimizations are disabled by that.

Main point is to avoid use of signed operation if possible. Maybe I’m missing something, please let me know if there is a way to keep index unsigned when using affine map which includes ceilingdiv/floordiv/mod.
I thought adding operations elsewhere doesn’t help solving the original problem, that needs to be clarified first. I might be wrong

Agree, it’s not easy change and hard to expect what could be broken.
I can’t say introducing unsigned operation will not break anything but I would say it’s not very likely to cause significant issues. For example, it shouldn’t break anything if we only use a number within the half range of the unsigned integer which belongs to signed integer.

Adding a new op that meets your requirements, especially when these new ops are “missing puzzle pieces” (i.e. we have ceildivisi but not ceildivui) is definitely the better way to go.

Now I’m really confused.
Let’s see the affine map in the original example above.

#map = affine_map<()[s0] → (s0 ceildiv 64)>

How can we lower this ceildiv to unsigned ceildiv when s0 is signless integer?
Actually we don’t really need a new operation in the arithmetic dialect, currently it’s lowered using arith::DivSIOp instead of ceildivisi.

Same to mod, remui can’t be used for lowering Affine map because AffineExpr has only one version of ‘mod’ operation.

It does not demonstrate why do you insist on using affine.apply just to express ceildiv. This example can be expressed perfectly fine using a arith.ceildivu operation should it become available.

Are you sure all of MLIR users only use half of the range and will be okay with such implicit limitation? The difference between “not very likely” and “here is why it will not” is exactly what makes me object to this change. Maybe you can point me to projects or papers on polyhedral compilation that use unsigned arithmetics, because that’s what this change is not very likely but still quite possibly will break.

This is spot on. Even if you start with affine expressions, what you want is lowering those to unsigned arithmetic operations, not different kinds of operations in the expression itself. You can provide new patterns for that and/or extend AffineApplyExpander (llvm-project/mlir/lib/Conversion/AffineToStandard/AffineToStandard.cpp at f639882be8883d9dee6ed291852b721e10103a0d · llvm/llvm-project · GitHub) to assume LHS of division-like operations is unsigned (the RHS must be strictly positive in any case) and use them to lower your expressions instead of using the current lowering.

Yes, this is going to be problematic. I’d like to better understand the existing issues.

In this case, do you have information that s0 is non-negative? That information can perhaps be exploited to specialize the downstream lowering (using the necessary arith dialect operations) like @ftynse points to AFAIU.

Please review implementation as you see fit here.
https://reviews.llvm.org/D113363

1 Like

No, I’ve read the original problem differently.
The main problem @lipracer wanted to address in the above example is, to eliminate the ‘select’ instruction in the lowered code (or potentially in the further optimization step). And the reason for having ‘select’ and ‘divi_signed’ was it’s lowered from ceildiv in affine expression in the given example, not from arith.ceildivisi.

For that, what I kept trying is to point that adding arith.ceildiviu doesn’t look helpful especially to the original problem where the code is applying affine map including affine_expression_ceildiv.
Still I’m not sure what’s the main proposal here, title itself simply says adding unsigned ceil/floor div arithmetic operation but the description seems to require something different.
I presumed the latter because, technically ceildiv operation in arithmetic dialect is not necessary and not even used in the lowering code there, they can be expressed using other operations.

I really wanted to clarify that first, surely no reason to discuss affine expression here if the original problem only requires to add new unsigned operations in arithmetic dialect.

This is the point it’s getting tricky, s0 could to be an index type which is signednessless integer and we need to invent something else to hold the signedness information. My first thought was the operation itself as usual i.e., having signed/unsigned operations but that might not be plausible as @ftynse commented.
And the suggested workaround might enough to me, let me try it and report back if I encounter any unexpected situation.

I can’t insist more on modifying the affine expression in any case, because I’m not an expert to the polyhedral compilation and I don’t have an answer to the ‘proof of it doesn’t break anything’. Not being sarcastic but being convinced that currently adding support for the unsigned operation is not trivial.

It is true that ceildivui does not help the ‘affine.apply’ lower mentioned above. I just consider it from the arithmetic dialect itself. Ceildivui is the same type of operation as div, rem, and floor. In terms of symmetry, I think I need this Operation.