Hello,
We recently found a couple of places where IndexType’s bit width plays a role in transformations.
Specifically,
- The
IndexCastOfIndexCastpattern
// index_cast(index_cast(x)) -> x, if dstType == srcType.
def IndexCastOfIndexCast :
Pat<(Arith_IndexCastOp:$res (Arith_IndexCastOp $x)),
(replaceWithValue $x),
[(Constraint<CPred<"$0.getType() == $1.getType()">> $res, $x)]>;
- The
IndexCastUIOfIndexCastUIpattern
// index_castui(index_castui(x)) -> x, if dstType == srcType.
def IndexCastUIOfIndexCastUI :
Pat<(Arith_IndexCastUIOp:$res (Arith_IndexCastUIOp $x, $nneg1), $nneg2),
(replaceWithValue $x),
[(Constraint<CPred<"$0.getType() == $1.getType()">> $res, $x)]>;
OpFoldResult arith::IndexCastOp::fold(FoldAdaptor adaptor) {
// index_cast(constant) -> constant
unsigned resultBitwidth = 64; // Default for index integer attributes.
OpFoldResult arith::IndexCastUIOp::fold(FoldAdaptor adaptor) {
// index_castui(constant) -> constant
unsigned resultBitwidth = 64; // Default for index integer attributes.
IndexType’s bit width is not well defined
We understand that IndexType has had different meanings for different people:
index is “an integer the size of a pointer in the default address space”
Unsoundness of rewrites based on IndexType’s semantics above
Going back to the transformations mentioned above, under all interpretations listed here the canonicalization and folding patterns are unsound.
-
The folding patterns assume that the size of the index is 64
-
The canonicalization patterns are unsound when the one casts to type narrower than index and then back to index. Since the first cast is narrower, information can be lost during the cast. E.g., for index of infinite precision or i16
%0 = arith.constant 0x8000 : index
%1 = arith.index_cast %0 : index to i8
%2 = arith.index_cast %1 : i8 to index
// %2 = 0
Proposal for exact keyword in arith.index_cast and arith.index_castui.
We propose to reformulate the above unsound transformations into sound transformations by introducing the exact keyword. The exact keyword’s semantics are the following:
If the exact attribute is present, it is assumed that the index type width is such that the conversion does not lose information. When this assumption is violated, the result is poison.
The rewrites above can then be reformulated as:
// index_cast(index_cast(x, exact)) -> x, if dstType == srcType.
// The inner exact guarantees the iN -> index conversion is lossless,
// so the roundtrip through index preserves the value.
def IndexCastOfIndexCast :
Pat<(Arith_IndexCastOp:$res (Arith_IndexCastOp $x, $exact1), $exact2),
(replaceWithValue $x),
[(Constraint<CPred<"$0.getType() == $1.getType()">> $res, $x),
(Constraint<CPred<"(bool)$0">> $exact1)]>;
// index_castui(index_castui(x, exact)) -> x, if dstType == srcType.
// The inner exact guarantees the iN -> index conversion is lossless,
// so the roundtrip through index preserves the value.
def IndexCastUIOfIndexCastUI :
Pat<(Arith_IndexCastUIOp:$res
(Arith_IndexCastUIOp $x, $nneg1, $exact1), $nneg2, $exact2),
(replaceWithValue $x),
[(Constraint<CPred<"$0.getType() == $1.getType()">> $res, $x),
(Constraint<CPred<"static_cast<bool>($0)">> $exact1)]>;
OpFoldResult arith::IndexCastOp::fold(FoldAdaptor adaptor) {
// index_cast(constant, exact) -> constant
and
OpFoldResult arith::IndexCastUIOp::fold(FoldAdaptor adaptor) {
// index_castui(constant, exact) -> constant
For which some work was done in llvm/llvm-project#183395 and some discussion happened on llvm/llvm-project#184631
How to set exact?
Now the question is, when will these transformations execute if nobody is setting the exact keyword? We propose that there’s a pass available (e.g., arith-declare-index-bitwidth) for downstream users which will set exact. It will:
-
It’ll optionally update DLTI on the operation it’s called on (if it isn’t already set)
-
and then walk all the index_cast ops
-
for index_castui and index_cast if the target type is larger than or equal to the source type, then we set the
exactflag. -
for index_castui and index_cast if the target type is smaller than the source type, one needs to statically determine that the operand value fits in the target type, which could be done with a range analysis.
The DLTI dialect provides the DataLayoutSpecAttr and DataLayoutEntryAttr which allows the size of index to be explicitly set in the IR. For example:
module attributes { dlti.dl_spec = #dlti.dl_spec<#dlti.dl_entry<index, 64 : i32>> } {
// .. a module where index is 64 bits wide.
}
This let’s users control:
-
when the folders and canonicalizations will run
-
make sure that the folders and canonicalizations are safe
-
without the pass being run, the folders and canonicalizations will not be executed as the operations would lack the
exactflag.
We also discussed the possibility of setting a lower bound and an upper bound for index’s bitwidth. This would allow compilers to apply these transformations even if the target is not precisely known. This matches the two conflicting semantics nicely:
-
If downstream projects consider index as an integer of infinite precision, one can set the index’s bitwidth arbitrarily high. (lower bound, and upper bound are the same)
-
for index_castui and index_cast if the target type is larger than or equal to the lower bound and the source type is equal or smaller to the lower bound, then we set the
exactflag. -
for index_castui and index_cast if the target type is smaller than the upper bound and the source type is larger than or equal to the upper bound (i.e., the value is being narrowed), one needs to statically determine that the operand value fits in the lower bound (i.e., it fits in the smallest bit width type) one can set the
exactflag.
Please let me know your thoughts and comments. Thanks!