I’m trying to modify a project to reuse the existing memref dialect, but I’m running into some design issues during type conversion. I will try to explain them through some reduced examples that replicate the problems I’m facing. I’m sorry for the long post, but I want to be as clear as possible, also about the doubts I have regarding my own implementation.
Consider the following IR, based on a custom dialect called mydialect, in which an array-like type has been defined, together with some operations that work on it.
%0 = mydialect.alloc : !mydialect.array<3xi64>
%1 = mydialect.process %0 : !mydialect.array<3xi64> -> !mydialect.array<6xi64>
// rest of code, with other uses of %1
Now, suppose that I have a conversion pass that converts just a subset of the operations (i.e. only mydialect.process). This conversion pass uses a type converter stating that the mydialect.array type should become a memref. While converting the mydialect.process operation, a mydialect.alloc op has to be created in order to store the results. So, after the conversion, we are in the following situation:
%0 = mydialect.alloc : !mydialect.array<3xi64>
%1 = builtin.unrealized_conversion_cast %0 : !mydialect.array<3xi64> to memref<3xi64>
%2 = mydialect.alloc : !mydialect.array<6xi64>
%3 = builtin.unrealized_conversion_cast %2 : !mydialect.array<6xi64> to memref<6xi64>
// Code to populate %3
Now I want to convert my array-allocating operations, and to achieve this I map the mydialect.alloc operation to its memref counterpart. A new unrealized cast is introduced automatically, because the result of the original op is still used by the cast that was already there (%1 in the previous IR).
%0 = memref.alloc() : memref<3xi64>
%1 = builtin.unrealized_conversion_cast %0 : memref<3xi64> to !mydialect.array<3xi64>
%2 = builtin.unrealized_conversion_cast %1 : !mydialect.array<3xi64> to memref<3xi64>
%3 = memref.alloc() : memref<6xi64>
%4 = builtin.unrealized_conversion_cast %3 : memref<6xi64> to !mydialect.array<6xi64>
%5 = builtin.unrealized_conversion_cast %4 : !mydialect.array<6xi64> to memref<6xi64>
// Code to populate %5
Finally, I convert the memrefs into LLVM structs:
%0 = llvm.mlir.undef : !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)>
// Code to populate %0. Skipping as not useful.
%1 = builtin.unrealized_conversion_cast %0 : !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)> to memref<3xi64>
%2 = builtin.unrealized_conversion_cast %1 : memref<3xi64> to !mydialect.array<3xi64>
%3 = builtin.unrealized_conversion_cast %2 : !mydialect.array<3xi64> to memref<3xi64>
%4 = llvm.mlir.undef : !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)>
%5 = builtin.unrealized_conversion_cast %4 : !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)> to memref<6xi64>
%6 = builtin.unrealized_conversion_cast %5 : memref<6xi64> to !mydialect.array<6xi64>
%7 = builtin.unrealized_conversion_cast %6 : !mydialect.array<6xi64> to memref<6xi64>
// Code to populate %7
// Somewhere in this code there will be unrealized casts converting %3 and %7 into LLVM structs, because the memref.load and memref.store operations have been converted but they were using as memref values the ones coming out from the unrealized casts.
So, in the end, I get a chain of casts that is semantically valid (struct → memref → mydialect.array → memref → struct) but can’t be folded, because the cast reconciliation pass only considers casts in pairs.
Some possible objections:
- Q: The mydialect.process operation should not generate the second mydialect.alloc while being converted.
A: Why not? The semantics of the operation is to create a new array containing the computed values, and the way to create such an array is provided by the mydialect.alloc operation.
- Q: Why are you not converting the mydialect.alloc operation together with the mydialect.process operation? This way the unrealized casts would not be introduced and you would be fine!
A: Because keeping its conversion separate allows me to implement different lowering strategies without copy-pasting the alloc-conversion logic into each pass. From how I see the partial conversion infrastructure, this should be not only allowed but also encouraged.
- Q: You should run the cast reconciliation pass after the conversion to the memref dialect.
A: The answer to this requires a modification to the previous example, so bear with me for two more minutes, and please tell me if the explanation is not clear.
Suppose that the mydialect.process operation must be lowered straight to the LLVM dialect. This implies that the conversion pass uses a type converter that can obtain the LLVM representation of the mydialect.array type, which in our case consists of chaining the type conversions we have seen so far. In other words, the conversion of mydialect.array yields an LLVM struct.
After the first conversion we get the following IR:
%0 = mydialect.alloc : !mydialect.array<3xi64>
%1 = builtin.unrealized_conversion_cast %0 : !mydialect.array<3xi64> to !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)>
%2 = mydialect.alloc : !mydialect.array<6xi64>
%3 = builtin.unrealized_conversion_cast %2 : !mydialect.array<6xi64> to !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)>
// Code to populate %3 (i.e. an llvm.call to an external function)
After the allocs conversion, we obtain:
%0 = memref.alloc() : memref<3xi64>
%1 = builtin.unrealized_conversion_cast %0 : memref<3xi64> to !mydialect.array<3xi64>
%2 = builtin.unrealized_conversion_cast %1 : !mydialect.array<3xi64> to !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)>
%3 = memref.alloc() : memref<6xi64>
%4 = builtin.unrealized_conversion_cast %3 : memref<6xi64> to !mydialect.array<6xi64>
%5 = builtin.unrealized_conversion_cast %4 : !mydialect.array<6xi64> to !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)>
// Code to populate %5 (i.e. an llvm.call to an external function)
And finally, after the memref conversion:
%0 = llvm.mlir.undef : !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)>
// Code to populate %0. Skipping as not useful.
%1 = builtin.unrealized_conversion_cast %0 : !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)> to memref<3xi64>
%2 = builtin.unrealized_conversion_cast %1 : memref<3xi64> to !mydialect.array<3xi64>
%3 = builtin.unrealized_conversion_cast %2 : !mydialect.array<3xi64> to !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)>
%4 = llvm.mlir.undef : !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)>
%5 = builtin.unrealized_conversion_cast %4 : !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)> to memref<6xi64>
%6 = builtin.unrealized_conversion_cast %5 : memref<6xi64> to !mydialect.array<6xi64>
%7 = builtin.unrealized_conversion_cast %6 : !mydialect.array<6xi64> to !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)>
// Code to populate %7 (i.e. an llvm.call to an external function)
You can see that there is no point inside the pipeline at which running the cast reconciliation pass would help: no two consecutive casts cancel each other out. Still, the chain of casts as a whole is valid (struct → memref → mydialect.array → struct).
- Q: Related to the previous point: you should not go straight to the LLVM dialect.
A: Again, why not? Even though the conversion to LLVM is often seen as the final step, I don’t see why some operations could not need it during their own conversion, before the conversion pipeline is finished (e.g. the conversion of an operation requires passing an opaque pointer to an external function, and I see no reason to populate my dialect with an opaque-pointer-like type).
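To make the second scenario more concrete, here is a small toy model in Python. It is only a sketch of the idea, not MLIR code and not how the reconciliation pass is actually implemented: a cast chain is represented as the list of types a value goes through, and the two functions (`fold_adjacent_pairs` and `fold_whole_chain`, both names made up for this illustration) compare a strategy that only cancels A → B → A round trips with one that looks at the endpoints of the whole chain. The type names are shortened placeholders.

```python
from typing import List

# Shortened, hypothetical stand-ins for the types in the IR above.
STRUCT = "llvm.struct"
MEMREF = "memref<3xi64>"
ARRAY = "mydialect.array<3xi64>"


def fold_adjacent_pairs(chain: List[str]) -> List[str]:
    """Cancel A -> B -> A round trips formed by two consecutive casts.

    `chain` lists the types a value passes through, so N casts are
    described by N + 1 types. Folding is repeated until nothing changes.
    """
    types = list(chain)
    changed = True
    while changed:
        changed = False
        for i in range(len(types) - 2):
            if types[i] == types[i + 2]:
                # The two casts types[i] -> types[i+1] -> types[i+2]
                # form a round trip: drop both of them.
                del types[i + 1 : i + 3]
                changed = True
                break
    return types


def fold_whole_chain(chain: List[str]) -> List[str]:
    """Drop every cast when the chain starts and ends with the same type.

    Assumption of this sketch: the intermediate values have no other
    users, so removing the whole chain is safe.
    """
    if len(chain) > 1 and chain[0] == chain[-1]:
        return [chain[0]]
    return list(chain)


# Chain from the second example: struct -> memref -> mydialect.array -> struct.
second_chain = [STRUCT, MEMREF, ARRAY, STRUCT]

print(fold_adjacent_pairs(second_chain))  # no two consecutive casts cancel
print(fold_whole_chain(second_chain))     # endpoints match, chain disappears
```

In this model the pairwise strategy leaves the three casts untouched, while the endpoint-based one reduces the chain to the original struct value. Whether something like the second strategy is acceptable in the real pass is exactly the open question.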
Thanks for reading, I would really like to see your thoughts about this problem in order to understand if maybe the reconciliation pass itself should be modified to handle this situation.