Operation conversion with unequal numbers of SSA-results, best approach

Greetings everyone.
I am trying to create a conversion between two dialects and in doing so I have to substitute an operation with another, which has more output SSA-values than the original one.
I have to remap the output SSA-values with ConversionPatternRewriter::replaceAllUsesWith, but this leads to a number of in-place root updates, and subsequently, to a number of UnrealizedConversionCast operations for the operands, which have not been yet converted, as we “jump” from rewriting one operation to rewriting another.

I have been stuck at this for a while to the point I cannot understand what I am doing, and I think there are other ways of doing what I intend to do.
Could you please give any advice on how to manage such conversions?

Would this be what you’re looking for? llvm-project/mlir/include/mlir/Transforms/OneToNTypeConversion.h at main · llvm/llvm-project · GitHub

Sorry, I should have clarified the specifics.
Let’s say we have dialects Source and Target.
Source has an operation SourceOp with a single result.
Target has an operation TargetOp with three results, and I need to replace the single result of SourceOp with one exact result from TargetOp (there exists a 1->1 type conversion for the SourceOp result), while leaving two other TargetOp results intact.

I am afraid a N->1 type conversion might not be applicable here.

IIUC:

  // From
  %s1 = source.op (...)

  // To
  %t1, %t2, %t3 = target.op (...) // same arguments?

and you want to replaceAllUsersWith(%s1, %t1) while the other two values (%t2, %t3) are (so far) unused? If the types of %s1 and %t1 are the same, then I don’t understand why it should create unrealized casts.

Could it be an overzealous legalization of your operations? I had a case in an op that had multiple returns and iter_args and arguments and yield in a confusing way, and it took a while to juggle them around.

Maybe a way to test it is to lower to:

  %t1, %t2, %t3 = target.op(...)
  target.consume(%t2)
  target.consume(%t3)

in one go, and then replace all users of %s1 with %t1 and see if the casts remain. Not that this would be your final pattern, just to dig deeper into the legalization issue.

Isn’t this for the case where there would be a mapping between (%t1, %t2, %t3) to %s1 and vice versa?

The problem is a known issue with ReplaceAllUsesWith in the context of dialect conversion. @matthias-springer is currently fixing it, he may be able to suggest a workaround in the meantime!

Yes, I misunderstood the request and thought this is what was asked :slight_smile:

That’s correct.

The problems start when operations-users of %t1 have to be legalized right after the source.op: because in-place root updates “bypass” the normal order of legalization (when patterns are applied to operations in the same order as the operations themselves are placed in the IR) and operation-user of %t1 is processed, some of its operands might not be converted (they would have been converted, if it was the normal order of legalization), which creates unrealized casts, even though required patterns and type conversions exist.

  %s2 = source.someotherop(...) // This one hasn't been converted yet.
  %t1, %t2, %t3 = target.op(...)
  target.consume(%t2)
  target.consume(%t3)
  %casted = builtin.unrealized_conversion_cast(%s2) // That's where casts come into light.
  %s3 = source.useroperation(%t1, %casted)

I tried taking out all replaceOp calls and substituting them with replaceAllUsesWith (to ensure that everything is converted properly) and then creating a different pattern to erase all unrealized casts, but you might agree that is a horrible workaround.

I see, replaceAllUsersWith would not look into other ops, because the other ops aren’t converted yet.

Once you convert %s2 = source.someotherop to %t4 = target.someotherop, replaceAllUsers will replace the cast, too.

  %t4 = target.someotherop(...)
  %casted = builtin.unrealized_conversion_cast(%t4)
  %s3 = source.useroperation(%t1, %casted)

Then you replace the last one:

  %t4 = target.someotherop(...)
  %casted = builtin.unrealized_conversion_cast(%t4) // Note: %casted may have a different type here, needs to match what's in the target's type
  %t5 = target.useroperation(%t1, %casted)

and if %t4 and %casted are the same, the cast should go away on canonicalization.

Yes, that’s what I was trying to go for and what I described in the message above, but I must have done something wrong again and I somehow end up in an endless loop while cleaning up replacements set in finalize-method in DialectConversion.cpp.
For the time being, I will try to fix this.
Thank you all for your help.

Can you try patching in this change: [mlir][Transforms] Support `replaceAllUsesWith` in dialect conversion by matthias-springer · Pull Request #84725 · llvm/llvm-project · GitHub (You also need all dependent PRs.)

This adds support for replaceAllUsesWith to the dialect conversion. When a use is updated, the owning op is no longer treated as “modified in-place”, so it may solve the issue that you are seeing.

I would be curious how you managed to trigger an infinite loop in finalize. I’m not very familiar with this function, but it sounds like a bug in the implementation. This has never happened to me so far.

I am looking into it as soon as I can. Thank you.

I was mistaken and the problem occurs in applyRewrites:
somehow unrealized casts are included in replacements set (they shouldn’t be, as we access these casts via replaceAllUsesWith, not replaceOp), and are erased twice, because before that unrealized materializations were erased.

for (auto &mat : unresolvedMaterializations) {
    mat.getOp()->dropAllUses();
    mat.getOp()->erase();
  }

  for (auto &repl : llvm::reverse(replacements)) {
    repl.first->dropAllUses();
    repl.first->erase();
  }

I am afraid even without replacements before we erase materializations the casts need to be caninicalized, which isn’t the case now.

Hard to say what’s going on without being able to reproduce it locally. replaceAllUsesWith was not properly supported in the dialect conversion until the PR that I linked (but it exposed via RewriterBase), so it’s possible that it caused the dialect conversion to misbehave…

It also looks like you have an older MLIR version. The code that you mentioned no longer exists in that form. I made various changes to the dialect conversion over the last month; most of them NFC but some of them also fixing bugs, adding additional assertions, etc.

I do. I am going to patch everything in the meantime then. Thank you again for your help.