Dialect conversion: type materialization

I have a question about type conversion: in my conversion patterns, I convert the operations of MyDialect into one or more LLVM operations, and later in the same pattern I need to apply another MyDialect operation to the result of a generated LLVM operation.
Problem is that the MyDialect operation expects arguments with types within MyDialect itself, and not the MLIR standard ones. How can I achieve that? I tried implementing the source materializations inside my type converter, but they just give back the value with the MLIR type instead of the MyDialect one, so they are useless.

Also, is there any practical example of argument / source / target materialization? It seems the only one on the whole web is inside the std → llvm conversion, and it just creates an llvm.cast operation.

Thanks

It should be possible to insert unrealized_conversion_cast (llvm-project/BuiltinOps.td at 9d7be77bf91e97e30fd680a07d09eb9f5d78c5bd · llvm/llvm-project · GitHub) to convert between your dialect and the LLVM dialect. This will require adding an extra materialization that catches your types. AFAIR, materializations are called in the order inverse to that in which they are registered, so just adding it on top of the existing LLVMTypeConverter should be okay. I would be happy if the FIXME in it is addressed, though: llvm-project/StandardToLLVM.cpp at 37eca08e5bcfbe926176412a4a0acf5b963da7e6 · llvm/llvm-project · GitHub :slight_smile:

Is there any example around? I don’t think I’m getting what I should do. I checked also the toy example but it doesn’t show how to properly implement a type converter, and I think I’m doing it wrong…

I tried your solution and one suggested by Mehdi, but dead end for both:
If I use UnrealizedConversionCastOp inside my pattern, it turns out not to have any conversion pattern.
If I use LLVM::DialectCastOp instead, I get a generic “error: unsupported cast”.

Furthermore, I can’t understand why I sometimes get automatically generated llvm.cast operations. I saw that they are generated by the standard type converter (the link you sent), but at the end of the lowering I get strange behaviour, such as an i64 → index conversion immediately followed by an index → i64 one; both are illegal and make the conversion fail. For now I have worked around this problem with this trick, but it feels tremendously bad.

EDIT:
I will post here some snippets of my code, so that it will hopefully facilitate the communication. I’ll trim some parts for brevity.

MyDialect’s BooleanType:

class BooleanType : public mlir::Type::TypeBase<BooleanType, mlir::Type, mlir::TypeStorage> {
	public:
	using Base::Base;
	static BooleanType get(mlir::MLIRContext* context);
};

TypeConverter:

class TypeConverter : public mlir::LLVMTypeConverter
{
	public:
	TypeConverter(mlir::MLIRContext* context) : mlir::LLVMTypeConverter(context, options)
	{
		addConversion([&](BooleanType type) { return convertMyTypeToLLVM(type); });
		// ... and many others (I know there is a native i1 type, but this is not the point) ...

		// From flang compiler:
		addTargetMaterialization(
				[&](mlir::OpBuilder &builder, mlir::Type resultType, mlir::ValueRange inputs, mlir::Location loc) -> llvm::Optional<mlir::Value> {
					if (inputs.size() != 1)
						return llvm::None;

					return inputs[0];
				});

		addSourceMaterialization(
				[&](mlir::OpBuilder &builder, mlir::Type resultType, mlir::ValueRange inputs, mlir::Location loc) -> llvm::Optional<mlir::Value> {
					if (inputs.size() != 1)
						return llvm::None;

					return inputs[0];
				});
	}
};

MyDialect’s EqOp conversion pattern:

class EqOpLowering: public mlir::OpConversionPattern<EqOp>
{
	mlir::LogicalResult matchAndRewrite(EqOp op, llvm::ArrayRef<mlir::Value> operands, mlir::ConversionPatternRewriter& rewriter) const override
	{
		mlir::Location location = op.getLoc();
		EqOp::Adaptor adaptor(operands);

		mlir::Value result = rewriter.create<mlir::LLVM::FCmpOp>(location, mlir::LLVM::FCmpPredicate::oeq, adaptor.lhs(), adaptor.rhs());
		// result has type i1, but I need it to be mydialect.boolean, in order to pass it to AnotherOp.
		// If I keep i1, then the AnotherOp's conversion pattern will not see a BooleanType and thus won't know what to do. I know I can just accept also i1, but I don't believe this to be a good solution
		rewriter.replaceOpWithNewOp<AnotherOp>(op, result);

		return mlir::success();
	}
};

A small update:
I temporarily removed the ugly trick from the type converter and found out that if I mark LLVM::DialectCastOp as illegal, its conversion pattern is applied. Otherwise it is not. I find this extremely counterintuitive: if I’m converting to LLVM and thus mark the dialect as legal, I shouldn’t have to worry about marking some of its operations as illegal.
Now I’m facing another problem: at some point during the conversion process I get a recursive call to the materialization function and thus get recursive LLVM::DialectCastOps:

    //===-------------------------------------------===//
    Legalizing operation : 'std.br'(0x7fffebd10180) {
      "std.br"(%60)[^bb4] : (!mydialect.bool) -> ()

      * Fold {
      } -> FAILURE : unable to fold

      * Pattern : 'std.br -> ()' {
        ** Insert  : 'llvm.mlir.cast'(0x7fffebd10558)
        ** Insert  : 'llvm.br'(0x7fffebd0e600)
        ** Replace : 'std.br'(0x7fffebd10180)

        //===-------------------------------------------===//
        Legalizing operation : 'llvm.mlir.cast'(0x7fffebd10558) {
          %61 = "llvm.mlir.cast"(%60) : (!mydialect.bool) -> i1

          * Fold {
          } -> FAILURE : unable to fold

          * Pattern : 'llvm.mlir.cast -> ()' {
            ** Insert  : 'llvm.mlir.cast'(0x7fffebd0e6a8)
            ** Replace : 'llvm.mlir.cast'(0x7fffebd10558)

            //===-------------------------------------------===//
            Legalizing operation : 'llvm.mlir.cast'(0x7fffebd0e6a8) {
              %61 = "llvm.mlir.cast"(%60) : (!mydialect.bool) -> i1

              * Fold {
              } -> FAILURE : unable to fold

              * Pattern : 'llvm.mlir.cast -> ()' {
              } -> FAILURE : pattern was already applied
            } -> FAILURE : no matched legalization pattern
            //===-------------------------------------------===//
          } -> FAILURE : generated operation 'llvm.mlir.cast'(0x00007FFFEBD0E6A8) was illegal
        } -> FAILURE : pattern failed to match
      } -> FAILURE : no matched legalization pattern
      //===-------------------------------------------===//
    } -> FAILURE : generated operation 'llvm.mlir.cast'(0x00007FFFEBD10558) was illegal
  } -> FAILURE : pattern failed to match
} -> FAILURE : no matched legalization pattern

IREE might have some, but I don’t know where to look. @_sean_silva ?

If you declare it legal in the ConversionTarget, conversion patterns shouldn’t be necessary.

This op only supports casts between some built-in types and LLVM types. I suppose we may want to replace it with unrealized_conversion_cast at some point, but this has not been discussed.

The op should be declared legal, at which point the conversion is legal. The op may further need a folder to remove such useless back-and-forth casts. We can also try updating the materialization so that it avoids inserting the back-cast if the operand is defined by a forth-cast, and let DCE in canonicalization remove the forth-cast if possible.

Hmm, I find this reasonable. If you declare the entire dialect legal, all ops in that dialect are legal, including the DialectCastOp. When a pattern produces such an op, it is legal so there is no reason why this op should be converted further. We wouldn’t expect llvm.intr.fma to be converted further even if we had the pattern splitting it to fmul and fadd… The infrastructure does precisely what you ask it to do: produce legal operations from the LLVM dialect. If you want it not to produce some operations from the LLVM dialect, then you must specify which ones by declaring them illegal.

Part of the logical reasoning you may be missing is how partial conversions are staged. There may be many conversion passes that introduce (and sometimes clean up) dialect casts. This is totally fine as long as the final conversion pass makes sure that all casts are resolved, i.e. removed because their LHS and RHS have the same type. For these passes, it makes sense to have the cast op as legal, potentially dynamically legal depending on the types they try to lower out. Only the final pass needs to rewrite the remaining casts, in which case the cast op should be unconditionally illegal.

Do you have a functioning reproducer?

In any case, do not expect DialectCastOp to work for your dialect, it won’t.

We don’t (yet) have examples of unrealized_conversion_cast in IREE.

In this case, you need to manually insert the conversion (source materialization) to the MyDialect type before passing it to the MyDialect operation created inside your pattern – the infra doesn’t do this for you.

Btw, if you haven’t, I recommend you see my talk “Type Conversions the Not-So-Hard Way: MLIR’s new composable bufferize passes” on Talks - MLIR

I didn’t think about that, even though I already did it for the ModuleOp and ModuleTerminatorOp. May I ask why the builtin operations are not automatically considered legal?

Ok I got it. I misunderstood the usage of LLVM::DialectCastOp and this is the reason why I was saying to take it outside the llvm dialect. That is indeed already done with the UnrealizedConversionCastOp, which turned out to be what I needed.
Side note: while searching for the UnrealizedConversionCastOp, I found this page, but unfortunately I never saw it because it isn’t listed in the side menu (actually it is, but it is missing the title and thus almost invisible). It’s not crucial but I think it may be useful to fix that, because I had some initial troubles too in converting ModuleOp and ModuleTerminatorOp (I was wrongly thinking they belonged to the std dialect)

Makes sense, thanks for the explanation. I was missing that part, which has been nicely covered also in Sean’s talk.

Unfortunately not yet, as Sean was saying in the talk, it’s a bit tricky to reduce the problem to a small example. I will try to correct some aspects of my conversion patterns & type converter, and will see if the problem persists.

Thank you for pointing me to the talk; it was very clear and clarified some aspects of type conversion for me. I’m now fixing some aspects of my code and I will let you know if I encounter any problems.

The only difference between the built-in dialect and other dialects is that it is always loaded in any context. Other than that, we try to have as little special behavior for this dialect as possible: it is just a dialect. An op belonging to the built-in dialect does not and should not mean that all possible MLIR uses care about it and know how to handle it; in particular, such ops can be illegal in certain cases. For example, built-in func is declared illegal when converting std to llvm because we want it replaced with llvm.func.

Makes sense, thank you.

I also managed to fix some conversion problems by adding source and target materializations, and now I don’t have the mydialect.bool → i1 conversion error anymore.
Anyway I’m still facing a strange recursive index → i64 cast.
I tried to reduce the scenario to the smallest possible one but this is the best I can do, I’ll try to comment it as much as I can so you can understand it without losing too much time:

TestOp: it’s an scf::ForOp-like operation, but don’t focus on its meaning, it’s just an example to show the problem, my real operation is different.
ConditionOp: almost the same as scf::ConditionOp. It’s just a placeholder and it’s removed by the TestOp conversion pattern.
YieldOp: same story as ConditionOp.

class TestOp : public mlir::Op<TestOp, mlir::OpTrait::NRegions<2>::Impl, mlir::OpTrait::VariadicOperands, mlir::OpTrait::ZeroResult>
{
	public:
	using Op::Op;

	static llvm::StringRef getOperationName() {
		return "mydialect.test";
	}

	mlir::Region& condition() {
		return getRegion(0);
	}

	mlir::Region& body() {
		return getRegion(1);
	}

	static void build(mlir::OpBuilder& builder, mlir::OperationState& state, mlir::ValueRange args) {
		state.addOperands(args);
		auto insertionPoint = builder.saveInsertionPoint();

		builder.createBlock(state.addRegion(), {}, args.getTypes());
		builder.createBlock(state.addRegion(), {}, args.getTypes());

		builder.restoreInsertionPoint(insertionPoint);
	}
};

class ConditionOp : public mlir::Op<ConditionOp, mlir::OpTrait::ZeroRegion, mlir::OpTrait::VariadicOperands, mlir::OpTrait::ZeroResult, mlir::OpTrait::IsTerminator> {
	// trimmed a bit ...
	static void build(mlir::OpBuilder &odsBuilder, ::mlir::OperationState &odsState, mlir::Value condition, mlir::ValueRange args = {});

	mlir::Value condition();
	mlir::ValueRange args();
};

class YieldOp : public mlir::Op<YieldOp, mlir::OpTrait::ZeroRegion, mlir::OpTrait::VariadicOperands, mlir::OpTrait::ZeroResult, mlir::OpTrait::HasParent<IfOp, ForOp, WhileOp>::Impl, mlir::OpTrait::IsTerminator>
{
	// trimmed a bit ...
	static void build(mlir::OpBuilder& builder, mlir::OperationState& state, mlir::ValueRange args = {});

	mlir::ValueRange args();
};

TestOp conversion pattern:

class TestOpLowering : public mlir::OpConversionPattern<TestOp>
{
	public:
	TestOpLowering(mlir::MLIRContext* ctx, TypeConverter& typeConverter)
		: mlir::OpConversionPattern<TestOp>(typeConverter, ctx, 1)
	{
	}

	mlir::LogicalResult matchAndRewrite(TestOp op, llvm::ArrayRef<mlir::Value> operands, mlir::ConversionPatternRewriter& rewriter) const override
	{
		mlir::Location location = op.getLoc();

		// Split the current block
		mlir::Block* currentBlock = rewriter.getInsertionBlock();
		mlir::Block* continuation = rewriter.splitBlock(currentBlock, rewriter.getInsertionPoint());

		// Inline regions
		mlir::Block* conditionBlock = &op.condition().front();
		mlir::Block* bodyBlock = &op.body().front();

		rewriter.inlineRegionBefore(op.body(), continuation);
		rewriter.inlineRegionBefore(op.condition(), bodyBlock);

		// Start the loop by branching to the "condition" region
		rewriter.setInsertionPointToEnd(currentBlock);
		rewriter.create<mlir::BranchOp>(location, conditionBlock, op.getOperands());

		// Replace the "condition" block terminator with a conditional branch
		rewriter.setInsertionPointToEnd(conditionBlock);
		auto conditionOp = mlir::cast<ConditionOp>(conditionBlock->getTerminator());
		rewriter.replaceOpWithNewOp<mlir::CondBranchOp>(conditionOp, conditionOp->getOperand(0), bodyBlock, conditionOp.args(), continuation, llvm::None);

		// Replace the "body" block terminator with a branch back to the "condition" block
		rewriter.setInsertionPointToEnd(bodyBlock);
		auto bodyYieldOp = mlir::cast<YieldOp>(bodyBlock->getTerminator());
		rewriter.replaceOpWithNewOp<mlir::BranchOp>(bodyYieldOp, conditionBlock, bodyYieldOp.getOperands());

		rewriter.eraseOp(op);
		return mlir::success();
	}
};

TestOp creation:

auto loc = builder.getUnknownLoc();
mlir::Value start = builder.create<mlir::ConstantOp>(loc, builder.getIndexAttr(0));
auto testOp = builder.create<TestOp>(loc, start);

// Condition
builder.setInsertionPointToStart(&testOp.condition().front());
builder.create<ConditionOp>(
		loc,
		builder.create<mlir::ConstantOp>(loc, builder.getBoolAttr(true)),
		testOp.condition().front().getArgument(0));

// Body
builder.setInsertionPointToStart(&testOp.body().front());
mlir::Value oneValue = builder.create<mlir::ConstantOp>(loc, builder.getIndexAttr(1));
mlir::Value nextValue = builder.create<mlir::AddIOp>(loc, testOp.body().front().getArgument(0), oneValue);
builder.create<YieldOp>(loc, nextValue);

builder.setInsertionPointAfter(testOp);

When running the conversion, the “nextValue” conversion from index to i64 leads to this error:

//===-------------------------------------------===//
Legalizing operation : 'mydialect.test'(0x7fffc341f900) {
  * Fold {
  } -> FAILURE : unable to fold

  * Pattern : 'mydialect.test -> ()' {
    ** Insert  : 'std.br'(0x7fffc3448070)
    ** Insert  : 'std.cond_br'(0x7fffc33f5030)
    ** Replace : 'mydialect.condition'(0x7fffc3420ef0)
    ** Insert  : 'std.br'(0x7fffc344bfb0)
    ** Replace : 'mydialect.yield'(0x7fffc34210a0)
    ** Erase   : 'mydialect.test'(0x7fffc341f900)

    // ... trimmed out a bit, in order to show just the failing part ...

    //===-------------------------------------------===//
    Legalizing operation : 'std.br'(0x7fffc344bfb0) {
      "std.br"(%28)[^bb1] : (index) -> ()

      * Fold {
      } -> FAILURE : unable to fold

      * Pattern : 'std.br -> ()' {
        ** Insert  : 'llvm.mlir.cast'(0x7fffc3446828)
        ** Insert  : 'llvm.br'(0x7fffc344e220)
        ** Replace : 'std.br'(0x7fffc344bfb0)

        //===-------------------------------------------===//
        Legalizing operation : 'llvm.mlir.cast'(0x7fffc3446828) {
          %29 = "llvm.mlir.cast"(%28) : (index) -> i64

          * Fold {
          } -> FAILURE : unable to fold

          * Pattern : 'llvm.mlir.cast -> ()' {
            ** Insert  : 'llvm.mlir.cast'(0x7fffc344e2c8)
            ** Replace : 'llvm.mlir.cast'(0x7fffc3446828)

            //===-------------------------------------------===//
            Legalizing operation : 'llvm.mlir.cast'(0x7fffc344e2c8) {
              %29 = "llvm.mlir.cast"(%28) : (index) -> i64

              * Fold {
              } -> FAILURE : unable to fold

              * Pattern : 'llvm.mlir.cast -> ()' {
              } -> FAILURE : pattern was already applied
            } -> FAILURE : no matched legalization pattern
            //===-------------------------------------------===//
          } -> FAILURE : generated operation 'llvm.mlir.cast'(0x00007FFFC344E2C8) was illegal
        } -> FAILURE : pattern failed to match
      } -> FAILURE : no matched legalization pattern
      //===-------------------------------------------===//
    } -> FAILURE : generated operation 'llvm.mlir.cast'(0x00007FFFC3446828) was illegal
  } -> FAILURE : pattern failed to match
} -> FAILURE : no matched legalization pattern
//===-------------------------------------------===//
} -> FAILURE : generated operation 'std.br'(0x00007FFFC344BFB0) was illegal
} -> FAILURE : pattern failed to match
} -> FAILURE : no matched legalization pattern

If I change the YieldOp to make it return the block argument instead of the “nextValue”, the conversion works:

builder.create<YieldOp>(loc, testOp.body().front().getArgument(0));

Sorry for the long write-up, and again many thanks for your help.

I’m afraid I need an executable example, the idea of a repro is for me to be able to step through the infra execution and understand what happens. I suppose you can also do that :slight_smile:

It’s not very clear to me why we even attempt to legalize llvm.mlir.cast. Is it declared illegal? (There are some situations where the infra would attempt to legalize injected ops, but I need to be sure.) My current guess is that the infra, for some reason, attempts to legalize the llvm.mlir.cast inserted in materialization, which triggers further materialization. One way to guard against this is to modify the materialization hook to check where the operands it is supplied with come from; if they come from llvm.mlir.cast %0 : A to B, make it use %0 directly instead of inserting the reverse cast.
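A minimal sketch of that guard, written as a standalone C++ toy model and not against the real MLIR API. `Value` and `CastBuilder` are hypothetical names introduced only for illustration, and types are modelled as plain strings. It shows only the decision a materialization hook could make: if the input was itself produced by a cast from the wanted type, reuse the cast's operand instead of creating the reverse cast.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for an SSA value (hypothetical, not mlir::Value).
struct Value {
  std::string type;
  const Value* castOperand;  // operand of the defining cast, or nullptr
  Value(std::string t, const Value* operand = nullptr)
      : type(std::move(t)), castOperand(operand) {}
};

// Toy materialization hook that avoids back-and-forth casts.
class CastBuilder {
 public:
  const Value* materialize(const Value* input, const std::string& wanted) {
    // Nothing to do if the type already matches.
    if (input->type == wanted) return input;
    // Guard: the input is "cast %0 : wanted to other", so use %0 directly.
    if (input->castOperand && input->castOperand->type == wanted)
      return input->castOperand;
    // Otherwise a new cast is really needed.
    casts_.push_back(std::make_unique<Value>(wanted, input));
    return casts_.back().get();
  }
  std::size_t numCasts() const { return casts_.size(); }

 private:
  std::vector<std::unique_ptr<Value>> casts_;
};
```

In this model, materializing an index value as i64 creates one cast, and materializing that result back as index returns the original value without creating a second cast. The forth-cast is left for DCE to remove if it ends up unused.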

Block arguments are subject to argument materialization, which is configured differently from source/target materialization, so this may explain the difference.

What is the most suitable way to provide you with that? Maybe a repo on GitHub?

Yes it is, but if I don’t, I get an unsupported conversion error. Shouldn’t index → i64 already be handled by the standard LLVMTypeConverter? My converter extends that one, so I suppose its behaviour should be preserved.

Yes indeed, but “nextValue” should not be subject to argument materialization, but to target materialization. My understanding is that this is indeed happening: the default materialization is applied to nextValue, but then, for some unknown reason, the legalization of llvm.mlir.cast introduces a new cast instead of just applying its pattern, which should replace the operation with its operand.

Yes, I can look at a repo given a commit hash and build+run instructions.

This is indicative of the problem. “unsupported cast” is produced by the verifier llvm-project/LLVMDialect.cpp at 5b3fc7180c8e4a2c57946d5e3cc325744a770717 · llvm/llvm-project · GitHub, meaning that the conversion produced an invalid cast operation. Once you have an invalid operation in the mix, weird things start happening because everything in the infrastructure expects operations to be valid. You need to debug why this invalid cast is created in the first place and avoid it.

I’ve created a very small repo: GitHub - mscuttari/mlir-type-test
Please ignore the fact that the code is all inside header files; this is certainly not production code, and my goal was to keep it as compact as possible.
Build and run instructions are in the readme (it’s just a simple cmake project, nothing special). The LLVM and MLIR CMake file paths have to be set, but I don’t think this should be a problem for you :joy:

I had a look at the LLVMTypeConverter, and what I managed to understand is that I don’t need source / target materialization, since my types are just a simple 1-to-1 map of the standard ones (for example, mydialect.bool just maps to i1). In fact, there is no source / target materialization for MLIR’s FloatType or ComplexType; there is just the convertType method, which I have too.
At this point, I no longer understand what I should do.

  • The llvm.mlir.cast operation, which is inserted by the type converter, sometimes creates a recursive conversion if I mark it as illegal, and thus fails. If I mark it as legal the conversion doesn’t fail, but then I can’t convert to LLVM IR because mydialect.bool → i1 or i1 → mydialect.bool llvm.cast operations exist.
  • If I force the source materialization for i1 → mydialect.bool, or the counterpart target materialization, to use UnrealizedConversionCastOp instead of llvm.cast, then I still have UnrealizedConversionCastOp in my IR and again I can’t convert to LLVM IR.
  • If I mark UnrealizedConversionCastOp as illegal, in order to force a conversion, then there is no conversion pattern for it.

This is definitely driving me crazy.

Okay, so I debugged this a bit and can offer a decent explanation.

The main problem with your code is that it tries to convert everything at once. This is known to be problematic in various ways, especially if you want to convert types from A to B to C in the same call to the infrastructure. It being problematic is the reason why I spent time last month making sure all in-tree conversions to LLVM dialect are truly partial, i.e. they don’t subsume each other and run in separate passes (which uncovered a couple of bugs in the infra, which I fixed). It is also one of the reasons to introduce UnrealizedConversionCast as a built-in known to everybody.

The solution to this problem is to do partial conversions, along the lines of @_sean_silva’s presentation. You can have one pass that converts your dialect to std and built-in types, and then just call the normal std-to-llvm pass. This fixes the code in your example (patch diff.patch · GitHub) and also makes it generally simpler to reason about. This is also consistent with how the conversion to LLVM, and generally multi-stage type conversions, are designed.

It might get a bit more complex if your types are not convertible to built-in but only to LLVM directly and you have ops that mix your types and built-in types. In that case, you can have a partial conversion from your dialect to LLVM and configure appropriate materializations in it to insert casts between the LLVM types you produce and built-in counterparts when necessary. Note that this partial conversion should not try converting standard types at the same time. If you hit this case, I would also encourage you to patch the LLVM conversion upstream so it uses UnrealizedConversionCast instead of the custom cast.

The overall flow with casts is supposed to be as follows. Since we don’t want all dialects and all conversions to know about each other, we design partial conversions from type system A to type system B. By partial, we mean that we must insert cast operations (UnrealizedConversionCast) if there are remaining users of the type system A as a result of the conversion. So we can have a partial conversion from dialect A1 to B and from dialect A2 to B, where A1 and A2 both use the type system A, without A1 and A2 knowing about each other. Partial conversions are expected to produce casts that persist, so the cast operation must be declared legal. The last conversion after which we no longer expect anything to use type system A can declare the cast operation illegal so chains of casts get folded away if possible, or the conversion fails otherwise. Alternatively, we can run the canonicalizer pass to fold away noop cast chains.

With that in mind, it is possible to explain the behavior you observe. The conversion infra proceeds as usual and hits the pattern for mydialect.test, which introduces some std operations in the process. All new operations are immediately converted further since they are not declared legal, and are replaced with llvm operations. As this happens, type mismatches appear in the IR between the std ops that were already present in the IR (constant, addi) and the newly created llvm ops, since the already present std operations have not been converted yet. Type mismatches result in source/target materialization, which inserts more new operations, specifically llvm.mlir.cast. These new operations are immediately converted further since they are not declared legal (sounds familiar, doesn’t it?). This triggers target materialization because the type mismatch is still there, which injects a further llvm.mlir.cast, which is immediately converted, leading to potentially infinite recursion that is caught by the infra. Had llvm.mlir.cast been declared legal, the infra wouldn’t have attempted to legalize it.
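The recursion described above can be reproduced in a small standalone C++ toy model. This is only a sketch and not the real conversion driver: `legalize`, `Patterns` and `threadPatterns` are hypothetical names, ops are reduced to their names, and a "pattern" is just the list of op names it generates. The table encodes the behaviour from the debug logs in this thread: the std.br pattern produces llvm.mlir.cast and llvm.br, and the llvm.mlir.cast pattern ends up producing another llvm.mlir.cast because of target materialization.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

using OpName = std::string;
// Toy "pattern" table: op name -> names of the ops its pattern generates.
using Patterns = std::map<OpName, std::vector<OpName>>;

// Recursively legalize `op`. A pattern that is already being applied higher
// up the stack is refused, which models "pattern was already applied".
bool legalize(const OpName& op, const std::set<OpName>& legal,
              const Patterns& patterns, std::set<OpName>& inProgress) {
  if (legal.count(op)) return true;                 // declared legal: stop here
  auto it = patterns.find(op);
  if (it == patterns.end()) return false;           // no pattern available
  if (!inProgress.insert(op).second) return false;  // pattern already applied
  bool ok = true;
  for (const OpName& generated : it->second)
    ok = ok && legalize(generated, legal, patterns, inProgress);
  inProgress.erase(op);
  return ok;
}

// Pattern table mirroring the logs in this thread (an assumption of the model).
Patterns threadPatterns() {
  Patterns p;
  p["std.br"] = {"llvm.mlir.cast", "llvm.br"};
  p["llvm.mlir.cast"] = {"llvm.mlir.cast"};
  return p;
}
```

With only llvm.br in the legal set, legalizing std.br fails at the second llvm.mlir.cast, as in the logs. Adding llvm.mlir.cast to the legal set makes the same call succeed, because the driver no longer tries to convert the cast at all.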

Now, we can argue that the infra just shouldn’t call materialization for ops that already result from a materialization (I am mildly against this: it makes the reasoning even more complex and there may be cases that want a sort of finite-recursive materialization supported by appropriately configured patterns). We can also argue that aborting the conversion is undesirable if one of the produced ops was illegal and couldn’t be made legal immediately, but could be made legal further down the line. (In this case, the illegal cast could be removed after the addi is converted). But again, this sounds like adding more complexity, and it is unclear to me how this could combine with other infra invariants.

A partially alternative solution to the problem you are facing is to move the dialect cast pattern to be a folder instead, which is something we should do anyway, still declare the cast op legal and run the canonicalizer after the single conversion pass to clean up noop chains as explained above.

Note that the implementation of this class, in particular the fact that it derives from TypeConverter, is just legacy that takes time to clean up.

You can use UnrealizedConversionCast for your dialect. If you follow the partial conversion scheme described above, you only need casts if operations on mydialect persist after the conversion, which doesn’t seem to be the case, so you actually never need the cast.

If all ops are ultimately convertible to the LLVM dialect, you will end up with %1 = cast %0 : !llvm.type to !foo followed by %2 = cast %1 : !foo to !llvm.type which can be easily removed by a canonicalizer sweep. Then you can translate to LLVM IR.
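What such a sweep does can be sketched in standalone C++, under heavy simplification and without using the actual canonicalizer. `Op` and `foldCastChains` are hypothetical names; values are integer ids, types are strings, and ops are assumed to be listed in def-before-use order. The sketch forwards users through A → B → A round trips and then drops casts that have no users left.

```cpp
#include <algorithm>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Toy operation: casts have name "cast", one operand and from/to types.
struct Op {
  std::string name;
  int result;                 // id of the produced value, -1 if none
  std::vector<int> operands;  // ids of the used values
  std::string from;           // source type (casts only)
  std::string to;             // result type (casts only)
};

std::vector<Op> foldCastChains(std::vector<Op> ops) {
  // Index the casts by the value they define.
  std::map<int, std::size_t> castIndex;
  for (std::size_t i = 0; i < ops.size(); ++i)
    if (ops[i].name == "cast") castIndex[ops[i].result] = i;

  // Step 1: if an operand is cast(cast(x : A to B) : B to A), use x directly.
  for (Op& user : ops) {
    for (int& operand : user.operands) {
      auto outer = castIndex.find(operand);
      if (outer == castIndex.end()) continue;
      const Op& back = ops[outer->second];
      if (back.operands.size() != 1) continue;
      auto inner = castIndex.find(back.operands[0]);
      if (inner == castIndex.end()) continue;
      const Op& forth = ops[inner->second];
      if (forth.operands.size() == 1 && forth.from == back.to &&
          forth.to == back.from)
        operand = forth.operands[0];
    }
  }

  // Step 2: walk backwards and drop casts whose result is no longer used.
  std::map<int, int> uses;
  for (const Op& op : ops)
    for (int operand : op.operands) ++uses[operand];
  std::vector<Op> kept;
  for (std::size_t i = ops.size(); i-- > 0;) {
    const Op& op = ops[i];
    if (op.name == "cast" && uses[op.result] == 0) {
      for (int operand : op.operands) --uses[operand];
      continue;
    }
    kept.push_back(op);
  }
  std::reverse(kept.begin(), kept.end());
  return kept;
}
```

For the two casts quoted above followed by a single user of %2, the sketch leaves only the user, now reading %0. A cast whose types do not cancel out is kept, which corresponds to the case where the final conversion has to fail.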

Conversion patterns don’t appear out of thin air. If you don’t add them, they are not there.

I find it sufficiently logical although poorly documented.

Many thanks, I’m trying to follow your suggestions and will report back on the progress.

Just to be sure: by “custom cast” you mean llvm.mlir.cast, right?

This is one thing I was searching for but didn’t know where to look precisely. So you are saying that after a pattern has been applied, the infra will search for the subsequent uses of the result values and insert a cast in case of a type mismatch. Did I understand correctly? Or are the casts inserted when the uses are encountered later?
To be honest I don’t know if there is any possible difference between the two scenarios but I’m curious about this.

Well, yes :joy:

So the purpose of UnrealizedConversionCastOp is to oblige the user to define their own conversion pattern for the UnrealizedConversionCastOp operation? That seems logical to me, since the user is the only one who knows the dialect and its types, but feedback on this would be appreciated.

For sure it is, it was just a feeling that is caused by my limited knowledge of the whole infrastructure.

Yes.

The easiest way is to find any debug message you see in the code with grep, and follow the call stack from there.

Target materialization - llvm-project/DialectConversion.cpp at b0f0115308e4e8692b254c3b0e20f3743616b2d5 · llvm/llvm-project · GitHub
Source materialization 1 - llvm-project/DialectConversion.cpp at b0f0115308e4e8692b254c3b0e20f3743616b2d5 · llvm/llvm-project · GitHub
Source materialization 2 - llvm-project/DialectConversion.cpp at b0f0115308e4e8692b254c3b0e20f3743616b2d5 · llvm/llvm-project · GitHub
Argument materialization - llvm-project/DialectConversion.cpp at b0f0115308e4e8692b254c3b0e20f3743616b2d5 · llvm/llvm-project · GitHub

As you’ll notice, target materialization happens before calling a pattern in case of operand type change (because the pattern expects operands to have correct types). Argument materialization is applied in signature conversion, for a similar reason. Source materialization is called after the end of conversion if there are live users of a value that changed type (two occurrences depending on value being OpResult or BlockArgument).

No, it is only necessary to force type casts opaquely. If you are doing partial conversions of ops operating on a set of types A to ops operating on a set of types B, you can inject casts from A to B and back. After you are done with the conversions, there shouldn’t be any remaining op that uses the types from the set A, other than the casts feeding each other. At that point, these casts can be folded away by the canonicalizer. You can trigger the folding either by running the canonicalizer as a pass or by marking the cast op illegal in the last conversion.

I’ve rewritten my full conversion pass into two partial ones. The first converts all the operations of mydialect into a mix of std, scf and llvm ones. The type converter used in this pass creates the materializations using UnrealizedConversionCast, and at the end of the pass I get to the point you described, in which mydialect types are only used to feed the casts.
However, I’m still encountering a recursive cast behaviour, which I managed to reproduce with this small test case. Consider it as if it were the output of my first conversion pass, which is why it includes the UnrealizedConversionCast operations.

// test.mlir

module  {
  func @main() -> () {
    %c0 = constant 0 : i64
    %c0i = unrealized_conversion_cast %c0 : i64 to index
    %c5 = constant 5 : i64
    %c5i = unrealized_conversion_cast %c5 : i64 to index
    %c1 = constant 1 : i64
    %c1i = unrealized_conversion_cast %c1 : i64 to index
    scf.for %arg1 = %c0i to %c5i step %c1i {
        %x = unrealized_conversion_cast %arg1 : index to i64
    }
    return
  }
}

By running the scf → std and std → llvm passes, I get a recursive cast on the arguments of the branches generated by the for’s conversions.

mlir-opt test.mlir -convert-scf-to-std | mlir-opt -convert-std-to-llvm -debug-only=dialect-conversion

Part of the output:

//===-------------------------------------------===//
Legalizing operation : 'std.br'(0x7fffbe9d0e70) {
  "std.br"(%1)[^bb1] : (index) -> ()

  * Fold {
  } -> FAILURE : unable to fold

  * Pattern : 'std.br -> ()' {
    ** Insert  : 'llvm.mlir.cast'(0x7fffbe9f17b8)
    ** Insert  : 'llvm.br'(0x7fffbe9f1890)
    ** Replace : 'std.br'(0x7fffbe9d0e70)

    //===-------------------------------------------===//
    Legalizing operation : 'llvm.mlir.cast'(0x7fffbe9f17b8) {
      %6 = "llvm.mlir.cast"(%1) : (index) -> i64

      * Fold {
      } -> FAILURE : unable to fold

      * Pattern : 'llvm.mlir.cast -> ()' {
        ** Insert  : 'llvm.mlir.cast'(0x7fffbe9f19c8)
        ** Replace : 'llvm.mlir.cast'(0x7fffbe9f17b8)

        //===-------------------------------------------===//
        Legalizing operation : 'llvm.mlir.cast'(0x7fffbe9f19c8) {
          %6 = "llvm.mlir.cast"(%1) : (index) -> i64

          * Fold {
          } -> FAILURE : unable to fold

          * Pattern : 'llvm.mlir.cast -> ()' {
          } -> FAILURE : pattern was already applied
        } -> FAILURE : no matched legalization pattern
        //===-------------------------------------------===//
      } -> FAILURE : generated operation 'llvm.mlir.cast'(0x00007FFFBE9F19C8) was illegal
    } -> FAILURE : pattern failed to match
  } -> FAILURE : no matched legalization pattern
  //===-------------------------------------------===//
} -> FAILURE : generated operation 'llvm.mlir.cast'(0x00007FFFBE9F17B8) was illegal
} -> FAILURE : pattern failed to match
} -> FAILURE : no matched legalization pattern

P.S.:

The type converter I use in my first pass derives from LLVMTypeConverter, because I need to create LLVM operations that operate on the converted index type. Inside the converter, I set the materializations to always generate UnrealizedConversionCast instead of llvm.mlir.cast. Is this ok?

EDIT: I managed to get a first multi-stage partial conversion working. What I’m writing is not directly related (at least, I think so) to what is written right above in this post, but one thing I didn’t take care of was the fact that the block arguments of a function are converted when the pattern is applied to the function itself. This is nothing strange, but I didn’t take it into account, and I put the std → llvm conversion patterns in the first-stage conversion; thus I was converting my std.function to llvm.function before converting the SCF operations. Moving the std → llvm conversion after the scf → std one made the block arguments generated by the scf operations get converted.

No, it’s not. None of the downstream passes know how to handle UnrealizedConversionCast, so the std-to-llvm conversion will ultimately fail; it’s a full conversion and the op is neither legal nor handled in it. It does fail earlier because of target materialization, but even if it had not, it would have failed later.

I replaced unrealized_conversion_cast with llvm.mlir.cast in your snippet, and mlir-opt -convert-scf-to-std -convert-std-to-llvm worked for me just fine. This can be achieved either by having materialization insert llvm casts between std and llvm types and unrealized casts otherwise or by having a separate pass after your conversion that replaces UnrealizedConversionCasts with LLVM::DialectCast when possible.

The future-proof solution is to port the various *-to-llvm conversions to use UnrealizedConversionCasts. I don’t have time for that now, but I welcome and can review patches in that direction.