Conversions with Multiple Target Dialects

The directory structure of lowerings/conversions in MLIR suggests that a dialect is always lowered to one lower dialect at a time. If a dialect (A) is lowered to multiple lower dialects (B and C), the patterns appear to be separated into two passes (A to B and A to C). However, this is not always possible; e.g., the SCF dialect is not really useful by itself. Any conversion to SCF is likely also a conversion to some other dialect.

In our concrete case, we lower the shape dialect to standard and SCF dialects. Some patterns are lowered to standard operations only while others require a mixture of standard and SCF operations.

I think at least the vector dialect is in a very similar situation here, as both the VectorToSCF and the ShapeToSCF lowerings also lower to standard operations.

  • Is this a general problem or is it specific to the shape and vector dialect?
  • Do we want to allow for conversions with multiple target dialects?
    (e.g. ShapeToSCF and ShapeToStandard could be merged into ShapeToSCFAndStandard)
  • Is the standard dialect always implicitly allowed?
    (VectorToSCF and ShapeToSCF are two examples that currently assume this)

There are several other cases with more than one dialect involved on either side of the conversion: “-lower-affine” goes from affine to a mix of scf and std, there are several dialects that lower to the LLVM dialect together, e.g. “-convert-linalg-to-llvm” subsumes “-convert-std-to-llvm”, etc.

The rationale for lib/Conversion/AToB was that the conversion patterns should depend on both A and B, so it does not make sense to have them in either the A or B dialect and thus create a dependency between them. By extension, it makes sense to have AToBCD for patterns that produce a mix of operations from the B, C and D dialects.


Another thing that we stumbled across is a use case for the combination of multiple source dialects. In the concrete case, we have a bug that appears only when lowering SCF to standard and standard to LLVM simultaneously. With the conversion passes SCFToStandard and StandardToLLVM this bug is not reproducible and hence it is difficult to write a test for the fix (see https://reviews.llvm.org/D85634).

Adding a conversion pass SCFAndStandardToLLVM that does nothing more than combining the already-existing patterns would help. Such a pass might also be useful elsewhere.

Is this desired? An alternative would be the inclusion of such a pass solely for testing purposes. In that case, it could live under StandardToLLVM.

Personally, in this case, I view the use of SCF as merely a device used as part of lowering shape to std. The fact that the lowering is factored nicely enough that it can possibly be reused if SCF is being converted to non-std control flow is a convenient feature for power users, but not enough to deserve a directory structure.

I would personally advocate for putting everything under ShapeToStandard, and exposing a convert-shape-to-std pass.

Also, I personally am not a huge fan of large conversion passes that aggregate multiple logically separate conversions into the same pass. So I would further advocate for exposing a “shape-to-std-pipeline” pass pipeline, consisting of:

  • shape to shape in-dialect lowering pass
  • lowering shape ops to std + scf pass
  • lowering scf to std pass

And then run lowering to LLVM as a separate pass. This is much easier to debug IMO, and would have avoided this issue in the first place (though I haven’t looked at the details in your patch; it might still be an independently useful fix).
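As a toy illustration of the staged-pipeline idea (a deliberately simplified IR model with hypothetical pass names, not MLIR's actual C++ API), each "pass" below rewrites a list of (dialect, op) pairs, and the driver snapshots the module after every pass, much like what `--print-ir-after-all` exposes:

```python
# Toy model of a staged lowering pipeline. Ops are (dialect, name)
# pairs; each pass rewrites some of them. Running passes one at a
# time yields inspectable checkpoints between stages.

def shape_to_std_scf(module):
    # Lower shape ops to a mix of std and scf ops (illustrative only).
    out = []
    for dialect, op in module:
        if dialect == "shape":
            out += [("scf", "for"), ("std", "addi")]
        else:
            out.append((dialect, op))
    return out

def scf_to_std(module):
    # Lower remaining scf ops to std control flow (illustrative only).
    return [("std", "br") if d == "scf" else (d, op) for d, op in module]

def run_pipeline(module, passes):
    checkpoints = [list(module)]
    for p in passes:
        module = p(module)
        checkpoints.append(list(module))  # snapshot after each pass
    return module, checkpoints

final, snaps = run_pipeline([("shape", "broadcast"), ("std", "constant")],
                            [shape_to_std_scf, scf_to_std])
```

The intermediate snapshots are exactly the "checkpoints" that make a multi-pass pipeline easier to debug than a single aggregated conversion.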

Why do you need to lower simultaneously?

This sounds like some intricate ordering problem in the pattern infrastructure. Maybe we should consider improving the infrastructure. Note that @nicolasvasilache and I have run into situations where we want to control the order in which patterns are applied, so it may be common enough to think about some support.

I would say the exact opposite of this is desired. Having DialectAToBAndCAndD is justified if we have a pattern that rewrites one op from dialect A into ops from B, C and D simultaneously. If we don’t do this rewrite in a single shot, we will have to introduce abstractions in B, C and D that make sure we can decompose the conversion, and that by itself isn’t a sufficient justification for increasing the number of abstractions in a given dialect. On the other hand, DialectAAndBAndCToD is an assembly of patterns where each pattern only knows about the two dialects it can handle (unless there is an inst-combine pattern that converts a DAG of ops from A, B and C into a single op from D, but I’m not aware of such cases). These patterns can and, IMO, should remain independent. Otherwise, assumptions about other dialects being present will creep in and we will end up with a combinatorial number of DialectAAndBAndMaybeSomeCToDAndEAndSomeFButNotAllOpsFromF.

Now, the LLVM dialect is a bit special because it uses a different type system than the dialects that convert into it, and furthermore this system is closed. This has led to us having std-to-llvm, vector-to-llvm that subsumes std-to-llvm, and linalg-to-llvm that subsumes std-to-llvm, vector-to-llvm, linalg-to-scf and scf-to-std. This is partly due to the original vision of the pattern rewriting infrastructure that existed before most of the infrastructure was implemented: we would have a huge bag of patterns and make the infrastructure do the conversion in a single shot. Since then, we realized that progressive lowering is a desirable feature because it makes it easier to reuse and debug individual patterns, and the infrastructure is slowly evolving in that direction. Linalg-to-LLVM exists because DialectCastOp did not exist at that point, so we had to convert absolutely all ops all the way down to the LLVM dialect. (For the record, we originally discussed having a cast op, but it sounded like a magic cast-any-type-to-any-other-type that did not have semantics, so some people including myself were strongly opposed to that. Dialect-specific ops are the consensus.) We should split linalg-to-llvm apart, make sure it lowers only linalg ops and inserts appropriate casts, and just call the rest of the passes as usual.

If it’s for testing purposes only, it can live under test/lib/ and can depend on anything. However, I will be opposed to making StandardToLLVM depend on the SCF dialect, especially if the dependency is only necessary for testing purposes.

+1 here. I’d go for a pipeline with multiple passes. Shape-to-Shape can live under lib/Dialect/Shape/Transforms, Shape to std+scf can live under lib/Conversion/ShapeToSCFAndStd. The rest should already exist and be applicable as is.

Now that isn’t what is happening here: these are logically related.

And as you mentioned this is really a shape to std conversion which just happens to use SCF but SCF is not the target, it is not what is being converted to.

-1 from me: this limits the usability of conversion targets and dialect lowering. It adds explicit ordering and forces all ops through the same pipeline. E.g., what if my target wants an op from dialect B? Now I have to avoid passes from B to C that would change that op, and passes from B to B need to consider this too (so at the very least you are threading conversion targets through, if using the dialect conversion framework). Versus simply having these patterns live separately and having all the populate methods invoked from a single pass which uses a conversion target.

As a general organizing method, having the passes all separate reduces their usability and generality for me.
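The "separate patterns, single pass with a conversion target" approach can be sketched with a toy model (hypothetical names and a crude stand-in for the dialect conversion driver, not MLIR's actual API): two independent pattern sets are merged, and a legality predicate decides when rewriting stops, with no ordering baked into the pattern sets themselves.

```python
# Independent pattern sets: each maps an illegal op to its expansion.
scf_to_std_patterns = {"scf.for": ["std.br", "std.cond_br"]}
std_to_llvm_patterns = {"std.addi": ["llvm.add"], "std.br": ["llvm.br"],
                        "std.cond_br": ["llvm.cond_br"]}

def apply_conversion(ops, patterns, is_legal):
    # Repeatedly rewrite illegal ops until everything is legal
    # (a very crude stand-in for the dialect conversion driver).
    changed = True
    while changed:
        changed = False
        out = []
        for op in ops:
            if not is_legal(op) and op in patterns:
                out += patterns[op]
                changed = True
            else:
                out.append(op)
        ops = out
    return ops

# One pass "populates" both pattern sets; the target picks the endpoint.
patterns = {**scf_to_std_patterns, **std_to_llvm_patterns}
target = lambda op: op.startswith("llvm.")  # only LLVM ops are legal
result = apply_conversion(["scf.for", "std.addi"], patterns, target)
```

A different client could reuse the same pattern sets with a target that keeps some std ops legal, without any pass-ordering changes.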

I don’t think it is required; it was only observed when lowering simultaneously.

I don’t think that is special; this is true to varying degrees of other dialects (e.g., ngraph has its own tensor type; sure, it reuses i32 while the LLVM dialect may be more extreme and not reuse any, but an ngraph tensor with an ngraph dialect type is no different).

The SPIR-V conversions may be an even better comparison, as both are more on the export side and restrict the ops and types allowed to match the export target.

I don’t see how this limits the usability of those things. It does not modify or even use them. If anything, it provides evidence that the usability of conversion targets and dialect lowering can be improved.

I can guess you meant that such an approach composes poorly with other potential uses of the conversion infra. That would have been true if the approach hadn’t been pattern-based. The discussion here is how to organize those patterns into conversion passes, each of which may make one or several calls to the dialect conversion infra. Absolutely nothing prevents one from exposing individual patterns and using them if the need arises.

You lost me here.

FWIW, conversion targets work both ways. You can say “I want only ops from X” as well as “I want any ops except those from X”.
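The two ways of phrasing a target can be illustrated with a small hypothetical helper (mirroring the spirit of MLIR's addLegalDialect/addIllegalDialect, but not its actual API): a whitelist target legalizes only listed dialects, a blacklist target legalizes everything else.

```python
# Toy legality predicates for ops named "dialect.op".

def make_target(legal_dialects=None, illegal_dialects=None):
    if legal_dialects is not None:
        # "I want only ops from X": whitelist the legal dialects.
        return lambda op: op.split(".")[0] in legal_dialects
    # "I want any ops except those from X": blacklist the illegal ones.
    return lambda op: op.split(".")[0] not in illegal_dialects

only_llvm = make_target(legal_dialects={"llvm"})
no_shape = make_target(illegal_dialects={"shape"})
```

The same pattern set can thus serve either a full lowering (whitelist) or a partial one that merely eliminates one dialect (blacklist).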

It’s the classical size/complexity trade-off (also known as spaghetti code vs ravioli code). Having one pass that does absolutely everything and has a config with hundreds of options isn’t very usable.

I never said it was the only special case. However, most pattern users don’t require type changes. The convert-X-to-LLVM use case has driven a large part of the type-conversion work so far, although it was not the only one.

I don’t quite get what you mean. While I agree that ideally the conversion framework should support mix-and-match like what is attempted here, that does not mean it is desirable to jump multiple levels of lowering together in the same conversion.
In particular, the “pipeline” approach with explicit steps exposes clear “checkpoints” that allow debugging with --print-ir-after-all.

I definitely support always exposing the populate*Patterns function for power users that are truly doing something exceptional where the individual passes are too coarse granularity.

I just think that we should try to expose individual logical passes organized into common pipelines as well because we already need the boilerplate for the passes anyway for testing (and the things that make sense to test separately often is a good indicator of things that we want separate in -print-ir-before-all output). They also raise the abstraction level for users who want the typical case (perhaps one could argue that there is no “typical case” though).

Wow, this discussion has evolved far beyond my problem. Thank you for your thoughts on this.

I never wanted to lower Shape to SCF to standard in one conversion. The goal was always to convert Shape to a mixture of Standard and SCF. There are, however, some patterns that need no SCF at all, and I was wondering if that is a good enough reason to separate the passes ShapeToSCF(andStd) and ShapeToStd.

I guess the outcome of this discussion is that it would be useful to merge the two conversions .../Conversion/ShapeToStandard and .../Conversion/ShapeToSCF because they are not really useful individually. As for the name, ShapeToStd would do (with SCF as an auxiliary dialect), but we could make it more explicit by calling it ShapeToStdAndSCF.

As for building pipelines, that can always happen on top if one encounters a common combination of passes. I do not really see the need here as the tests expect a mix of SCF and Std.

Like @jpienaar expected, there is no need to do anything simultaneously here. We are happy with multiple subsequent passes, it was just the situation in which we encountered the unexpected conversion failure.

btw, some of the ShapeToStd patterns just look like unrolled versions of the ShapeToSCF patterns. I wonder if we could remove the duplication (by e.g. invoking an unroll utility at the end of the pattern if applicable)?
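The deduplication idea can be sketched in miniature (hypothetical helpers, not the actual ShapeToStd/ShapeToSCF patterns): emit the loop form once, then unroll it when the trip count is static, instead of maintaining a separate hand-unrolled pattern.

```python
# Toy ops as tuples; a loop carries its trip count and body.

def emit_loop(body_ops, trip_count):
    # The single "loop-form" pattern both lowerings could share.
    return [("scf.for", trip_count, body_ops)]

def unroll(ops):
    # A small unroll utility applied when the trip count is static.
    out = []
    for op in ops:
        if op[0] == "scf.for":
            _, n, body = op
            out += body * n  # replicate the body n times
        else:
            out.append(op)
    return out

unrolled = unroll(emit_loop([("std.addi",)], 3))
```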

@_sean_silva, which ones do you mean? Btw., here is the CL with the merged passes.

I think you caught them in the CL. Thanks for the cleanup!