[RFC] A New "One-Shot" Dialect Conversion Driver

Now that the 1:N dialect conversion refactoring is done, I am continuing the work on this RFC.

Outline of Next Steps

  1. Remove rollback functionality from the dialect conversion driver. This is a breaking API change.
    1.1. Update all conversion patterns in MLIR / Flang / … that currently trigger rollbacks. These are mostly patterns that start modifying the IR and then return failure(). Most patterns have already been updated over the last few weeks, but a few remain, mostly in the SPIRV dialect.
    1.2. Add a new allowPatternRollback flag to ConversionConfig that existing users of the dialect conversion framework can use to find patterns that trigger a rollback. These patterns must be updated.
    1.3. After some time, once all downstream users have had enough time to migrate their patterns, delete all rollback-related code.
  2. Internal refactoring: materialize all IR changes immediately. This step is mostly NFC from a user’s perspective. I say “mostly” because some things change: e.g., an operation can no longer be removed when it still has uses. (This is allowed today, as long as all users are guaranteed to be removed by the end of the conversion.)
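The kind of pattern update described in Step 1.1 can be illustrated with a toy model. The sketch below is plain C++ and does not use the MLIR API: `ToyIR`, `ToyPattern`, `ToyDriver`, and both pattern functions are hypothetical names made up for illustration, and only the flag name `allowPatternRollback` is borrowed from this proposal. It contrasts a pattern that mutates the IR before returning failure with the same pattern rewritten to check its preconditions first.

```cpp
#include <cassert>
#include <functional>
#include <stdexcept>
#include <string>
#include <vector>

// Toy stand-in for IR: a flat list of op names. Not an MLIR type.
using ToyIR = std::vector<std::string>;

// A toy pattern returns true on success and false on failure.
using ToyPattern = std::function<bool(ToyIR &)>;

// Problematic style: starts modifying the IR, then bails out.
bool modifyThenFail(ToyIR &ir) {
  ir.push_back("helper.op"); // The IR is already modified here...
  if (ir.front() != "supported.op")
    return false; // ...so this failure requires a rollback.
  ir.front() = "lowered.op";
  return true;
}

// Updated style: check all preconditions before touching the IR.
bool checkThenModify(ToyIR &ir) {
  if (ir.empty() || ir.front() != "supported.op")
    return false; // Nothing was modified, so no rollback is needed.
  ir.push_back("helper.op");
  ir.front() = "lowered.op";
  return true;
}

// Toy driver. With rollback allowed, it restores a snapshot after a
// "dirty" failure. With rollback disallowed, it reports the offending
// pattern instead (assumption: modeled here as an exception).
struct ToyDriver {
  bool allowPatternRollback = true;

  bool apply(const ToyPattern &pattern, ToyIR &ir) const {
    ToyIR snapshot = ir;
    if (pattern(ir))
      return true;
    if (ir == snapshot)
      return false; // Clean failure: the pattern left the IR untouched.
    if (!allowPatternRollback)
      throw std::logic_error("pattern modified the IR and then failed");
    ir = snapshot; // Roll back the partial modification.
    return false;
  }
};

// Small usage example: the strict driver accepts the updated pattern.
void demoStrictDriver() {
  ToyIR ir = {"unsupported.op"};
  ToyDriver strict;
  strict.allowPatternRollback = false;
  bool ok = strict.apply(checkThenModify, ir);
  assert(!ok && ir.size() == 1); // Clean failure, IR unchanged.
}
```

In this model, running `modifyThenFail` through a driver with `allowPatternRollback = false` raises an error, which is the toy analogue of using the flag to find patterns that still need to be updated. How the real driver reports such patterns is not specified by this sketch.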

I expect Step 1 to improve the robustness of the dialect conversion framework. The rollback mechanism is error-prone, and many bugs/crashes/… in the past were rollback-related. I am quite certain that there are more bugs in the rollback mechanism that are either unknown or that we are working around today.

I expect Step 2 to simplify the code base and improve the compile-time performance. Internal data structures such as ConversionValueMapping and IRRewrite (including subclasses) can be deleted. No additional bookkeeping is needed to keep track of changes that have only partly materialized. Step 2 will also enable use cases that are currently not safely supported such as mixing conversion patterns and rewrite patterns.
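The behavior change mentioned in Step 2 (an operation can no longer be removed while it still has uses) follows directly from materializing changes immediately. Below is a minimal sketch in plain C++, again a toy model rather than the MLIR API: `ToyOp`, `ToyBlock`, and `eraseImmediately` are hypothetical names, and returning `false` for a rejected erase is an assumption made to keep the example simple.

```cpp
#include <algorithm>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Toy operation: a name plus the ops whose results it uses.
struct ToyOp {
  std::string name;
  std::vector<ToyOp *> operands;
};

// Toy block that owns its operations.
struct ToyBlock {
  std::vector<std::unique_ptr<ToyOp>> ops;

  ToyOp *create(std::string name, std::vector<ToyOp *> operands = {}) {
    ops.push_back(std::make_unique<ToyOp>(
        ToyOp{std::move(name), std::move(operands)}));
    return ops.back().get();
  }

  // Returns true if any op in the block uses `op` as an operand.
  bool hasUses(const ToyOp *op) const {
    for (const auto &user : ops)
      for (const ToyOp *operand : user->operands)
        if (operand == op)
          return true;
    return false;
  }

  // Immediate erasure: the op is destroyed right away, so it must not
  // have any remaining uses. A deferred scheme could postpone this check
  // to the end of the conversion; an immediate one cannot.
  bool eraseImmediately(ToyOp *op) {
    if (hasUses(op))
      return false; // Erasing now would leave dangling operands.
    auto it = std::find_if(
        ops.begin(), ops.end(),
        [op](const std::unique_ptr<ToyOp> &p) { return p.get() == op; });
    if (it == ops.end())
      return false;
    ops.erase(it);
    return true;
  }
};
```

In this model, a defining op can only be erased after its users have been erased (or rewired to other values), which mirrors the ordering constraint that patterns would have to respect once changes are materialized immediately.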

Status of Step 1.1:

Failed Tests (14):
  MLIR :: Conversion/ArithToSPIRV/arith-to-spirv.mlir
  MLIR :: Conversion/ArithToSPIRV/fast-math.mlir
  MLIR :: Conversion/ConvertToSPIRV/vector.mlir
  MLIR :: Conversion/MathToSPIRV/math-to-gl-spirv.mlir
  MLIR :: Conversion/SCFToGPU/parallel_loop.mlir
  MLIR :: Conversion/TosaToLinalg/tosa-to-linalg-invalid.mlir             (fixed by #136308)
  MLIR :: Conversion/VectorToSPIRV/vector-to-spirv.mlir
  MLIR :: Dialect/SPIRV/IR/target-env.mlir
  MLIR :: Integration/Dialect/Linalg/CPU/unpack-dynamic-inner-tile.mlir   (unrelated ASAN error)
  MLIR :: Integration/Dialect/SparseTensor/CPU/reshape_dot.mlir           (unrelated ASAN error)
  MLIR :: Transforms/test-legalize-type-conversion.mlir                   (testing rollback)
  MLIR :: Transforms/test-legalizer-full.mlir                             (testing rollback)
  MLIR :: Transforms/test-legalizer.mlir                                  (testing rollback)
  MLIR :: Transforms/test-merge-blocks.mlir                               (testing rollback)

Most patterns have already been updated. (The changes were quite simple.) I have not looked at Flang and other projects yet. The tests listed above exercise the remaining patterns that fail when rollback is disallowed. There are likely additional patterns with poor test coverage that trigger a rollback in certain cases and are not covered by the tests above.

While I am working on updating the last SPIRV and GPU-related patterns, I’d like to ask people to run their test suites with this PR to get a feeling for how many patterns must be updated.

Please let me know if you have any concerns.
