Heuristic in FusionOnTensor.cpp

I see FusionOnTensor.cpp has the following heuristic in place. It seems only picking one LinalgOp to tile and fuse in a function. Was there some specific reason to pick this heuristic? Thanks.

    // Heuristic to find a good operation to tile and start fusion. Walk all
    // operations and select the one with the maximal backward slice of fusion
    // candidates.
    LinalgOp rootOp = nullptr;
    int64_t numFusionCandidates = -1;
    funcOp.walk([&](LinalgOp linalgOp) {
      SetVector<Operation *> backwardSlice;
      getBackwardSlice(linalgOp, &backwardSlice);
      int64_t backwardSliceSize = count_if(
          backwardSlice, [](Operation *op) { return isa<LinalgOp>(op); });
      if (backwardSliceSize > numFusionCandidates) {
        rootOp = linalgOp;
        numFusionCandidates = backwardSliceSize;
    if (!rootOp)
      return notifyFailure("expect to find a root operation");

Hi @ruizhang,

The LinalgTileAndFuseTensorOps pass is currently indeed just searching for the operation with the most fusion potential and fuses its direct producers. ⚙ D110262 [mlir][linalg] Add support for transitive fusion. at some point will add support for transitive fusion. We went for this simple heuristic since the LinalgTileAndFuseTensorOps pass is mostly used for testing and the heuristic is a good fit there. It may also work for small programs where fusing everything is an option.

We are also working on making fusion available from CodegenStrategy. However, it will take some more time to make it flexible enough for more complex optimization scenarios.

If you have a specific heuristic in mind for you use case, you may want to write your own pass using tileConsumerAndFuseProducers. The method implements the bulk of the fusion logic.

Hope this helps!

1 Like

Hi @gysit, thanks for the information! I asked a question related to tiling in another thread too Difference between --linalg-tile and --linalg-tile-and-fuse-tensor-ops - MLIR - LLVM Discussion Forums. Could you please help take a look at that question as well? Thanks.