No fusion when there is AffineIfOp

I see this test for affine-loop-fusion here:

func @should_not_fuse_if_inst_at_top_level() {
  %m = memref.alloc() : memref<10xf32>
  %cf7 = constant 7.0 : f32

  affine.for %i0 = 0 to 10 {
    affine.store %cf7, %m[%i0] : memref<10xf32>
  }
  affine.for %i1 = 0 to 10 {
    %v0 = affine.load %m[%i1] : memref<10xf32>
  }
  %c0 = constant 4 : index
  affine.if #set0(%c0) {
  }
  // Top-level IfOp should prevent fusion.
  // CHECK:      affine.for %{{.*}} = 0 to 10 {
  // CHECK-NEXT:   affine.store %{{.*}}, %{{.*}}[%{{.*}}] : memref<10xf32>
  // CHECK-NEXT: }
  // CHECK:      affine.for %{{.*}} = 0 to 10 {
  // CHECK-NEXT:   affine.load %{{.*}}[%{{.*}}] : memref<10xf32>
  // CHECK-NEXT: }
  return
}

I think the affine.if is unrelated to the first two affine.for loops (its body is empty and it does not access %m), so those two loops should still be fused.
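To make the expectation concrete, here is a hand-written sketch of what fusing the two loops could look like. It is not actual output of the affine-loop-fusion pass, which may also shrink or privatize %m. The function name is made up for illustration, and the definition of #set0 is a stand-in assumption because the test file's own definition is not quoted above.

```mlir
// Assumed stand-in for the #set0 used by the test; the real definition
// lives elsewhere in the test file and may differ.
#set0 = affine_set<(d0) : (d0 - 1 >= 0)>

// Hypothetical name; hand-fused version of the test function above.
func @fused_by_hand() {
  %m = memref.alloc() : memref<10xf32>
  %cf7 = constant 7.0 : f32

  // Producer store and consumer load now share one loop. Iteration %i0
  // reads only the element written in the same iteration, so the loaded
  // values are the same as in the unfused version.
  affine.for %i0 = 0 to 10 {
    affine.store %cf7, %m[%i0] : memref<10xf32>
    %v0 = affine.load %m[%i0] : memref<10xf32>
  }

  // The top-level affine.if is left in place; it does not touch %m.
  %c0 = constant 4 : index
  affine.if #set0(%c0) {
  }
  return
}
```

The sketch only moves the load next to the store; the affine.if stays where it was, which is why it should not need to block the fusion.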

Another example: when an affine.if appears inside an affine.for near the end of a function, none of the other affine.for loops get fused.

#set = affine_set<(d0) : (d0 - 1 >= 0)>
func @test_fusion() {
  %a = memref.alloc() : memref<10x10xf32>
  %b = memref.alloc() : memref<10x10xf32>
  %cf7 = constant 7.0 : f32

  affine.for %i0 = 0 to 10 {
    affine.for %i1 = 0 to 10 {
      affine.store %cf7, %a[%i0, %i1] : memref<10x10xf32>
    }
  }
  affine.for %i2 = 0 to 10 {
    affine.for %i3 = 0 to 10 {
      %v0 = affine.load %a[%i3, %i2] : memref<10x10xf32>
      affine.store %v0, %b[%i2, %i3] : memref<10x10xf32>
    }
  }
  affine.for %i4 = 0 to 10 {
    affine.for %i5 = 0 to 10 {
      affine.if #set(%i5) {
        %v1 = affine.load %b[%i4, %i5] : memref<10x10xf32>
      } else {}
    }
  }
  return
}

Is this the expected behavior, or a bug we should fix? To me, it looks like a bug.

This looks like extremely conservative behavior that should be addressed now.

@bondhugula I found the code responsible for this conservative behavior and will create a patch for it soon.
