First question: is this the best way to write the above computation? As far as I can tell, the affine dialect’s if operation only tests integer set containment, so scf.if seems to be the only other option.
Given this implementation, the affine loop fusion pass is not able to fuse this loop, which contains an scf.if, into the other affine loops that produce the input arrays in1 or in2. Before I spend time trying to add support for this to the pass, I wanted to ask whether this is a fundamental limitation of the affine loop fusion pass, since it seems to require reasoning about dialects other than affine. I could see that as an argument, but on the other hand, the way the scf.if is used here is easy to reason about: centered accesses, affine accesses in either branch, etc.
When I apply the affine-loop-fusion pass to this example, even the first two for loops don’t get fused. However, when I remove the if from the final loop, the pass succeeds in fusing all 3 loops together.
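To make the affine.if versus scf.if distinction concrete, here is a minimal sketch of the kind of pattern I mean. It is not the exact code from above: the function name, the memref shapes, the #first_ten set, and the comparison against zero are all made up for illustration.

```mlir
// Hypothetical integer set: contains d0 when 0 <= d0 <= 9.
#first_ten = affine_set<(d0) : (d0 >= 0, -d0 + 9 >= 0)>

func.func @sketch(%src: memref<16xf32>, %in2: memref<16xf32>,
                  %out: memref<16xf32>) {
  %zero = arith.constant 0.0 : f32
  %in1 = memref.alloc() : memref<16xf32>

  // Producer loop: fills %in1. affine.if can only test the index %i
  // for membership in an integer set, not a loaded value.
  affine.for %i = 0 to 16 {
    affine.if #first_ten(%i) {
      %v = affine.load %src[%i] : memref<16xf32>
      affine.store %v, %in1[%i] : memref<16xf32>
    } else {
      affine.store %zero, %in1[%i] : memref<16xf32>
    }
  }

  // Consumer loop: the branch depends on loaded data, so the condition
  // is an i1 value and scf.if is used. Both branches still contain only
  // affine accesses at the same index %i.
  affine.for %i = 0 to 16 {
    %a = affine.load %in1[%i] : memref<16xf32>
    %b = affine.load %in2[%i] : memref<16xf32>
    %cond = arith.cmpf ogt, %a, %zero : f32
    scf.if %cond {
      affine.store %a, %out[%i] : memref<16xf32>
    } else {
      affine.store %b, %out[%i] : memref<16xf32>
    }
  }

  memref.dealloc %in1 : memref<16xf32>
  return
}
```

Saving something like this to a file (say sketch.mlir, a placeholder name) and running mlir-opt --affine-loop-fusion sketch.mlir should be enough to check whether the two loops get fused with and without the scf.if.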