A bit of background: When we designed One-Shot Bufferize, we had two design options:

- Analyze tensor IR and insert buffer copies only when needed.
- Insert copies on every write (without analyzing anything). Then run a memref analysis to remove copies again.

We went with the first option. I have no good answer as to which design is better. My gut feeling says the first option is simpler because we can utilize SSA use-def chains for the analysis and implement special rules to bufferize certain tensor ops efficiently without difficult analyses (e.g., range analyses). Also, a tensor-based analysis fits better with destination-style, which we’ve already been utilizing in other components (e.g., tiling).

The bufferization analysis is driven by `BufferizableOpInterface`. There are two methods that model the flow of data through the program: `getAliasingOpOperands` and `getAliasingOpResults`. These are properties of `tensor` operands.

E.g.:

```
// getAliasingOpOperands(%r) = {%t}
// getAliasingOpResults(%t) = {%r}
%r = tensor.insert %cst into %t[%idx] : tensor<?xf32>
```

This tells us that if `%t` bufferizes in-place, `buffer(%r) == buffer(%t)`. The bufferization analysis maintains alias sets based on this information.
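As a rough illustration of how such alias sets can be maintained, here is a toy union-find model in Python. It is not the MLIR implementation: the `AliasSets` class, `bufferize_in_place`, and the value `%other` are made up for this sketch, and SSA values are represented as plain strings.

```python
class AliasSets:
    """Toy union-find over SSA value names (hypothetical, not MLIR's data structure)."""

    def __init__(self):
        self.parent = {}

    def find(self, value):
        # Register unseen values as singleton sets.
        self.parent.setdefault(value, value)
        # Walk up to the representative, halving the path along the way.
        while self.parent[value] != value:
            self.parent[value] = self.parent[self.parent[value]]
            value = self.parent[value]
        return value

    def union(self, a, b):
        self.parent[self.find(a)] = self.find(b)

    def may_alias(self, a, b):
        return self.find(a) == self.find(b)


def bufferize_in_place(sets, operand, aliasing_results):
    # If `operand` bufferizes in-place, every aliasing result shares its buffer.
    for result in aliasing_results:
        sets.union(result, operand)


sets = AliasSets()
# %r = tensor.insert %cst into %t[%idx], with getAliasingOpResults(%t) = {%r}
bufferize_in_place(sets, "%t", ["%r"])
print(sets.may_alias("%r", "%t"))      # True
print(sets.may_alias("%r", "%other"))  # False
```

The union only happens once the analysis has decided that `%t` bufferizes in-place; an out-of-place decision would leave `%r` in its own set.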

Operations that are not tensor-based do not implement the `BufferizableOpInterface`, so there’s no `getAliasingOpOperands/Results` property that could be queried. Instead, we have `bufferization.to_memref` at the boundary, for which the `BufferizableOpInterface` could be queried.

E.g.:

```
%m = bufferization.to_memref %t : memref<?xf32>
// Do something with %m
```

Our analysis stops at `bufferization.to_memref`. We don’t know what’s happening to `%m`. In particular, we don’t know if some op is going to read from `%m` and/or write to `%m`. So we have to be conservative and assume that the answer is “yes”, which can lead to unnecessary buffer copies.
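To make the cost of that conservative assumption concrete, here is a small Python sketch. It is a heavily simplified stand-in for the real conflict detection: the `Use` record, `assume_yes`, and `needs_copy` are hypothetical names, and the rule "a possibly-writing use needs a copy if the old value may be read later" is an assumption made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Use:
    """One use of a tensor value. None means the analysis cannot tell."""
    op: str
    reads: Optional[bool]
    writes: Optional[bool]


def assume_yes(flag: Optional[bool]) -> bool:
    # Conservative default: an unknown effect is treated as happening.
    return True if flag is None else flag


def needs_copy(use: Use, later_uses: list) -> bool:
    # Simplified rule: a use that may write in place needs a copy if the
    # original tensor value may still be read by a later use.
    if not assume_yes(use.writes):
        return False
    return any(assume_yes(u.reads) for u in later_uses)


# %m = bufferization.to_memref %t: effects on %m are unknown to the analysis.
unknown = Use("bufferization.to_memref", reads=None, writes=None)
# Same op, but pretend we somehow knew that nothing ever writes to %m.
read_only = Use("bufferization.to_memref", reads=True, writes=False)
# A later tensor op that reads the original value of %t.
later = [Use("tensor.extract", reads=True, writes=False)]

print(needs_copy(unknown, later))    # True: a copy is inserted, possibly for nothing
print(needs_copy(read_only, later))  # False
```

The second call shows what is lost: if the memref side never writes, the copy is unnecessary, but a tensor-only analysis cannot prove that.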

Then there’s `bufferization.to_tensor` on the other end.

E.g.:

```
// ...
%t = bufferization.to_tensor %m : memref<?xf32>
```

Our analysis does not know where `%t` is coming from. But it has to implement `BufferizableOpInterface::getAliasingOpOperands`. Usually, we would look up the alias set of the OpOperand and maybe union the sets of `%t` and `%m`. But that doesn’t work because `%m` is a memref. So we have to be conservative and assume that, after bufferization, `buffer(%t)` may alias with any other SSA value whose definition dominates the `bufferization.to_tensor` op.

(We don’t do this at the moment. Instead, we assert that there’s no `to_tensor`/`to_memref` in the program.)
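The conservative rule for `bufferization.to_tensor` described above (which, as noted, is not what happens today) could look roughly like this in a toy Python model. The function `conservative_aliases` and the block encoding are hypothetical, and dominance is approximated by program order within a single straight-line block.

```python
def conservative_aliases(block, index):
    """Values that buffer(result) may alias under the conservative rule.

    `block` is a list of (result_name, op_name) pairs in program order. Within
    one straight-line block, every earlier definition dominates later ops.
    """
    result, op_name = block[index]
    if op_name != "bufferization.to_tensor":
        raise ValueError(f"{result} is not defined by bufferization.to_tensor")
    return {name for name, _ in block[:index]}


block = [
    ("%a", "tensor.empty"),
    ("%m", "memref.alloc"),
    ("%t", "bufferization.to_tensor"),
    ("%b", "tensor.empty"),  # defined after %t, so it does not dominate it
]
print(sorted(conservative_aliases(block, 2)))  # ['%a', '%m']
```

Even in this tiny block the result may alias everything defined before it, which gives a sense of how pessimistic the rule would be in a larger program.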