I’m working on some experiments and I’d like to split the tiling into two separate ops. Basically, I want to apply one transform between the tiling of the first and the second loop. For example, say we have:

```mlir
%1, %2, %3 = transform.structured.tile %0[32, 16] : (!pdl.operation) -> (!pdl.operation, !pdl.operation, !pdl.operation)
```

and I want to split it into two, something like:

```mlir
%1, %2 = transform.structured.tile %0[32] : (!pdl.operation) -> (!pdl.operation, !pdl.operation)
// the transform I want to apply would go here
%3, %4 = transform.structured.tile %1[16] : (!pdl.operation) -> (!pdl.operation, !pdl.operation)
```

However, the output I get is different, which means that tiling the already-tiled op is not equivalent to tiling once with two tile sizes. I don’t understand why it is not equivalent, since conceptually it seems like it should be. Is there any way to split the TileOp into two and still get the same, equivalent code?
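To make the comparison concrete, here is a plain-Python sketch (my own illustration, not generated MLIR) of the loop nests the two schedules would produce, under the assumption that tile sizes apply positionally per dimension; `M`, `N`, and the bounds are arbitrary illustration values:

```python
# Conceptual sketch of the two tiling schedules; M and N are made-up bounds.
M, N = 64, 48

# One tiling step with sizes [32, 16]: dim 0 tiled by 32, dim 1 by 16.
once = [(i, j) for i in range(0, M, 32) for j in range(0, N, 16)]

# Two tiling steps. The first, [32], tiles dim 0. If the second single-size
# op, [16], is also interpreted positionally, it targets dim 0 of the
# already-tiled op (the 32-wide tile) rather than dim 1, leaving dim 1
# untiled -- which would explain why the outputs differ.
twice = [(i + ii, j) for i in range(0, M, 32)
                     for ii in range(0, 32, 16)
                     for j in range(0, N)]  # dim 1 untiled (step 1)

print(len(once), len(twice))  # the iteration structures clearly differ
```

If that positional reading is right, the second tile op would need a leading zero size (i.e. `[0, 16]`) to skip the already-tiled dimension and target the second one instead; I haven’t confirmed this is the intended fix.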

**Complete Example**

For the sake of completeness, here is a complete example using one TileOp:

```mlir
#map0 = affine_map<(d0, d1) -> (d0, d1)>
module {
  func.func @gemm(%arg0: tensor<?x?xf32>, %arg1: tensor<?x?xf32>, %init: tensor<?x?xf32>) -> tensor<?x?xf32> {
    %0 = linalg.generic {indexing_maps = [#map0, #map0, #map0], iterator_types = ["parallel", "parallel"]} ins(%arg0, %arg1 : tensor<?x?xf32>, tensor<?x?xf32>) outs(%init : tensor<?x?xf32>) {
    ^bb0(%arg6: f32, %arg7: f32, %arg8: f32):
      %1 = arith.mulf %arg6, %arg7 : f32
      linalg.yield %1 : f32
    } -> tensor<?x?xf32>
    return %0 : tensor<?x?xf32>
  }
  transform.sequence failures(propagate) {
  ^bb0(%arg0: !pdl.operation):
    %0 = transform.structured.match ops{["linalg.generic"]} attributes {iterator_types = [#linalg.iterator_type<parallel>, #linalg.iterator_type<parallel>]} in %arg0 : (!pdl.operation) -> !pdl.operation
    %tiled_linalg_op, %loops:2 = transform.structured.tile %0[32, 16] : (!pdl.operation) -> (!pdl.operation, !pdl.operation, !pdl.operation)
  }
}
```

and using two separate ops:

```mlir
#map0 = affine_map<(d0, d1) -> (d0, d1)>
module {
  func.func @gemm(%arg0: tensor<?x?xf32>, %arg1: tensor<?x?xf32>, %init: tensor<?x?xf32>) -> tensor<?x?xf32> {
    %0 = linalg.generic {indexing_maps = [#map0, #map0, #map0], iterator_types = ["parallel", "parallel"]} ins(%arg0, %arg1 : tensor<?x?xf32>, tensor<?x?xf32>) outs(%init : tensor<?x?xf32>) {
    ^bb0(%arg6: f32, %arg7: f32, %arg8: f32):
      %1 = arith.mulf %arg6, %arg7 : f32
      linalg.yield %1 : f32
    } -> tensor<?x?xf32>
    return %0 : tensor<?x?xf32>
  }
  transform.sequence failures(propagate) {
  ^bb0(%arg0: !pdl.operation):
    %0 = transform.structured.match ops{["linalg.generic"]} attributes {iterator_types = [#linalg.iterator_type<parallel>, #linalg.iterator_type<parallel>]} in %arg0 : (!pdl.operation) -> !pdl.operation
    %tiled_linalg_op, %loops = transform.structured.tile %0[32] : (!pdl.operation) -> (!pdl.operation, !pdl.operation)
    %tiled_linalg_op2, %loops2 = transform.structured.tile %tiled_linalg_op[16] : (!pdl.operation) -> (!pdl.operation, !pdl.operation)
  }
}
```