Correct. A `linalg.generic` with a broadcasting affine map and n ops would get raised to `linalg.broadcast` + `linalg.op[n]`, which should then be lowered back to the same generic. But it won't. It'll be lowered to n+1 generics.
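To make the counting concrete, here is a toy Python model of that round trip. It is only a sketch: the `Generic` and `Named` classes and the `raise_to_named` / `lower_to_generic` helpers are hypothetical stand-ins, not MLIR APIs, and the lowering is assumed to map each named op to its own generic with no fusion.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Generic:
    """Toy stand-in for a linalg.generic (hypothetical, not an MLIR API)."""
    broadcast: bool          # True if one indexing map broadcasts an input
    body: Tuple[str, ...]    # scalar ops in the region, e.g. ("add", "mul")


@dataclass(frozen=True)
class Named:
    """Toy stand-in for a named op such as linalg.broadcast or linalg.add."""
    name: str


def raise_to_named(g: Generic) -> List[Named]:
    # One broadcast (if any) followed by one named op per scalar op in the body.
    ops = [Named("broadcast")] if g.broadcast else []
    return ops + [Named(op) for op in g.body]


def lower_to_generic(ops: List[Named]) -> List[Generic]:
    # Assumption: each named op lowers to its own generic; nothing is fused.
    return [
        Generic(broadcast=(op.name == "broadcast"),
                body=() if op.name == "broadcast" else (op.name,))
        for op in ops
    ]


# A single generic with a broadcasting map and n = 3 scalar ops.
original = Generic(broadcast=True, body=("add", "mul", "sub"))
round_tripped = lower_to_generic(raise_to_named(original))

print(len(round_tripped))           # number of generics after the round trip
print(round_tripped == [original])  # whether the round trip is the identity
```

In this model the round trip is not the identity: one generic goes in and n+1 come out, unless something fuses them back together.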
This is the main reason why we did not follow this path. Finding a canonical representation of `bcast` + `add` is not hard, but the general case might be intractable, especially as we start having non-perfectly-nested loops with named ops inside.
This is achieved by round-trip tests. I imagine the results must be identical in your cases (1:1 mapping). That may not be the case for some more complex patterns.
I postulate that it could become identical after a carefully selected series of conversion + canonicalization steps (which could, for example, fuse the broadcast into the element-wise op).
The hard part is to find such a sequence of transforms that always yields a fixed point after N iterations for the general case.
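As a rough illustration of what "fixed point after N iterations" means, the sketch below runs a toy fusion step until nothing changes. Everything here is an assumption made for illustration: a generic is modelled as a `(has_broadcast, body)` tuple, the ops are assumed to form a linear producer/consumer chain, and `fuse_once` and `run_to_fixed_point` are hypothetical helpers, not MLIR passes.

```python
from typing import Callable, List, Tuple

# Toy model: (has a broadcasting map, scalar ops in the body).
Generic = Tuple[bool, Tuple[str, ...]]


def fuse_once(ops: List[Generic]) -> List[Generic]:
    """Fuse the first two generics of a linear chain, if there are two."""
    if len(ops) < 2:
        return ops
    (b0, body0), (b1, body1) = ops[0], ops[1]
    return [(b0 or b1, body0 + body1)] + ops[2:]


def run_to_fixed_point(
    ops: List[Generic],
    step: Callable[[List[Generic]], List[Generic]],
    max_iters: int,
) -> Tuple[List[Generic], int]:
    """Apply `step` until the op list stops changing, or give up."""
    for i in range(max_iters):
        nxt = step(ops)
        if nxt == ops:
            return ops, i  # i = number of steps that changed something
        ops = nxt
    raise RuntimeError(f"no fixed point within {max_iters} iterations")


# The n+1 generics produced by lowering broadcast + add + mul separately.
unfused = [(True, ()), (False, ("add",)), (False, ("mul",))]
result, iters = run_to_fixed_point(unfused, fuse_once, max_iters=10)
print(result, iters)
```

This toy chain trivially converges back to a single generic. The open question above is whether a sequence of real transforms with that property exists for arbitrary inputs, which this sketch does not address.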