SGTM. I’ve always felt this was weird. And it seems to be a totally systematic extension of the current behavior.
On the topic of canonicalizing vector<T> to T, it feels wrong to me. At the hardware level, changing vector<T> to T can result in an expensive scalar readback operation (analogous to changing tensor<T> to T, but at a different hierarchy level). I suspect it should be done based on a cost model similar to what we do (or plan to do) for Detensorize.
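To make the cost-model idea a bit more concrete, here is a rough sketch of the kind of decision I have in mind. Everything in it is hypothetical: the `Candidate` type, the `should_scalarize` helper, and the cost constants are made up for illustration and do not correspond to any existing MLIR API or to how Detensorize is actually implemented.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """Hypothetical single-element vector value considered for scalarization."""
    scalar_uses: int  # users that need the value as a scalar (each pays a readback today)
    vector_uses: int  # users that need the value in a vector register


def should_scalarize(c: Candidate,
                     readback_cost: float = 10.0,
                     broadcast_cost: float = 1.0) -> bool:
    """Return True only if rewriting vector<T> to T looks strictly cheaper.

    The cost constants are assumed placeholders; a real model would be
    target-specific.
    """
    # Keeping the vector form: every scalar user performs its own readback.
    keep_vector = c.scalar_uses * readback_cost
    # Scalarizing: one readback up front, then each vector user re-broadcasts.
    scalarize = readback_cost + c.vector_uses * broadcast_cost
    return scalarize < keep_vector


# A value consumed only by vector ops should stay a vector.
print(should_scalarize(Candidate(scalar_uses=0, vector_uses=2)))
# A value read back by several scalar users is a better candidate.
print(should_scalarize(Candidate(scalar_uses=3, vector_uses=0)))
```

The point is only that the rewrite becomes conditional on how the value is used, rather than being an unconditional canonicalization.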