Vector Dialect Ops for Intel Intrinsics - Optimizing linalg.copy

Please also see the earlier docs on the vector dialect, which discuss keeping it as the bridge between the “virtual vector level” and the “hardware vector level” through progressive lowering. As stated above, we want to avoid tying the vector dialect itself too closely to the hardware too soon.