In the vectorization phase, we often select a specific VF based on instruction cost, so it is not obvious from the final assembly code which VF was chosen. This is especially true for template functions compiled for SVE (because all element types use the same z registers on AArch64), and the source line numbers are also duplicated across the different specializations of the function.
For example, the following two functions are instantiated from the same template function eval and mix the types float and double in the kernel loop body. It is difficult to infer the VF directly from the assembly code without carefully examining what that assembly actually does.
eval<0,0,1,float,double>(...)
eval<0,1,1,float,double>(...)
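To make the situation concrete, here is a minimal sketch of what such a kernel might look like. The real eval is not shown in this post, so the signature, the meaning of the three integer flags, and the loop body below are all hypothetical and only illustrate the shape of the problem:

```cpp
#include <cstddef>

// Hypothetical stand-in for the real eval template. The flag names and the
// loop body are assumptions made for illustration only.
template <int UseBias, int UseScale, int Accumulate, typename In, typename Acc>
void eval(const In *x, Acc *y, std::size_t n, Acc scale, Acc bias) {
  for (std::size_t i = 0; i < n; ++i) {
    // Mixed element widths: In (float, 32-bit) is widened to Acc (double, 64-bit).
    Acc v = static_cast<Acc>(x[i]);
    if (UseScale) v *= scale;
    if (UseBias) v += bias;
    if (Accumulate) y[i] += v; else y[i] = v;
  }
}

// Two specializations that share exactly the same source lines.
template void eval<0, 0, 1, float, double>(const float *, double *,
                                           std::size_t, double, double);
template void eval<0, 1, 1, float, double>(const float *, double *,
                                           std::size_t, double, double);
```

Both instantiations map back to the same source lines, and because the loop handles 32-bit and 64-bit elements together, the register names in the SVE output alone do not tell me which element count per vector the cost model settled on.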
So would it be appropriate to add a comment to the final assembly output indicating the VF that was selected for the current loop?