Plans to add x86-64-v5 target?

I’ve been using -march=x86-64-v{2-4} for a project to get code generated roughly according to the SSE/AVX2/AVX512 developments in x86 CPUs. However, -v4 is a bit “behind” now, as it was introduced to cover the AVX512 set of Skylake CPUs. We are now at least 3 to 4 Intel CPU generations down the line, and there are quite a few new extensions since then, as shown in this image. Comparing -v4 to znver4 (latest AMD CPUs), there is quite a difference. So to target the latest Intel/AMD CPUs, I think it would make sense to add -v5 that covers Zen4 and icelake-server (they seem to have the same set of extensions).

Does this make sense? If yes, are my suggested targets good ones or does it make more sense to use different attributes?

EDIT: I think this probably also makes sense as an extra target before the entire AVX10 stuff becomes mainstream. Then there is a “complete” AVX512 target and then maybe -v6 starts with AVX10.

I’m not involved in defining these levels, but the announcement of AVX10 significantly complicates these microarch level definitions. When the levels were initially defined, it was assumed that new CPU generations would always support the instructions of previous CPU generations. So level N would enable a superset of the features of level N-1, and software built for level N-1 would run on CPUs supporting level N.
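
To make the superset assumption concrete, here is a minimal sketch that models each level as the cumulative set of features it requires. The feature lists follow the x86-64 psABI level definitions as I understand them; the names `LEVEL_ADDITIONS`, `cumulative_levels`, and `can_run` are made up for illustration and do not come from any compiler.

```python
# Features each level adds on top of the previous one (per the x86-64 psABI
# microarchitecture levels). Names are lowercase labels chosen for this sketch.
LEVEL_ADDITIONS = {
    "x86-64": {"cmov", "cx8", "fpu", "fxsr", "mmx", "osfxsr", "sce", "sse", "sse2"},
    "x86-64-v2": {"cmpxchg16b", "lahf-sahf", "popcnt", "sse3", "sse4.1", "sse4.2", "ssse3"},
    "x86-64-v3": {"avx", "avx2", "bmi1", "bmi2", "f16c", "fma", "lzcnt", "movbe", "osxsave"},
    "x86-64-v4": {"avx512f", "avx512bw", "avx512cd", "avx512dq", "avx512vl"},
}


def cumulative_levels(additions):
    """Build the full feature set of each level by accumulating additions in order."""
    levels = {}
    seen = set()
    for name, extra in additions.items():
        seen = seen | extra
        levels[name] = frozenset(seen)
    return levels


LEVELS = cumulative_levels(LEVEL_ADDITIONS)


def can_run(binary_level, cpu_level):
    """A binary built for binary_level runs on a CPU at cpu_level if the CPU
    provides every feature the binary may use (subset check)."""
    return LEVELS[binary_level] <= LEVELS[cpu_level]
```

With this model, `can_run("x86-64-v2", "x86-64-v4")` is true and `can_run("x86-64-v4", "x86-64-v3")` is false, which is exactly the linear ordering the levels were designed around.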

However, the baseline for Intel AVX10 CPUs only supports 256-bit vector registers (but with effectively all of the AVX512 instructions). Only high-end server CPUs will continue to provide 512-bit vector registers. So there is no longer any real way to have a linear progression from x86-64-v4.
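
A rough way to see why the chain breaks is to treat vector width as a second requirement next to the instruction set. The sketch below is a deliberate simplification: the `Target` type, the `runs_on` helper, and the two AVX10 CPU profiles are hypothetical and only encode the description above, not any published feature list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Target:
    name: str
    isa: frozenset      # instruction-set features required or provided
    vector_bits: int    # widest vector register required or provided


def runs_on(binary, cpu):
    """A binary runs only if the CPU has both the instructions and the register width."""
    return binary.isa <= cpu.isa and binary.vector_bits <= cpu.vector_bits


# Simplified feature labels, assumed for illustration only.
V3_ISA = frozenset({"avx", "avx2", "fma"})
AVX512_OPS = frozenset({"avx512f", "avx512bw", "avx512cd", "avx512dq", "avx512vl"})

V3 = Target("x86-64-v3", V3_ISA, 256)
V4 = Target("x86-64-v4", V3_ISA | AVX512_OPS, 512)

# Hypothetical CPU profiles: same instructions, different maximum width.
AVX10_256_CPU = Target("avx10-256 cpu", V3_ISA | AVX512_OPS, 256)
AVX10_512_CPU = Target("avx10-512 cpu", V3_ISA | AVX512_OPS, 512)
```

In this model `runs_on(V3, AVX10_256_CPU)` holds but `runs_on(V4, AVX10_256_CPU)` does not, even though the CPU is newer than anything v4 was defined against. A single "level number" cannot express that.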

Someone needs to decide how to approach this problem before any new levels can be defined. Adding a new level v5 which is a “dead-end” may not make sense. Such a decision may need to wait until we see what sort of hardware actually ends up being shipped and used by end users. Maybe it’ll turn out that most CPUs will support 512-bit vector registers after all, and AVX10-256 is effectively abandoned. I’ve no idea.

Yes, how to fit AVX10 256-bit cleanly into the x86-64-vN scheme is a real headache. We have had some internal discussions and haven’t reached a consensus. Suggestions are welcome!

Okay, if it is not clear how/if the -v* series will be continued at all, then this probably does not make sense at this point in time. Thanks for your comments.

Would it make sense to either redefine x86-64-v4 as using 256-bit registers only, or to add a new target x86-64-v4-256?

Either option would allow x86-64-v5 as a stepping stone (either being 256-bit itself or accompanied by x86-64-v5-256) for Zen 4/icelake-server level features, and something like x86-64-v6 could eventually introduce AVX10/256 and AVX10/512 support.

Alternatively, we could redefine x86-64-v4 to use 256-bit registers only and accompany it with x86-64-v4-512, which would keep a clean superset of features. The drawback is that old compiler versions would need patches, either to restrict x86-64-v4 to XMM and YMM registers, or to add a warning that v4 may not be compatible with all future CPUs and that v3 or a newer compiler version is therefore recommended.