This is an RFC for a proposed X86-specific optimization that reduces code size by choosing a shorter encoding for AVX-512 instructions when possible.
When the AVX512F instruction set was introduced on X86, it added 32 registers of 512 bits each, ZMM0-ZMM31, as well as 16 additional XMM registers (XMM16-XMM31) and 16 additional YMM registers (YMM16-YMM31).
To encode the new registers 16-31 and the additional instructions, a new encoding prefix called EVEX, which extends the existing VEX encoding, was introduced, as shown below:
The EVEX encoding format:
EVEX Opcode ModR/M [SIB] [Disp32] / [Disp8*N] [Immediate]
The existing VEX encoding format:
[VEX] OPCODE ModR/M [SIB] [DISP] [IMM]
Note that the EVEX prefix always takes 4 bytes, whereas the VEX prefix takes either 2 or 3 bytes.
Consequently, on the SKX architecture, many instructions that use only the lower registers XMM0-XMM15 or YMM0-YMM15 can be encoded in either the EVEX or the VEX format. In such cases, using the VEX encoding reduces code size by ~2 bytes per instruction, even when the code is compiled with the AVX512F/AVX512VL features enabled.
For example, "vmovss %xmm0, 32(%rsp,%rax,4)" has the following two possible encodings:
EVEX encoding (8 bytes long):
62 f1 7e 08 11 44 84 08 vmovss %xmm0, 32(%rsp,%rax,4)
VEX encoding (6 bytes long):
c5 fa 11 44 84 20 vmovss %xmm0, 32(%rsp,%rax,4)
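Note that the final byte differs between the two encodings (08 vs. 20): EVEX stores an 8-bit displacement in compressed form (the Disp8*N field shown in the format above), while VEX stores it as is. For this scalar 32-bit store N is 4, so the displacement 32 (0x20) is stored as 8. A minimal sketch of that arithmetic follows; the function name compressDisp8 is made up for illustration and is not an existing LLVM API.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical helper (not an LLVM function): returns the byte an EVEX
// encoding would store for `disp` under Disp8*N compression, or
// std::nullopt when the displacement is not a multiple of N or the
// scaled value does not fit in a signed byte. In that case an encoder
// would have to fall back to a 32-bit displacement.
std::optional<int8_t> compressDisp8(int32_t disp, int32_t n) {
  assert(n > 0 && "scale factor N must be positive");
  if (disp % n != 0)
    return std::nullopt;
  const int32_t scaled = disp / n;
  if (scaled < -128 || scaled > 127)
    return std::nullopt;
  return static_cast<int8_t>(scaled);
}
```

With disp = 32 and n = 4 this yields 8, matching the trailing 08 byte of the EVEX encoding above.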
See reported Bugzilla bugs about this proposed optimization:
The proposed implementation adds a table of all EVEX opcodes that can be encoded via VEX, in a new header file placed under lib/Target/X86.
A new pass will be added at the pre-emit stage.
No special optimization flags are needed, as it is always better to use the shorter VEX encoding when possible.
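To make the idea concrete, here is a rough sketch of how the table and the per-instruction check could fit together. It is only an illustration under several assumptions: the opcode names and numbers, the Instr struct, and the functions lookupVexOpcode and tryCompressToVex are all placeholders, not the actual backend types. The check for EVEX-only features (opmask, broadcast, embedded rounding) is an extra condition assumed here on top of the register-range condition described above.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <iterator>
#include <optional>
#include <vector>

// Illustrative opcode numbers only; a real table would use the backend's
// own opcode enum.
enum Opcode : uint16_t {
  VADDPSrr = 10,
  VMOVSSmr = 20,
  VADDPSZ128rr = 110,
  VMOVSSZmr = 120,
  VADDPSZrr = 130 // 512-bit form: no VEX equivalent, so not in the table
};

struct EvexToVexEntry {
  uint16_t evex; // EVEX-encoded opcode
  uint16_t vex;  // equivalent VEX-encoded opcode
};

// Must stay sorted by the EVEX opcode so it can be binary-searched.
static const EvexToVexEntry kEvexToVexTable[] = {
    {VADDPSZ128rr, VADDPSrr},
    {VMOVSSZmr, VMOVSSmr},
};

// Returns the VEX opcode for an EVEX opcode, or std::nullopt if none exists.
std::optional<uint16_t> lookupVexOpcode(uint16_t evexOpcode) {
  const auto *first = std::begin(kEvexToVexTable);
  const auto *last = std::end(kEvexToVexTable);
  assert(std::is_sorted(first, last,
                        [](const EvexToVexEntry &a, const EvexToVexEntry &b) {
                          return a.evex < b.evex;
                        }));
  const auto *it = std::lower_bound(
      first, last, evexOpcode,
      [](const EvexToVexEntry &e, uint16_t key) { return e.evex < key; });
  if (it != last && it->evex == evexOpcode)
    return it->vex;
  return std::nullopt;
}

// Hypothetical, simplified stand-in for a machine instruction.
struct Instr {
  uint16_t opcode;
  std::vector<unsigned> vectorRegs; // register indices, 0..31
  bool usesEvexOnlyFeature;         // opmask, broadcast, embedded rounding
};

// Rewrites the opcode in place when a VEX form is usable; returns true if
// the instruction was changed.
bool tryCompressToVex(Instr &mi) {
  if (mi.usesEvexOnlyFeature)
    return false;
  for (unsigned reg : mi.vectorRegs)
    if (reg > 15) // registers 16-31 can only be encoded with EVEX
      return false;
  const std::optional<uint16_t> vex = lookupVexOpcode(mi.opcode);
  if (!vex)
    return false;
  mi.opcode = *vex;
  return true;
}
```

In this sketch the pass would simply call tryCompressToVex on every instruction of a function. A sorted array with binary search keeps the table compact and read-only; other lookup structures would work equally well.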
Thank you for any comments or questions that you may have.