Intel Advanced Performance Extensions:
The main features of Intel® APX include:
- 16 additional general-purpose registers (GPRs) R16–R31, also referred to as Extended GPRs (EGPRs) in this document;
- Three-operand instruction formats with a new data destination (NDD) register for many integer instructions;
- Conditional ISA improvements: new conditional load, store, and compare instructions, combined with an option for the compiler to suppress the status-flag writes of common instructions;
- Optimized register state save/restore operations;
- A new 64-bit absolute direct jump instruction.
We will focus on the sub-features EGPR and NDD in this thread.
EGPR
Not all X86 instructions are extended for EGPR. The following is an overview of which instructions are extended and how we are going to implement them.
• Legacy space:
All instructions in legacy maps 0 and 1 that have explicit GPR or memory operands can use the REX2 prefix to access the EGPR, except XSAVE*/XRSTOR.
• EVEX space:
All instructions in the EVEX space can access the EGPR in their register/memory operands.
For the above instructions, we do not add new entries in TD. Instead, we extend GPR with R16-R31 and make them allocatable only when the EGPR feature is available, just as we did when R8-R15 were introduced.
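A minimal sketch of the "allocatable only when the feature is available" idea, written as standalone C++ rather than actual LLVM code. The function name `allocatableGPRMask` and the bitmask model are hypothetical, and the sketch ignores registers that a real backend always reserves (such as the stack pointer):

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model: bit i of the returned mask stands for GPR number i (0..31).
// This is an illustration only, not how the LLVM X86 backend stores this data.
std::uint32_t allocatableGPRMask(bool Is64Bit, bool HasEGPR) {
  // Assumption: EGPR is only meaningful in 64-bit mode.
  assert((Is64Bit || !HasEGPR) && "EGPR is assumed to require 64-bit mode");

  std::uint32_t Mask = 0x000000FFu; // the 8 base GPRs
  if (Is64Bit)
    Mask |= 0x0000FF00u;            // R8-R15
  if (HasEGPR)
    Mask |= 0xFFFF0000u;            // R16-R31 (EGPRs)
  return Mask;
}
```

With this model, the register definitions stay the same for every subtarget and only the mask changes with the feature set.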
In addition, some instructions in legacy space with map 2/3 and in VEX space are promoted into EVEX space. The opcode and opcode map may change after the promotion. For these instructions, we add new entries in TD to avoid overcomplicating the assembler and disassembler.
For those instructions that cannot access EGPR, we introduce new register classes GR8/16/32/64_NOREX2. We do not update the register class in each instruction entry because that would affect some optimization passes, such as machine instruction scheduling, whose analysis relies on the static type of operands in TD. Instead, we leverage the target hook TargetInstrInfo::getRegClass to distinguish the instructions by the rules mentioned above.
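The rules above can be sketched as a small standalone C++ model. This is not the LLVM implementation: `InstrDesc`, `canUseEGPR`, and `effectiveGR64Class` are hypothetical names, and the sketch only covers 64-bit operands of instructions that have an explicit GPR or memory operand.

```cpp
#include <cassert>

// Hypothetical, simplified description of an instruction's encoding.
enum class EncodingSpace { Legacy, VEX, EVEX };

struct InstrDesc {
  EncodingSpace Space;
  unsigned OpcodeMap;   // legacy opcode map (0..3); not consulted for VEX/EVEX
  bool IsXSaveOrXRstor; // member of the XSAVE*/XRSTOR family
};

// Returns true if the instruction can encode R16-R31, following the rules
// stated in the post: legacy maps 0 and 1 via REX2 (except XSAVE*/XRSTOR),
// and everything in EVEX space.
bool canUseEGPR(const InstrDesc &D) {
  switch (D.Space) {
  case EncodingSpace::EVEX:
    return true;
  case EncodingSpace::VEX:
    return false;
  case EncodingSpace::Legacy:
    assert(D.OpcodeMap <= 3 && "legacy maps are numbered 0..3 in this model");
    return D.OpcodeMap <= 1 && !D.IsXSaveOrXRstor;
  }
  return false;
}

enum class RegClass { GR64, GR64_NOREX2 };

// Narrow the operand class at query time instead of editing every TD entry.
RegClass effectiveGR64Class(const InstrDesc &D) {
  return canUseEGPR(D) ? RegClass::GR64 : RegClass::GR64_NOREX2;
}
```

Keeping the decision in one query function mirrors the motivation given above: the static operand types stay untouched, and only the consumer that asks for a register class sees the narrowed one.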
The constraints of asm operands keep the same meaning as before, e.g. R16-R31 are not allocated when the ‘q’, ‘r’, or ‘l’ constraint is used.
All EGPRs are caller-saved registers, and we will add some new kinds of relocations and relocation optimizations for them. See the discussions at
https://groups.google.com/g/x86-64-abi/c/KbzaNHRB6QU
https://groups.google.com/g/x86-64-abi/c/Gy0RmoP2LnE
https://groups.google.com/g/x86-64-abi/c/saQyqBeL5XE
The support for EGPR in LLDB is almost on hold because we haven’t investigated it. Only the mapping to DWARF registers is added in the TD file. We would appreciate it if someone knowledgeable in this field volunteered to implement it.
NDD
APX extends some instructions with a new form that has an extra register operand called a new data destination (NDD). In such forms, NDD is the new destination register receiving the result of the computation and all other operands (including the original destination operand) become read-only source operands.
Compared to legacy instructions, NDD is friendlier to register allocation. We support NDD instructions in the same way we handled EVEX promotion for YMM16-YMM31: we prefer the NDD version over the legacy one during instruction selection, and compress it to the legacy instruction after register allocation if possible. We reuse the EvexToVexInstPass pass to do the compression and rename it to CompressEvexInstPass because the legacy instruction is not in VEX space.
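As a rough illustration of the "compress if possible" step, here is a standalone C++ sketch. It is not the code of the pass: `NDDInst`, `LegacyInst`, and `compressNDD` are hypothetical names. It assumes that compression is possible when the allocated destination coincides with the first source (the operand the legacy form overwrites), and, as a further assumption, when a commutative operation can swap its sources to reach that state.

```cpp
#include <cassert>

// Hypothetical model of a three-operand NDD instruction: Dst = Src1 op Src2.
// Registers are plain GPR numbers 0..31 after register allocation.
struct NDDInst {
  unsigned Dst, Src1, Src2;
  bool Commutative; // true if Src1 and Src2 may be swapped
};

// Hypothetical model of the legacy two-operand form: DstSrc = DstSrc op Src.
struct LegacyInst {
  unsigned DstSrc, Src;
};

// Tries to rewrite an NDD instruction into the legacy form.
// Returns true and fills Out on success; leaves Out untouched otherwise.
bool compressNDD(const NDDInst &I, LegacyInst &Out) {
  assert(I.Dst < 32 && I.Src1 < 32 && I.Src2 < 32 && "GPR numbers are 0..31");

  if (I.Dst == I.Src1) {
    // The destination already aliases the operand the legacy form overwrites.
    Out = LegacyInst{I.Dst, I.Src2};
    return true;
  }
  if (I.Commutative && I.Dst == I.Src2) {
    // Swap the sources so the destination aliases the first one.
    Out = LegacyInst{I.Dst, I.Src1};
    return true;
  }
  return false; // three distinct roles: the NDD form must be kept
}
```

Selecting the NDD form first leaves the register allocator free to pick any destination, and the shorter legacy encoding is recovered only in the cases where the allocator happened to reuse a source register.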