Purpose of various register classes in X86 target

Hello everyone,

I noticed that several register classes are defined in the X86 target and many of them overlap. Is there a list of all X86 register classes documented somewhere? I found many listed in X86GenRegisterInfo.inc (generated by TableGen) but am unsure whether that is the complete list. Also, is there documentation on the role and purpose of these classes and how the X86 backend decides which class to choose when generating machine code? The comments in X86RegisterInfo.td didn’t help me fully understand them, and I wasn’t sure where to start reading the register allocator source code to understand how it uses the register classes.

It's not that hard in principle:
- A register class is a set of registers.
- Virtual Registers have a register class assigned.
- If you have register constraints (like x86 8-bit operations only working on al, ah, etc.) then you have to create a new register class to express that. (The only exception is a constraint to a single register, which we express by assigning the physreg directly instead of using a vreg.)
- Tablegen may create more register classes for register coalescing, where we want to accommodate the constraints of multiple instructions at the same time.
- All the information is in the .td file; you just have to put some effort into learning tablegen, as the information is often expressed by using functions (e.g. add/sub/rotate/etc.) instead of just writing a table/list of registers.
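
The points above can be modeled in a few lines. This is a toy C++ sketch, not LLVM code: the `Reg` enum, the `RegClass` alias, the `VReg` struct and the helpers `add`, `sub` and `canAssign` are all invented for illustration, and only eight 32-bit registers are modeled. The class names are borrowed from the X86 backend, but their contents here are simplified assumptions.

```cpp
#include <bitset>
#include <cassert>
#include <initializer_list>

// Toy model of eight 32-bit GPRs. Not LLVM's real data structures.
enum Reg { EAX, ECX, EDX, EBX, ESP, EBP, ESI, EDI, NumRegs };

// A register class is just a set of registers.
using RegClass = std::bitset<NumRegs>;

// Build a class from a list of registers, loosely like (add ...) in a .td file.
RegClass add(std::initializer_list<Reg> regs) {
  RegClass rc;
  for (Reg r : regs)
    rc.set(r);
  return rc;
}

// Remove registers from a class, loosely like (sub ...) in a .td file.
RegClass sub(RegClass base, std::initializer_list<Reg> regs) {
  for (Reg r : regs)
    base.reset(r);
  return base;
}

// A virtual register carries the class it must be allocated from.
struct VReg {
  unsigned id;
  RegClass rc;
};

// True if physical register r is a legal assignment for v.
bool canAssign(const VReg &v, Reg r) {
  assert(r < NumRegs && "not a physical register in this model");
  return v.rc.test(r);
}

const RegClass GR32 = add({EAX, ECX, EDX, EBX, ESP, EBP, ESI, EDI});
// Constrained class: in 32-bit mode only these four registers have
// 8-bit subregisters (al/ah, cl/ch, dl/dh, bl/bh).
const RegClass GR32_ABCD = add({EAX, ECX, EDX, EBX});
// The same set written as a subtraction from the full class.
const RegClass GR32_NOSP = sub(GR32, {ESP});
```

With this model, a vreg declared as `VReg{0, GR32_ABCD}` can be assigned EAX but not ESI, which is the kind of constraint a register class expresses.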

- Matthias

Hello Matthias,

Thanks a lot for the response! The TableGen language is fairly
straightforward (at least the commands used in the X86 .td file). However, some of
the comments about the classes didn't fully make sense: I suppose the
constraints are probably derived from the X86 assembly language and I
should look there? In addition, some classes don't have any comments (e.g.
GR64_TCW64) and I couldn't find much info elsewhere.

Also, I don't know how the TableGen'erated classes are derived. Would
reading TableGen's code help understand why those were generated and what
constraints they encode? Or should I take another approach?

The “main” X86 register classes are pretty much

GR8 - 8-bit general purpose registers
GR16 - 16-bit general purpose registers
GR32 - 32-bit general purpose registers
GR64 - 64-bit general purpose registers
VR64 - mmx registers
VR128 - xmm0-xmm15 (vex and legacy sse encodable registers)
VR256 - ymm0-ymm15 (vex encodable registers)
VR512 - zmm0-zmm31 (evex encodable registers)
VR128X - xmm0-xmm31 (evex encodable)
VR256X - ymm0-ymm31 (evex encodable)
FR32 - xmm0-xmm15 used for single precision floating point
FR32X - xmm0-xmm31 used for single precision floating point with evex encoding
FR64 - xmm0-xmm15 used for double precision floating point
FR64X - xmm0-xmm31 used for double precision floating point with evex encoding
VK* - mask registers
VK*WM - mask registers excluding k0 which can be used for write masking
RFP32 - single precision x87 floating point
RFP64 - double precision x87 floating point
RFP80 - extended precision x87 floating point
FR128 - Some 128-bit integer and floating point in xmm registers. Only partially supported since there is no hardware support for those types.
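
To make the overlap concrete, here is a small C++ sketch (an illustration only, not LLVM code) that records the xmm ranges from the list above and reports which classes contain a given xmm register. The `XmmClass` struct, the `XmmClasses` table and the `classesContaining` helper are hypothetical names; only the class names and ranges come from the list.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Toy model: a class is an inclusive range of xmm register numbers.
struct XmmClass {
  unsigned first;
  unsigned last;
  bool contains(unsigned n) const { return first <= n && n <= last; }
};

// Ranges copied from the list above; the table itself is illustrative.
const std::map<std::string, XmmClass> XmmClasses = {
    {"FR32", {0, 15}},  {"FR32X", {0, 31}},
    {"FR64", {0, 15}},  {"FR64X", {0, 31}},
    {"VR128", {0, 15}}, {"VR128X", {0, 31}},
};

// Names of all modeled classes that contain xmm<n>, in alphabetical order.
std::vector<std::string> classesContaining(unsigned n) {
  assert(n <= 31 && "model only covers xmm0-xmm31");
  std::vector<std::string> out;
  for (const auto &entry : XmmClasses)
    if (entry.second.contains(n))
      out.push_back(entry.first);
  return out;
}
```

In this model xmm3 is a member of all six classes, while xmm20 belongs only to the three X variants. The same physical register appears in several classes because each class stands for a different use (scalar float, scalar double, vector) or a different encoding.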

The following are used for specific instructions that take certain registers that aren’t in the normal classes, but those instructions are never selected and the registers are never allocated. The instructions are only usable through inline assembly.

DEBUG_REG
SEGMENT_REG
CONTROL_REG
BNDRReg

The RST class is used after the FP stackifier has converted RFP*.

CCR and FPCCR just contain the integer and floating point flag registers. These aren’t used for register allocation, but they are referred to by name in other places.

The others that contain subsets of the above exist for specific purposes. The best way to find their purpose if they don’t have a comment is to grep for them in the X86 directory. If you have specific questions about any of these I can try to help.

The GR64_TCW64 class doesn’t seem to be used in an instruction, but rather for modeling an ABI constraint. If you search the source code you will find the only user:

const TargetRegisterClass *
X86RegisterInfo::getGPRsForTailCall(const MachineFunction &MF) const {
  const Function *F = MF.getFunction();
  if (IsWin64 || (F && F->getCallingConv() == CallingConv::Win64))
    return &X86::GR64_TCW64RegClass;
  // ...
}

(And I can’t tell you much more about it; I don’t know the Windows ABIs.)

You probably have to read the tablegen source to see everything. I can attempt to explain typical cases:

  • Common subclasses: for example, when you need a vreg that works for both a GR32_NOAX and a GR32_NOSP operand at the same time, you need a class that contains GR32 without eax and without esp. That class is not defined in the tablegen file itself, but since someone could call TargetRegisterInfo::getCommonSubClass(GR32_NOAX, GR32_NOSP), tablegen will precompute it and give it a name like “GR32_NOAX_and_GR32_NOSP”.
  • Similarly for TargetRegisterInfo::getSubClassWithSubReg, which returns a subclass containing all registers that support a specific subregister index. Again, tablegen precomputes all possible combinations and gives each a name like “GR64_with_sub_8bitReg”.
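
The common-subclass case can be sketched as a set intersection. The C++ below is a toy model under stated assumptions, not what tablegen actually emits: `NamedClass`, `allExcept` and `commonSubClass` are invented names, only eight 32-bit registers are modeled, and the intersection is computed at run time, whereas tablegen precomputes these classes when LLVM is built.

```cpp
#include <bitset>
#include <cassert>
#include <string>
#include <vector>

// Toy model of eight 32-bit GPRs. Not LLVM's real data structures.
enum Reg { EAX, ECX, EDX, EBX, ESP, EBP, ESI, EDI, NumRegs };
using RegSet = std::bitset<NumRegs>;

struct NamedClass {
  std::string name;
  RegSet regs;
};

// All modeled registers except one, e.g. GR32 without eax.
RegSet allExcept(Reg excluded) {
  assert(excluded < NumRegs && "not a physical register in this model");
  RegSet s;
  s.set();           // start with every register
  s.reset(excluded); // then drop the excluded one
  return s;
}

// Intersect two classes. If the result equals an already known class,
// reuse it; otherwise synthesize a class named "A_and_B".
NamedClass commonSubClass(const NamedClass &a, const NamedClass &b,
                          const std::vector<NamedClass> &known) {
  RegSet both = a.regs & b.regs;
  for (const NamedClass &k : known)
    if (k.regs == both)
      return k;
  return {a.name + "_and_" + b.name, both};
}
```

Calling `commonSubClass` on a `GR32_NOAX` class built with `allExcept(EAX)` and a `GR32_NOSP` class built with `allExcept(ESP)` yields a six-register class named `GR32_NOAX_and_GR32_NOSP`, matching the naming pattern described above.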
- Matthias

Hello Matthias,

Ah I see. Thanks for the explanations and tips Matthias and Craig! I will
inspect the source code to figure things out and ask again if I'm stuck
somewhere specifically.

Thanks for this example - helps clear few things up! I will dive into
TableGen's code to understand the complete picture.