[RFC] TableGen support for RegisterBankInfo
Introduction
TableGen generates boilerplate code from definition files. It already has some RegisterBankInfo support: TargetGenRegisterBankInfo(), RegBankIDs enum, RegBanks and Sizes. But it’s missing support for PartMappings, PartialMappingIdx and BankIDToCopyMapIdx. FWIW, I don’t think there’s enough TableGen information to emit ValMappings and ValueMappingIdx which are used in PowerPC, X86, AMDGPU and AArch64.
Current State
Currently, in order to provide the tables and enumerations that GlobalISel needs, this derivable code is written manually. Later, this code might even be copied and pasted into new backends and then modified, also manually. Sometimes this code is separated out into .def or .h files and sometimes it is just embedded directly into a target backend’s .cpp files. Nonstandard naming has crept in. The more custom code the backends adopt, the harder it will be to share code and develop utility routines and infrastructure.
Proposal
This all needs a cleanup which will not happen without some sturm und drang (encoding holes, subregs, …). However, the result will be cleaner existing GlobalISel implementations and will make it easier to write new backends.
Again, RegisterBankEmitter.cpp already has some RegisterBankInfo support. So the missing support can be added without needing any new source files, any new TargetRegisterBankInfo.inc files or changes to any of the CMakeLists.txt files. A target which needs RegisterBankInfo support already has RegisterBank support and consequently it already has a CMake rule (-gen-register-bank) that generates a TargetRegisterBank.inc file; the current RegisterBankInfo support piggybacks on RegisterBank. So the generated TargetRegisterBank.inc file now gets included 4 times rather than just 3.
There will be a new macro GET_TARGET_REGBANKINFO_DECLARATIONS which is necessary to control what gets included where.
This change won’t break any out-of-tree backends which may already have their own custom RegisterBankInfo support. Backends using this new support will have to adhere to the new TableGen conventions but those backends can adapt to the new regime at their convenience. In fact, we should be able to commit the new RegisterBankEmitter without any functional changes whatsoever. The in-tree backends can simply adapt the same as the out-of-tree backends, at their convenience. Of course, we’ll have to adapt one backend for testing but its changes won’t even have to be committed initially.
The current RegisterBankInfo support doesn’t clash with anything being proposed here and I expect to leave it alone. But at some point, the current support could be moved inside the REGBANKINFO #ifdefs for better readability.
Risks
The ordering of indices and tables will change. For example, the ARM backend currently has:
enum PartialMappingIdx {
PMI_GPR,
PMI_SPR,
PMI_DPR,
PMI_Min = PMI_GPR,
};
which is manually created (at the same time ignoring GPRwithAPSR, HPR and QPR) from:
def GPRRegBank : RegisterBank<"GPRB", [GPR, GPRwithAPSR]>;
def FPRRegBank : RegisterBank<"FPRB", [HPR, SPR, DPR, QPR]>;
but will be generated by TableGen as:
enum PartialMappingIdx {
PMI_None = -1,
PMI_HPR = 0,
PMI_SPR = 1,
PMI_DPR = 2,
PMI_QPR = 3,
PMI_GPR = 4,
PMI_GPRwithAPSR = 5,
};
Banks and RegisterClasses are not provided by TableGen in their original text order and so backends can’t depend on that or any particular total order. PartialMappingIdx is only partially ordered: all RCs of a bank are contiguous but in no particular order and Banks are in no particular order. BankIDToCopyMapIdx has the PartialMappingIdx of the first RC of each Bank.
PartialMappingIdx BankIDToCopyMapIdx[] = {
PMI_HPR,
PMI_GPR,
};
Refactoring
We can make an effort to factor out some of the utility routines which have been written in the various RegisterBankInfo implementations. These would be inside the GET_TARGET_REGBANKINFO_IMPL #ifdef where target backends can use them or ignore them. There will be no naming collisions unless the target defines GET_TARGET_REGBANKINFO_IMPL and includes the source. But really, anything interesting should probably get moved to the RegisterBankInfo base class.
Questions
Why is PMI_Min necessary? It seems especially unnecessary for TableGen generated enumerations which won’t have any encoding holes. PMI_Min is used in M68k, Mips, PowerPC and ARM where it’s 0 and also in AArch64 where it’s 7 (PMI_GPR32). It seems this AArch64 code was probably copied from AArch64 into the other backends where it’s just bloat.
Why is PMI_FPR16 == 1? Indeed, PartialMappingIdx in AArch64 is irregular. AMDGPU does this as well. Does AArch64 and AMDGPU really need this irregularity? This irregularity doesn’t appear in the X86 backend.
Why don’t AArch64 floating point registers have a register bank?
Why is AMDGPU PartialRegisterIdx prefix PM rather than PMI? Regularity makes it easier to compare code across GlobalISel implementations and TableGen helps enforce this regularity.
StartIdx? StartIdx is manually set to 0 in all of the backends. Is it that value the same as RC->RSI.getSimple().SpillAlignment?
CRRegBank needs to map to PMI_None in BankIDToCopyMapIdx. This would be simpler if CRRC was marked “let isAllocatable = 0;” or some other marking.
SubRegs?
RISCV is complicated because of def XLenVT : ValueTypeByHwMode<[RV32, RV64], [i32, i64]>; So getSimple().RegSize crashes. We need a way to iterate through HwModes. This is an issue with BankIDToCopyMapIdx
PowerPC’s PartialMappingIdx in PPCRegBankInfo.h doesn’t match the Register Classes in PPCRegisterBanks.td. BankIDToCopyMapIdx is used in PPC and AArch64, getCopyMapping().