TLDR: I created a pass that ties a register to its sub-register and runs before twoaddressinstruction. What could go wrong and untie the registers?
I have seen some discussion already on how to tie a register to its sub-register, e.g. here and here. As things stand, it seems to me there is no clear solution yet, and one has to be conservative.
I am working with an instruction which takes a composite register as input, and overwrites one of its sub-registers. I summarized the problem in the snippet below:
foreach Index = 0...255 in {
def R#Index : Register <"r"#Index>;
}
def GPR32 : RegisterClass<"TestTarget", [i32], 32,
(add (sequence "R%u", 0, 255))>;
def sub_128_0 : SubRegIndex<32, 0>;
def sub_128_1 : SubRegIndex<32, 32>;
def sub_128_2 : SubRegIndex<32, 64>;
def sub_128_3 : SubRegIndex<32, 96>;
def GPR128 : RegisterTuples<[sub_128_0, sub_128_1, sub_128_2, sub_128_3],
[
(decimate (shl GPR32, 0), 1),
(decimate (shl GPR32, 1), 1),
(decimate (shl GPR32, 2), 1),
(decimate (shl GPR32, 3), 1)
]>;
// TODO: Constraints = "$sub0_out = $reg.sub_128_0"
def FOO : Instruction<
(outs GPR32:$sub0_out), (ins GPR128:$reg),
[], "foo ", "$reg">;
I want to encode the fact that my output $sub0_out register actually corresponds to the sub_128_0 subregister of the input $reg. For practical and performance reasons, I cannot just say that the whole register is overwritten.
What I’ve been doing so far is mimicking what is done for “standard” tied registers. As I understand it, the main handling of such tied registers is done in the twoaddressinstruction pass. For each tied register pair, that pass inserts a copy of the source register to the destination register before the instruction, and then replaces the source operand with the destination register in the instruction.
%0 = TIED_SRC_DST %1
---> after twoaddressinstruction:
%0 = COPY %1
%0 = TIED_SRC_DST %0
I created a pass which does a similar thing for my tied sub-reg problem, let’s call it subregconstrainer. It currently runs after PHI node elimination and before twoaddressinstruction. It inserts a COPY of the source register (%1) to a new scratch register (%10). Then it replaces all uses of the destination register with %10.sub_128_0. The source operand (%1) of the instruction is also replaced with %10. Essentially, the following happens:
%0 = FOO %1
SOME_INSTR %0
---> after subregconstrainer:
%10 = COPY %1
%10.sub_128_0 = FOO %10
SOME_INSTR %10.sub_128_0
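To make the rewrite concrete, here is a minimal self-contained sketch that models it on a toy instruction list. It does not use the LLVM API: `Operand`, `Instr` and `constrainSubRegs` are hypothetical stand-ins, and it assumes a single basic block where every def appears before its uses. It is not my actual pass, just the same transformation in miniature.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Toy stand-ins for MachineOperand / MachineInstr (hypothetical, NOT LLVM types).
struct Operand {
  unsigned Reg;       // virtual register number
  std::string SubReg; // sub-register index name, empty if none
  bool IsDef;         // true for outputs, false for inputs
};

struct Instr {
  std::string Opcode;
  std::vector<Operand> Ops;
};

// Rewrites every `%d = FOO %s` into
//   %n = COPY %s
//   %n.sub_128_0 = FOO %n
// and redirects later uses of %d to %n.sub_128_0.
// NextVReg is assumed to be the first unused virtual register number.
std::vector<Instr> constrainSubRegs(const std::vector<Instr> &Block,
                                    unsigned NextVReg) {
  std::vector<Instr> Out;
  std::map<unsigned, unsigned> Replaced; // old FOO def -> scratch register

  for (Instr I : Block) {
    // Redirect uses of a previously rewritten FOO result.
    for (Operand &Op : I.Ops) {
      if (Op.IsDef)
        continue;
      auto It = Replaced.find(Op.Reg);
      if (It != Replaced.end()) {
        assert(Op.SubReg.empty() && "toy model: no nested sub-registers");
        Op.Reg = It->second;
        Op.SubReg = "sub_128_0";
      }
    }

    if (I.Opcode == "FOO") {
      assert(I.Ops.size() == 2 && I.Ops[0].IsDef && !I.Ops[1].IsDef);
      unsigned Scratch = NextVReg++;
      Operand Src = I.Ops[1];

      // %Scratch = COPY %Src
      Instr Copy{"COPY", {Operand{Scratch, "", true},
                          Operand{Src.Reg, Src.SubReg, false}}};
      Out.push_back(Copy);

      // %Scratch.sub_128_0 = FOO %Scratch
      Replaced[I.Ops[0].Reg] = Scratch;
      I.Ops[0] = Operand{Scratch, "sub_128_0", true};
      I.Ops[1] = Operand{Scratch, "", false};
    }
    Out.push_back(I);
  }
  return Out;
}
```

Feeding it the two-instruction example above with `NextVReg = 10` yields the three instructions shown after the arrow: the COPY into %10, the FOO defining %10.sub_128_0 from %10, and SOME_INSTR reading %10.sub_128_0.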
My question is: what could go wrong? I tested multiple scenarios, and my registers always stayed tied. I had a look at the passes involved in register allocation (mainly twoaddressinstruction, regcoalescer, greedyregalloc, fastregalloc), and couldn’t spot anything that would immediately go wrong and untie my registers. But I don’t know a lot about the register allocators in LLVM, and even less about their hidden assumptions. I’m afraid they could insert copies (to e.g. split live ranges?) and replace my operands. The only thing that reassures me is that hasTiedOperand() is currently not queried that much during register allocation.
My current belief is that, as long as I use the same virtual reg as in/out (although the out operand will have a subreg index), I’m somewhat safe. I also used hasExtraSrcRegAllocReq = true, hasExtraDefRegAllocReq = true to prevent the registers from being renamed after register allocation.
Wrapping it up:
- Could someone confirm that an instruction like %10.sub_128_0 = FOO %10 will not be rewritten, and that I will get something like $R0 = FOO $R0_R1_R2_R3 after regalloc?
- What are the particular contracts/assumptions that the different register allocation passes have between themselves, in particular regarding tied operands?
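For the first point, the property I would like to hold after regalloc can be stated as a small check. The sketch below is a toy that works on register names only and assumes tuple names are the component names joined by underscores, as in $R0_R1_R2_R3. Both helpers are hypothetical. In a real pass I assume the equivalent check would compare the def against TargetRegisterInfo::getSubReg() of the use.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper, not an LLVM API: returns the first component of a
// tuple register name such as "R0_R1_R2_R3" (assumed to be its sub_128_0).
std::string firstComponent(const std::string &TupleName) {
  assert(!TupleName.empty() && "expected a register name");
  // find() returns npos when there is no '_', so a plain name maps to itself.
  return TupleName.substr(0, TupleName.find('_'));
}

// True if the physical def of FOO is the sub_128_0 of its physical use,
// i.e. the sub-register tie survived register allocation.
bool defIsSub0OfUse(const std::string &DefName, const std::string &UseName) {
  return DefName == firstComponent(UseName);
}
```

With this, `defIsSub0OfUse("R0", "R0_R1_R2_R3")` holds, while a def of R4 against the same tuple would indicate that the operands were untied somewhere along the way.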
FYI @qcolombet @MatzeB