Improving the split heuristics for the Greedy Register Allocator

I have an issue that I've been wrestling with for quite some time and I'm
hoping that someone with a deeper understanding of the register allocator
can help me with.

Namely, I am trying to teach RA to split a live range rather than
allocating a CSR. I've attempted a very large number of tweaks to the costs
(both existing and experimental ones that I've added). However, despite all
of that, I can't seem to get RA to split the following:

  1 BB#0: derived from LLVM BB %entry
  2 Live Ins: %X3
  3 %vreg15<def> = COPY %X3; G8RC:%vreg15
  4 %vreg4<def> = CMPLDI %vreg15, 0; CRRC:%vreg4 G8RC:%vreg15
  5 %vreg11:sub_32<def,read-undef> = LI 0; G8RC:%vreg11
  6 BCC 68, %vreg4, <BB#1>; CRRC:%vreg4
  7 Successors according to CFG: BB#4(0x30000000 / 0x80000000 = 37.50%)
BB#1(0x50000000 / 0x80000000 = 62.50%)
  8
  9 BB#4:
10 Predecessors according to CFG: BB#0
11 B <BB#3>
12 Successors according to CFG: BB#3(?%)
13
14 BB#1: derived from LLVM BB %if.end
15 Predecessors according to CFG: BB#0
16 %vreg6<def> = ADDIStocHA %X2, <ga:@a>; G8RC_and_G8RC_NOX0:%vreg6
17 %vreg7<def> = LDtocL <ga:@a>, %vreg6, %X2<imp-use>;
mem:LD8[GOT] G8RC_and_G8RC_NOX0:%vreg7,%vreg6
18 %vreg8<def> = LWA 0, %vreg7;
mem:LD4[@a](tbaa=!3)(dereferenceable) G8RC:%vreg8 G8RC_and_G8RC_NOX0:%vreg7
19 %vreg9<def> = CMPLD %vreg8, %vreg15; CRRC:%vreg9
G8RC:%vreg8,%vreg15
20 BCC 68, %vreg9, <BB#3>; CRRC:%vreg9
21 B <BB#2>
22 Successors according to CFG: BB#2(0x30000000 / 0x80000000 = 37.50%)
BB#3(0x50000000 / 0x80000000 = 62.50%)
23
24 BB#2: derived from LLVM BB %if.then2
25 Predecessors according to CFG: BB#1
26 ADJCALLSTACKDOWN 96, %R1<imp-def,dead>, %R1<imp-use>
27 %vreg16<def> = COPY %vreg15; G8RC:%vreg16,%vreg15
28 BL8_NOP <ga:@callVoid>, <regmask **LONG LIST**>,
%X3<imp-def,dead>
29 ADJCALLSTACKUP 96, 0, %R1<imp-def,dead>, %R1<imp-use>
30 ADJCALLSTACKDOWN 96, %R1<imp-def,dead>, %R1<imp-use> 31
%X3<def> = COPY %vreg16; G8RC:%vreg16
32 BL8_NOP <ga:@callNonVoid>, <regmask **LONG LIST**>,
%X3<imp-use>, %X2<imp-use>, %R1<imp-def>, %X3<imp-def>
33 ADJCALLSTACKUP 96, 0, %R1<imp-def,dead>, %R1<imp-use>
34 %vreg11<def> = COPY %X3; G8RC:%vreg11 35 Successors
according to CFG: BB#3(?%)
36
37 BB#3: derived from LLVM BB %return
38 Predecessors according to CFG: BB#1 BB#2 BB#4
39 %vreg12<def> = EXTSW_32_64 %vreg11:sub_32; G8RC:%vreg12,%vreg11
40 %X3<def> = COPY %vreg12; G8RC:%vreg12
41 BLR8 %LR8<imp-use>, %RM<imp-use>, %X3<imp-use>

No matter what I do, vreg15 will get a Callee-Saved Register assigned to
it. However, this is suboptimal. So what I am trying to accomplish is to
split the live range of vreg15 into the paths without the call and the path
with the call (BL8_NOP is a call). Then the physical register X3 can be
used in the paths BB#0 -> BB#1 -> BB#3 and BB#0 -> BB#4 -> BB#3 and it can
be copied to a Callee-Saved Register in BB#2.

Hi Nemanja,

vreg15's live range is not across call. It is weird that it is always
getting CSR. Maybe because of the hint of vreg16, i.e., the
contribution of hint outweigh the cost of CSR first use?

Is it possible to send out the testcase? I can try if there is anyway
to allocate vreg15 to CSR.

Thanks,
Wei.

Hmm… it would appear that the behaviour on that original test case changes with ToT. However, this test case will still allocate a CSR in the entry block even though it really does not need to. And adjusting the CSR first-time-use cost does not seem to have any effect on it.

int a;
int callVoid();
int callNonVoid(int
);
int test(int *b) {
if (b == a) {
callVoid();
return callNonVoid(b);
}
return 0;
}

Hi Nemanja,

Thanks for the testcase. I can reproduce your problem. splitting
actually works as we expect but tryHintRecoloring coalesces the
splitted vregs and revokes the effect of splitting. The simple way is
to stop tryHintRecoloring for vregs colored with CSR registers, but I
am still thinking if there is better way to solve it. Will keep you
updated.

Thanks,
Wei.

I replied my finding in https://reviews.llvm.org/D27366 since it is
easier to get feedback from people follows it.

Thanks,
Wei.