subregs in trivial coalescing

I’m running into a problem with subregs during trivial coalescing in the linear scan allocator.

Should RALinScan::attemptTrivialCoalescing be allowed to coalesce a COPY that uses a subreg as a destination?

I’ve got the following sequence of code (unfortunately for an out-of-tree target) that moves 32-bit and 64-bit sub-registers around within a 128-bit register. By the time the register allocator runs, the code looks like:

92L %reg16402:dsub0 = DEF64… %reg16402, QPR:%reg16402…
116L %reg16405:sub0 = COPY %reg16402:sub1, %reg16405; QPR:%reg16405,16402
124L %reg16413:sub0 = COPY %reg16405:sub0; QPR:%reg16413,16405
… stuff …
468L %reg16460:sub3 = COPY %reg16402:sub0; QPR:%reg16460,16402

dsub0/dsub1 are 64-bit subregs and sub0/1/2/3 are 32-bit subregs. DEF64 is just representative of a 64-bit ALU operation.

The code is correct at this point.

Q1 (a 128-bit physical register) is assigned to %reg16402, which is OK, but then RALinScan::attemptTrivialCoalescing thinks it can coalesce %reg16405 with %reg16402, giving:

92L Q1:dsub0 = DEF64 %reg16402, QPR:%reg16402
116L Q1:sub0 = COPY Q1:sub1, %reg16405; QPR:%reg16405,16402
124L %reg16413:sub0 = COPY Q1:sub0; QPR:%reg16413,16405
… stuff …
468L %reg16460:sub3 = COPY Q1:sub0; QPR:%reg16460,16402

which is wrong: Q1:sub0 (in %reg16402) is overwritten at 116 before its last use at 468.
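
To make the clobber concrete, here is a small standalone model of the sequence above. It is only a sketch and not LLVM code: the QReg type and the readAt468 function are made up for illustration, the lane values are arbitrary, and it assumes that dsub0 overlaps sub0 and sub1 (as on NEON-style register files), which the listing does not state explicitly.

```cpp
#include <array>
#include <cstdint>

// Hypothetical model: a 128-bit Q register as four 32-bit lanes (sub0..sub3).
using QReg = std::array<std::uint32_t, 4>;

// Returns the value the COPY at 468L reads from %reg16402:sub0.
// 'coalesce' selects whether %reg16405 shares Q1 with %reg16402.
std::uint32_t readAt468(bool coalesce) {
  // 92L: DEF64 writes dsub0 of %reg16402 (assumed to cover sub0 and sub1).
  QReg q1 = {0xA0, 0xA1, 0, 0};
  // A separate register for %reg16405 when the copy is not coalesced.
  QReg other = {0, 0, 0, 0};
  QReg &r16405 = coalesce ? q1 : other;

  // 116L: %reg16405:sub0 = COPY %reg16402:sub1
  r16405[0] = q1[1];

  // 468L: ... = COPY %reg16402:sub0
  return q1[0];
}
```

With separate registers the read at 468 still sees the lane written at 92; once both virtual registers share Q1, the copy at 116 has already replaced it with the contents of sub1.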

Trivial coalescing doesn’t have any check for subregs, so I’m assuming that something is broken. I’ve patched things up by rejecting any trivial coalescing attempt where the COPY has a subregister as its target (none of the tests for ARM and x86 hit this case).
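
The shape of that workaround is roughly the following. This is a self-contained sketch rather than the actual patch: Operand, CopyInstr, SubIdx and mayTriviallyCoalesce are hypothetical stand-ins, not LLVM types, and the convention that index 0 means "no sub-register" is an assumption of the model.

```cpp
// Hypothetical sub-register indices; NoSubReg (0) means the full register.
enum SubIdx : unsigned { NoSubReg = 0, Sub0, Sub1, Sub2, Sub3 };

// Hypothetical, simplified stand-ins for a register operand and a COPY.
struct Operand {
  unsigned reg;   // virtual register number
  SubIdx subIdx;  // NoSubReg when the operand names the whole register
};

struct CopyInstr {
  Operand dst;
  Operand src;
};

// Conservative guard: refuse trivial coalescing whenever the COPY
// only writes part of its destination register.
bool mayTriviallyCoalesce(const CopyInstr &copy) {
  if (copy.dst.subIdx != NoSubReg)
    return false;  // partial write: the other lanes may still be live
  return true;     // the remaining checks of the real allocator would go here
}
```

The guard is deliberately blunt: it gives up some legal coalescing opportunities in exchange for never merging a partial write into a register whose other lanes are still in use.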

Does that seem right? Not sure if I’ve misused subregs or if the input to the RA is incorrect?


It looks like you found a bug.

This code has probably been tested mostly with x86 code where such a sub-register mismatch is not possible.

The trivial coalescing after allocation is a bit fragile, and I think you are right to disable it when the copy destination is a sub-register.

Do you have a patch?