Physical subregister liveness

(Cross-posting from Discourse)

Hi,

MachineVerifier allows a use of a physical register iff any of its
subregisters is defined.

This means that a wide COPY like this is legal, even if only some of
the individual subregisters ($sgpr0, $sgpr1, $sgpr2) are defined:

$vgpr1_vgpr2_vgpr3 = COPY killed $sgpr0_sgpr1_sgpr2

But if the target’s copyPhysReg splits it into multiple word-sized
copies like this, then some of them may no longer be legal, because
the corresponding source is completely undefined:

$vgpr3 = V_MOV_B32_e32 $sgpr2
$vgpr2 = V_MOV_B32_e32 $sgpr1
$vgpr1 = V_MOV_B32_e32 $sgpr0
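
To make the mismatch concrete, here is a toy model of the rule as
stated above. It is only a sketch, not MachineVerifier's actual code:
registers are plain strings, a wide register is a tuple of its 32-bit
parts, and the assumption that $sgpr2 was never written is made up for
the example.

```python
def use_is_legal(reg_parts, defined):
    """Toy rule: a use is accepted if any subregister is defined."""
    return any(part in defined for part in reg_parts)

# Assumption for this example: $sgpr2 was never written.
defined = {"sgpr0", "sgpr1"}
wide = ("sgpr0", "sgpr1", "sgpr2")

# The wide COPY reads all three parts as one operand.
wide_ok = use_is_legal(wide, defined)

# After splitting, each word-sized copy reads exactly one part.
split_ok = {part: use_is_legal((part,), defined) for part in wide}
```

In this model the wide use is accepted, while the word-sized use of
sgpr2 is rejected because none of its parts is defined.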

Various targets (AMDGPU, Sparc, SystemZ) work around this in
copyPhysReg by adding extra implicit operands to the word-sized copy
instructions to satisfy MachineVerifier. But these extra operands make
it look like there are dependencies between instructions that have
none. This restricts post-RA scheduling freedom and confuses other
late codegen passes. For example, AMDGPU actually adds all of these
implicit operands for the example above [edited slightly to remove
irrelevant stuff]:

$vgpr3 = V_MOV_B32_e32 $sgpr2, implicit-def $vgpr1_vgpr2_vgpr3, implicit $sgpr0_sgpr1_sgpr2
$vgpr2 = V_MOV_B32_e32 $sgpr1, implicit $sgpr0_sgpr1_sgpr2
$vgpr1 = V_MOV_B32_e32 $sgpr0, implicit killed $sgpr0_sgpr1_sgpr2
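
The effect on scheduling can be illustrated with a toy dependency
check. This is a simplified sketch, not how the real post-RA
scheduler builds its DAG: each instruction is just a set of defined
and used 32-bit parts, two instructions are ordered if they have a
RAW, WAR or WAW overlap, and kill flags are ignored. The helper names
are hypothetical.

```python
from itertools import combinations

def units(reg):
    """'$sgpr0_sgpr1_sgpr2' -> {'sgpr0', 'sgpr1', 'sgpr2'}"""
    return set(reg.lstrip("$").split("_"))

def inst(defs, uses):
    """Build a toy instruction from lists of register names."""
    return {
        "defs": set().union(*(units(r) for r in defs)),
        "uses": set().union(*(units(r) for r in uses)),
    }

def deps(instrs):
    """Pairs (i, j), i < j, where instruction j must stay after i."""
    out = []
    for (i, a), (j, b) in combinations(enumerate(instrs), 2):
        raw = a["defs"] & b["uses"]
        war = a["uses"] & b["defs"]
        waw = a["defs"] & b["defs"]
        if raw or war or waw:
            out.append((i, j))
    return out

# The three word-sized copies without any implicit operands.
plain = [
    inst(["$vgpr3"], ["$sgpr2"]),
    inst(["$vgpr2"], ["$sgpr1"]),
    inst(["$vgpr1"], ["$sgpr0"]),
]

# The same copies with the implicit operands shown above.
with_implicit = [
    inst(["$vgpr3", "$vgpr1_vgpr2_vgpr3"], ["$sgpr2", "$sgpr0_sgpr1_sgpr2"]),
    inst(["$vgpr2"], ["$sgpr1", "$sgpr0_sgpr1_sgpr2"]),
    inst(["$vgpr1"], ["$sgpr0", "$sgpr0_sgpr1_sgpr2"]),
]

plain_deps = deps(plain)
implicit_deps = deps(with_implicit)
```

In this model the plain copies are fully independent, whereas the
implicit-def of $vgpr1_vgpr2_vgpr3 on the first copy pins both later
copies after it.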

Because of this, I would like to find a better solution. I can think of three:

1. Use subreg liveness information in copyPhysReg to only copy the
parts of the wide register that are live. I tried that in D113017
"[AMDGPU] Avoid copying dead subregisters in copyPhysReg" and it seems
to work, but it also has a measurable (0.7%) compile-time cost, which
seems unfortunate.

2. Change the physical subreg liveness rules to say that all parts of
a physical register have to be defined. I’m not sure how to implement
this, but I suppose it would mean we would need more IMPLICIT_DEF
instructions to satisfy the verifier.

3. Change the physical subreg liveness rules to say that no part of a
physical register has to be defined. I guess this would be unpopular
because it means we end up with no liveness verification at all.
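
For reference, the idea behind option 1 can be sketched as follows.
This is a toy model only, not the code in D113017: split_copy is a
hypothetical helper, registers are plain strings, and the set of live
source parts is assumed to be given.

```python
def split_copy(dst_parts, src_parts, live):
    """Emit one (dst, src) word copy per source part that is live."""
    return [(d, s) for d, s in zip(dst_parts, src_parts) if s in live]

dst = ["vgpr1", "vgpr2", "vgpr3"]
src = ["sgpr0", "sgpr1", "sgpr2"]

# Assumption for this example: only sgpr0 and sgpr1 are live.
copies = split_copy(dst, src, live={"sgpr0", "sgpr1"})
```

Here no copy is emitted for sgpr2, so no word-sized copy reads an
undefined source and no extra implicit operands are needed.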

Thoughts?

Thanks,
Jay.