Physical subregister liveness


MachineVerifier allows a use of a physical register if and only if at least one of its subregisters is defined.

This means that a wide COPY like this is legal, even if only some of the individual subregisters ($sgpr0, $sgpr1, $sgpr2) are defined:

$vgpr1_vgpr2_vgpr3 = COPY killed $sgpr0_sgpr1_sgpr2

But if the target’s copyPhysReg splits it into multiple word-sized copies like this, then some of them may no longer be legal, because the corresponding source is completely undefined:

$vgpr3 = V_MOV_B32_e32 $sgpr2
$vgpr2 = V_MOV_B32_e32 $sgpr1
$vgpr1 = V_MOV_B32_e32 $sgpr0
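
To make the mismatch concrete, here is a toy model of the rule in plain Python. It is not MachineVerifier code: the helper names `units` and `use_is_legal` are made up for illustration, a register is modelled simply as the set of 32-bit registers in its name, and the assumption that only $sgpr0 has been defined is mine.

```python
def units(reg):
    """Expand a name like '$sgpr0_sgpr1_sgpr2' into {'sgpr0', 'sgpr1', 'sgpr2'}."""
    return set(reg.lstrip("$").split("_"))


def use_is_legal(reg, defined_units):
    """Toy version of the rule: a use is accepted if any covered unit is defined."""
    return bool(units(reg) & defined_units)


# Assumption for this example: only $sgpr0 was written before the copy.
defined = {"sgpr0"}

# The wide COPY reads $sgpr0_sgpr1_sgpr2, which overlaps a defined unit.
wide_ok = use_is_legal("$sgpr0_sgpr1_sgpr2", defined)

# After splitting, each word-sized copy is checked on its own.
split_ok = {r: use_is_legal(r, defined) for r in ("$sgpr2", "$sgpr1", "$sgpr0")}

print("wide copy legal:", wide_ok)
for reg, ok in split_ok.items():
    print(reg, "legal" if ok else "reads an undefined register")
```

In this model the wide use passes, but two of the three split copies read a register with no defined part, which is the situation described above.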

Various targets (AMDGPU, Sparc, SystemZ) work around this in copyPhysReg by adding extra implicit operands to the word-sized copy instructions to satisfy MachineVerifier. But these extra operands make it look like there are dependencies between instructions that have none. This restricts post-RA scheduling freedom and confuses other late codegen passes. For example, AMDGPU actually adds all of these implicit operands for the example above [edited slightly to remove irrelevant stuff]:

$vgpr3 = V_MOV_B32_e32 $sgpr2, implicit-def $vgpr1_vgpr2_vgpr3, implicit $sgpr0_sgpr1_sgpr2
$vgpr2 = V_MOV_B32_e32 $sgpr1, implicit $sgpr0_sgpr1_sgpr2
$vgpr1 = V_MOV_B32_e32 $sgpr0, implicit killed $sgpr0_sgpr1_sgpr2
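
The effect on scheduling can be sketched with a toy dependency check. This is only a simplified model, not how any LLVM scheduler is implemented: it treats two instructions as ordered if they share a register unit and at least one of them writes it, it ignores kill flags, and the helpers `instr`, `depends` and `edges` are hypothetical names.

```python
from itertools import combinations


def units(*regs):
    """Union of the 32-bit units covered by the given register names."""
    out = set()
    for reg in regs:
        out |= set(reg.lstrip("$").split("_"))
    return out


def instr(defs, uses):
    """Describe an instruction by the units it writes and the units it reads."""
    return {"defs": units(*defs), "uses": units(*uses)}


def depends(a, b):
    """True if b must stay after a: RAW, WAR or WAW on some shared unit."""
    return bool(a["defs"] & b["uses"] or a["uses"] & b["defs"] or a["defs"] & b["defs"])


def edges(instrs):
    """All ordered pairs (i, j), i < j, that the model refuses to reorder."""
    pairs = combinations(range(len(instrs)), 2)
    return {(i, j) for i, j in pairs if depends(instrs[i], instrs[j])}


# The three word-sized copies without any implicit operands.
plain = [
    instr(["$vgpr3"], ["$sgpr2"]),
    instr(["$vgpr2"], ["$sgpr1"]),
    instr(["$vgpr1"], ["$sgpr0"]),
]

# The same copies with the implicit operands shown above folded in.
padded = [
    instr(["$vgpr3", "$vgpr1_vgpr2_vgpr3"], ["$sgpr2", "$sgpr0_sgpr1_sgpr2"]),
    instr(["$vgpr2"], ["$sgpr1", "$sgpr0_sgpr1_sgpr2"]),
    instr(["$vgpr1"], ["$sgpr0", "$sgpr0_sgpr1_sgpr2"]),
]

plain_edges = edges(plain)
padded_edges = edges(padded)
print("without implicit operands:", sorted(plain_edges))
print("with implicit operands:   ", sorted(padded_edges))
```

In this model the plain copies are fully independent, while the implicit-def of $vgpr1_vgpr2_vgpr3 on the first copy pins both later copies behind it.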

Because of this, I would like to find a better solution. I can think of three:

  1. Use subreg liveness information in copyPhysReg to only copy the parts of the wide register that are live. I tried that in D113017 ([AMDGPU] Avoid copying dead subregisters in copyPhysReg) and it seems to work, but it also has a measurable (0.7%) compile-time cost, which seems unfortunate.

  2. Change the physical subreg liveness rules to say that all parts of a physical register have to be defined. I’m not sure how to implement this, but I suppose it would mean we would need more IMPLICIT_DEF instructions to satisfy the verifier.

  3. Change the physical subreg liveness rules to say that no part of a physical register has to be defined. I guess this would be unpopular because it means we end up with no liveness verification at all.
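
The three options can be compared in a small toy model. None of this is LLVM code: `split_copy_live_only`, `legal_any`, `legal_all` and `legal_none` are hypothetical helpers, registers are modelled as lists of 32-bit units taken from their names, and I assume for the example that $sgpr0 and $sgpr2 are defined and live while $sgpr1 is not.

```python
def units(reg):
    """Ordered list of the 32-bit units named in a register like '$sgpr0_sgpr1_sgpr2'."""
    return reg.lstrip("$").split("_")


def split_copy_live_only(dst, src, live_units):
    """Option 1: emit a word-sized copy only for source units that are live."""
    return [(d, s) for d, s in zip(units(dst), units(src)) if s in live_units]


def legal_any(reg, defined):
    """Current rule: at least one part of the register must be defined."""
    return any(u in defined for u in units(reg))


def legal_all(reg, defined):
    """Option 2: every part of the register must be defined."""
    return all(u in defined for u in units(reg))


def legal_none(reg, defined):
    """Option 3: nothing is required, so every use is accepted."""
    return True


# Assumption for this example: $sgpr1 is neither defined nor live.
defined = {"sgpr0", "sgpr2"}
wide_src = "$sgpr0_sgpr1_sgpr2"

copies = split_copy_live_only("$vgpr1_vgpr2_vgpr3", wide_src, defined)
print("option 1 emits:", copies)
print("current rule accepts wide use:", legal_any(wide_src, defined))
print("option 2 accepts wide use:    ", legal_all(wide_src, defined))
print("option 3 accepts wide use:    ", legal_none(wide_src, defined))
```

Under option 1 the split copies only ever read defined registers, so they need no extra implicit operands. Under option 2 the wide COPY itself would be rejected unless something (for example an IMPLICIT_DEF) defines $sgpr1 first.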