Def/Kill flags for subregisters

I am trying to locate a bug that affects my Blackfin backend. I am having some trouble understanding the semantics of def/kill flags on machine operands when using subregisters.

I compile this function:

define void @i56_ls(i56 %x) nounwind {
  store i56 %x, i56* @i56_s
  ret void
}

And get this initial machine code:

Live Ins: %R0 %R1
  %reg1025D<def> = MOVE %R1
  %reg1024D<def> = MOVE %R0
  %reg1026D<def> = MOVE %reg1025D
  %reg1027P<def> = LOAD32imm <ga:i56_s>
  %reg1028D<def> = SRLd %reg1025D, 16
  %reg1029D16L<def> = EXTRACT_SUBREG %reg1026D, 1
  %reg1030P<def> = ADDimm7 %reg1027P, 4
  STORE16pi %reg1029D16L, %reg1030P, Mem:ST(2,4) [i56_s + 4]
  STORE8p_imm16 %reg1028D, %reg1027P, 6, Mem:ST(1,2) [i56_s + 6]
  STORE32p %reg1024D, %reg1027P, Mem:ST(4,4) [i56_s + 0]
  RTS %RETS<imp-use>

So far so good. After coalescing and allocating registers, I have this code:

Live Ins: %R0 %R1
  %P0<def> = LOAD32imm <ga:i56_s>
  %R2<def> = MOVE %R1<kill>
  %R2<def> = SRLd %R2, 16
  %P1<def> = MOVE %P0
  %P1<def> = ADDimm7 %P1, 4
  STORE16pi %R1L<kill>, %P1<kill>, Mem:ST(2,4) [i56_s + 4]
  STORE8p_imm16 %R2<kill>, %P0, 6, Mem:ST(1,2) [i56_s + 6]
  STORE32p %R0<kill>, %P0<kill>, Mem:ST(4,4) [i56_s + 0]
  RTS %RETS<imp-use>

The important part is:

  %R2<def> = MOVE %R1<kill>
  STORE16pi %R1L<kill>, %P1<kill>, Mem:ST(2,4) [i56_s + 4]

The register R1L is a subregister of R1 (the low 16 bits). There is also an R1H subregister, similar to AX/AL/AH in X86.

I assume that %R1<kill> means that now R1, R1L, and R1H are dead. The register scavenger works that way, and barfs on the code above.

What would be the correct kill flags in this situation? If I remove the kill flag from %R1<kill>, R1H is left alive.

How about this:

  %R2<def> = MOVE %R1, %R1H<imp-use,kill>
  STORE16pi %R1L<kill>, %P0<kill>, %R1<imp-use,kill>

According to comments in LiveVariables.cpp, not all of this is implemented. That probably explains my confusion. How should def/kill flags work with subregisters if they were fully implemented?

Here is my theory:

1. def/kill applies to all subregisters, so %R1<kill> will kill R1, R1L, and R1H
2. Liveness of a register can be correctly inferred from the def/kill flags. This seems to be the assumption of the register scavenger.
3. A register must be killed before it can be defined again.

Is this theory correct?

If so, it would be necessary to sprinkle extra <imp-use,kill> here and there, like I have done above.

Here is an X86 example:

  %EAX<def> = MOV32rm %ESP, 1, %noreg, 8, %noreg, Mem:LD(4,4) [FixedStack-2 + 0]
  %ECX<def> = MOV32rm %ESP, 1, %noreg, 4, %noreg, Mem:LD(4,16) [FixedStack-1 + 0]
  %EDX<def> = LEA32r %ECX, 1, %EAX, 0
  %EDX<def> = ADD32rr %EDX, %EAX<kill>, %EFLAGS<imp-def,dead>
  %EAX<def> = MOVZX32rr8 %CL<kill>
  %EAX<def> = ADD32rr %EAX, %EDX<kill>, %EFLAGS<imp-def,dead>
  RET %EAX<imp-use,kill>

This function defines ECX and kills CL, leaving ECX, CX, and CH still alive. It is not a problem here because ECX is not reused, but I think that if it were reused, it would not be properly killed first.

I am trying to locate a bug that affects my Blackfin backend. I am
having some trouble understanding the semantics of def/kill flags on
machine operands when using subregisters.

I compile this function:

define void @i56_ls(i56 %x) nounwind {
  store i56 %x, i56* @i56_s
  ret void
}

And get this initial machine code:

Live Ins: %R0 %R1
  %reg1025D<def> = MOVE %R1
  %reg1024D<def> = MOVE %R0
  %reg1026D<def> = MOVE %reg1025D
  %reg1027P<def> = LOAD32imm <ga:i56_s>
  %reg1028D<def> = SRLd %reg1025D, 16
  %reg1029D16L<def> = EXTRACT_SUBREG %reg1026D, 1
  %reg1030P<def> = ADDimm7 %reg1027P, 4
  STORE16pi %reg1029D16L, %reg1030P, Mem:ST(2,4) [i56_s + 4]
  STORE8p_imm16 %reg1028D, %reg1027P, 6, Mem:ST(1,2) [i56_s + 6]
  STORE32p %reg1024D, %reg1027P, Mem:ST(4,4) [i56_s + 0]
  RTS %RETS<imp-use>

So far so good. After coalescing and allocating registers, I have this
code:

Live Ins: %R0 %R1
  %P0<def> = LOAD32imm <ga:i56_s>
  %R2<def> = MOVE %R1<kill>
  %R2<def> = SRLd %R2, 16
  %P1<def> = MOVE %P0
  %P1<def> = ADDimm7 %P1, 4
  STORE16pi %R1L<kill>, %P1<kill>, Mem:ST(2,4) [i56_s + 4]
  STORE8p_imm16 %R2<kill>, %P0, 6, Mem:ST(1,2) [i56_s + 6]
  STORE32p %R0<kill>, %P0<kill>, Mem:ST(4,4) [i56_s + 0]
  RTS %RETS<imp-use>

The important part is:

  %R2<def> = MOVE %R1<kill>
  STORE16pi %R1L<kill>, %P1<kill>, Mem:ST(2,4) [i56_s + 4]

The register R1L is a subregister of R1 (the low 16 bits). There is
also an R1H subregister, similar to AX/AL/AH in X86.

I assume that %R1<kill> means that now R1, R1L, and R1H are dead. The
register scavenger works that way, and barfs on the code above.

Right.

What would be the correct kill flags in this situation? If I remove
the kill flag from %R1<kill>, R1H is left alive.

How about this:

  %R2<def> = MOVE %R1, %R1H<imp-use,kill>
  STORE16pi %R1L<kill>, %P0<kill>, %R1<imp-use,kill>

This is the correct except that R1L kill should not be there.

According to comments in LiveVariables.cpp, not all of this is
implemented. That probably explains my confusion. How should def/kill
flags work with subregisters if they were fully implemented?

Here is my theory:

1. def/kill applies to all subregisters, so %R1<kill> will kill R1,
R1L, and R1H
2. Liveness of a register can be correctly inferred from the def/kill
flags. This seems to be the assumption of the register scavenger.
3. A register must be killed before it can be defined again.

Is this theory correct?

Yes.

If so, it would be necessary to sprinkle extra <imp-use,kill> here and
there, like I have done above.

Here is an X86 example:

  %EAX<def> = MOV32rm %ESP, 1, %noreg, 8, %noreg, Mem:LD(4,4)
[FixedStack-2 + 0]
  %ECX<def> = MOV32rm %ESP, 1, %noreg, 4, %noreg, Mem:LD(4,16)
[FixedStack-1 + 0]
  %EDX<def> = LEA32r %ECX, 1, %EAX, 0
  %EDX<def> = ADD32rr %EDX, %EAX<kill>, %EFLAGS<imp-def,dead>
  %EAX<def> = MOVZX32rr8 %CL<kill>
  %EAX<def> = ADD32rr %EAX, %EDX<kill>, %EFLAGS<imp-def,dead>
  RET %EAX<imp-use,kill>

This function defines ECX and kills CL, leaving ECX, CX, and CH still
alive. It is not a problem here because ECX is not reused, but I think
that if it were reused, it would not be properly killed first.

I think you have probably run into a bug in one of the passes. It's probably the coalescer. It's tricky to get the sub-register liveness right.

Evan

Evan, thanks for clarifying.

I think you have probably run into a bug in one of the passes. It's
probably the coalescer. It's tricky to get the sub-register liveness
right.

I think you are right. The bad kill seems to be introduced by the coalescer. I will take a look at it.

I have written a machine code verifier pass that checks the def/kill rules, see PR408. Currently it finds a lot of multiply defined registers in targets using subregisters (X86 and Blackfin).

I would like to fix this in LiveVariables / LiveIntervals. Are there any major roadblocks I should be aware of?

/jakob

Evan, thanks for clarifying.

I think you have probably run into a bug in one of the passes. It's
probably the coalescer. It's tricky to get the sub-register liveness
right.

I think you are right. The bad kill seems to be introduced by the
coalescer. I will take a look at it.

I have written a machine code verifier pass that checks the def/kill
rules, see PR408. Currently it finds a lot of multiply defined
registers in targets using subregisters (X86 and Blackfin).

I would like to fix this in LiveVariables / LiveIntervals. Are there
any major roadblocks I should be aware of?

We have (sort of) plans to replace LiveVariables. One of which is splitting it into two (physical register liveness and virtual register liveness). The reason being virtual register liveness is computation is done globally while the physical register liveness is local. That may allow us to run the later multiple times without incurring too high a compile time cost (and simplify other passes).

The other idea is to replace liveness computation with a lazy / on demand version. You can search the archive for some discussions.

Of course, neither of these should stop you from fixing existing bugs. :slight_smile:

Evan

We have (sort of) plans to replace LiveVariables.

I see. I need to use subregisters and the register scavenger together, so I have to do something. However, if a new and improved LiveVariables is forthcoming, I can simply disable certain assertions in the register scavenger in my local tree. That will create less than optimal code, but it will work.

Of course, neither of these should stop you from fixing existing
bugs. :slight_smile:

I'll try to track down the premature kill in the coalescer. :slight_smile:

/jakob