Eliminate PHI for non-copyable registers

My hardware has two special registers that cannot be copied; they can only be assigned and then referenced (read) by other instructions. They are also allocatable.

  br i1 %if_cond, label %then, label %else
then:
  %x1 = fptosi float %y1 to i32
  br label %endif
else:
  %x2 = fptosi float %y2 to i32
  br label %endif
endif:
  %x3 = phi i32 [%x1, %then], [%x2, %else]

PNE::LowerAtomicPHINode() fails because TargetInstrInfo::copyRegToReg() doesn't support copying this type of register.
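To make the failure concrete, here is a minimal Python sketch of what PHI lowering does in general. It is a simplified model, not LLVM's actual implementation: the tuple encoding of instructions, the function name lower_phis and the %tmpN naming are all assumptions made for illustration. Each PHI is replaced by a copy from a fresh temporary, and a copy into that temporary is placed before the terminator of every predecessor block. Every PHI therefore turns into register-to-register copies, which is exactly the operation the target cannot provide for these registers.

```python
def lower_phis(blocks):
    """Replace each phi with copies, in place (simplified model).

    blocks maps a block name to a list of instruction tuples:
      ("phi", dst, [(value, pred_block), ...])
      ("copy", dst, src)
    Any other tuple is left untouched. The last tuple of a block that is
    named as a predecessor is assumed to be its terminator.
    """
    counter = 0
    for name, insts in blocks.items():
        for pos, inst in enumerate(insts):
            if inst[0] != "phi":
                continue
            _, dst, incoming = inst
            tmp = "%%tmp%d" % counter  # hypothetical fresh virtual register
            counter += 1
            # The phi itself becomes a copy out of the temporary.
            insts[pos] = ("copy", dst, tmp)
            # Each predecessor copies its value into the temporary
            # just before its terminator.
            for value, pred in incoming:
                blocks[pred].insert(len(blocks[pred]) - 1, ("copy", tmp, value))
    return blocks


# The example above, in the tuple encoding assumed by this sketch.
example = {
    "then": [("fptosi", "%x1", "%y1"), ("br", "endif")],
    "else": [("fptosi", "%x2", "%y2"), ("br", "endif")],
    "endif": [("phi", "%x3", [("%x1", "then"), ("%x2", "else")])],
}
lower_phis(example)
```

After the call, both "then" and "else" contain a copy into the temporary, and "endif" starts with a copy out of it.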

Most registers of this hardware are f32. The two special i32 registers exist to provide relative indexing into the f32 registers. Their values can only be written by an FP-to-INT conversion instruction, and they are not designed to be copied from one to the other.

Alex.

This is a very interesting problem. If you have registers like this, they should be non-allocatable (just like 'flags') which means that you don't have to define copy operations for them.

-Chris

They "should" be non-allocatable if the hardware implements the same number
of these i32 registers as the "specification". The input language (which is
converted to LLVM IR) may use up to 4 registers but the hardware only has 2.
So they must be allocatable, right?

For example, the input uses up to 3 <i32> registers INT0, INT1, INT2 (Rx are
FP registers):

  fp2int INT0, R0
  fp2int INT1, R1
  fp2int INT2, R2
  add R0, R0, R[INT1+1]
  mul R0, R[INT2+2], R[INT0+1]

Since the hardware doesn't have INT2, the final machine code should look like:

  fp2int INT0, R0
  fp2int INT1, R1
  add R0, R0, R[INT1+1]
  fp2int INT1, R2 <==== rename INT2 to INT1
  mul R0, R[INT1+2], R[INT0+1]
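The renaming above can be sketched in Python for straight-line code. This is only an illustration under stated assumptions, not how LLVM's register allocator works: the function name allocate_index_regs, the text parsing and the "delay each fp2int until it is needed" policy are all made up for this sketch. It assumes each virtual index register is defined exactly once, that the first operand of every non-fp2int instruction is the FP register it writes, and that there is no control flow (so no PHI nodes). A delayed fp2int is emitted just before the first instruction that reads it, or just before an instruction that overwrites its FP source, whichever comes first.

```python
import re

PHYS = ["INT0", "INT1"]  # the two hardware index registers


def allocate_index_regs(lines, phys=PHYS):
    """Map virtual index registers onto `phys` for straight-line code."""
    # Parse "op dst, src, ..." into (op, [operands]).
    insts = []
    for line in lines:
        op, rest = line.split(None, 1)
        insts.append((op, [s.strip() for s in rest.split(",")]))

    # Index of the last instruction that reads each virtual index register.
    last_use = {}
    for i, (op, args) in enumerate(insts):
        if op != "fp2int":
            for v in re.findall(r"INT\d+", " ".join(args)):
                last_use[v] = i

    pending = {}   # virtual reg -> FP source, definition not yet emitted
    assigned = {}  # virtual reg -> physical reg currently holding it
    free = list(phys)
    out = []

    def flush(v):
        # Emit the delayed fp2int into a free physical index register.
        if not free:
            raise RuntimeError("more than %d index values live at once" % len(phys))
        assigned[v] = free.pop(0)
        out.append("fp2int %s, %s" % (assigned[v], pending.pop(v)))

    for i, (op, args) in enumerate(insts):
        if op == "fp2int":
            if args[0] in last_use:  # drop definitions that are never read
                pending[args[0]] = args[1]
            continue
        used = re.findall(r"INT\d+", " ".join(args))
        # Emit delayed definitions that are read here, or whose FP source
        # register is overwritten by this instruction.
        for v in list(pending):
            if v in used or pending[v] == args[0]:
                flush(v)
        text = re.sub(r"INT\d+", lambda m: assigned[m.group(0)], ", ".join(args))
        out.append("%s %s" % (op, text))
        # Release registers whose value is dead after this instruction.
        for v in set(used):
            if last_use[v] == i:
                free.append(assigned.pop(v))
    return out


program = [
    "fp2int INT0, R0",
    "fp2int INT1, R1",
    "fp2int INT2, R2",
    "add R0, R0, R[INT1+1]",
    "mul R0, R[INT2+2], R[INT0+1]",
]
result = allocate_index_regs(program)
```

For the five-instruction input above, result is the same sequence as the hand-renamed listing. The sketch raises an error when more than two index values are live at the same instruction, since nothing can be spilled or copied.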

I use the method suggested in "Kaleidoscope: Extending the Language: Mutable
Variables" (http://llvm.org/docs/tutorial/LangImpl7.html) and rely on
mem2reg to promote these loads to registers.

By the way, all registers are non-spillable.

To be allocatable, the code generator must be able to emit copies into and out of the registers and must be able to spill them, even if it means going through another temporary register class.

-Chris

But what if it cannot even be copied to another temporary register class?

The values of these i32 registers can only be used as an index into another
register class; the value of the index itself cannot be read.

Usually the program can be generated using only 2 of these i32 index
registers, but the problem is that LLVM requires them to be copyable
whenever a PHI node is involved.

Then they are not allowed to be allocatable.

-Chris