Hello,
I ran into this problem while working on FP16 emulation on SSE2.
SSE2 has no single instruction that stores a 16-bit element from an XMM register to memory. Instead, we have to use a 32-bit GPR as a temporary register, e.g.,
movd %xmm0, %eax
movw %ax, (%esp)
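For readers who prefer intrinsics, the same two-step store can be sketched in C++. This is only an illustration of the idea, not what the backend emits: the helper names `store_low16_via_gpr` and `spill_and_reload_low16` are made up for this post, and the compiler is free to pick different instructions than the `movd`/`movw` pair.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <emmintrin.h>  // SSE2 intrinsics

// Hypothetical helper (not an LLVM API): store the low 16 bits of an
// XMM value into a 2-byte memory slot by going through a 32-bit GPR.
inline void store_low16_via_gpr(__m128i value, void *slot) {
    // Step 1 (the movd step): copy the low 32 bits of the XMM register
    // into a general-purpose register.
    std::uint32_t gpr = static_cast<std::uint32_t>(_mm_cvtsi128_si32(value));
    // Step 2 (the movw step): store only the low 16 bits of that GPR.
    std::uint16_t low = static_cast<std::uint16_t>(gpr & 0xFFFFu);
    std::memcpy(slot, &low, sizeof(low));
}

// Hypothetical helper for checking the result: spill to a local slot
// and read the 16-bit value back.
inline std::uint16_t spill_and_reload_low16(__m128i value) {
    std::uint16_t slot = 0;
    store_low16_via_gpr(value, &slot);
    return slot;
}
```

The point is that the spill needs an extra scratch GPR in the middle, and that scratch register is what later shows up as a new virtual register.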
However, in fast RA, once we have spilled the FR16 value, we have no chance to allocate a physical register for the newly created temporary. For example, take this simple IR:
define dso_local half @foo(ptr %0) {
%2 = load half, ptr %0
%3 = call half @foo(ptr %0)
ret half %2
}
When compiled with D107082 and llc -O0 -mtriple=i386 -mattr=sse2 < foo.ll -debug, we get the following MIR before fast RA:
Allocating bb.0 (%ir-block.1):
%0:gr32 = MOV32rm %fixed-stack.0, 1, $noreg, 0, $noreg :: (load (s32) from %fixed-stack.0)
%4:vr128 = IMPLICIT_DEF
%3:vr128 = COPY %4:vr128
%3:vr128 = PINSRWrm %3:vr128(tied-def 0), %0:gr32, 1, $noreg, 0, $noreg, 0 :: (load (s16) from %ir.0)
%5:fr16 = COPY %3:vr128
%1:fr16 = COPY %5:fr16
ADJCALLSTACKDOWN32 4, 0, 0, implicit-def $esp, implicit-def $eflags, implicit-def $ssp, implicit $esp, implicit $ssp
MOV32mr $esp, 1, $noreg, 0, $noreg, %0:gr32 :: (store (s32) into stack)
CALLpcrel32 @foo, <regmask $bh $bl $bp $bph $bpl $bx $di $dih $dil $ebp $ebx $edi $esi $hbp $hbx $hdi $hsi $si $sih $sil>, implicit $esp, implicit $ssp, implicit-def $xmm0
ADJCALLSTACKUP32 4, 0, implicit-def $esp, implicit-def $eflags, implicit-def $ssp, implicit $esp, implicit $ssp
%2:fr16 = COPY $xmm0
$xmm0 = COPY %1:fr16
RET32 implicit $xmm0
Then %1:fr16 is spilled:
>> %1:fr16 = COPY %5:fr16
Regs: AL=%0 HAX=%0
Search register for %1 in class FR16 with hint $noreg
Register: $xmm0 Cost: 0 BestCost: 4294967295
Assigning %1 to $xmm0
Spill Reason: LO: 0 RL: 1
Spilling %1 in $xmm0 to stack slot #0
Freeing $xmm0: %1
Search register for %5 in class FR16 with hint $xmm0
Preferred Register 1: $xmm0
Assigning %5 to $xmm0
In the end, every VR is allocated except the temporary one:
Begin Regs:
Loading live registers at begin of block.
bb.0 (%ir-block.1):
renamable $eax = MOV32rm %fixed-stack.0, 1, $noreg, 0, $noreg :: (load (s32) from %fixed-stack.0)
renamable $xmm0 = IMPLICIT_DEF
renamable $xmm0 = PINSRWrm renamable $xmm0(tied-def 0), renamable $eax, 1, $noreg, 0, $noreg, 0 :: (load (s16) from %ir.0)
%6:gr32_nosp = MOVPDI2DIrr $xmm0
MOV16mr %stack.0, 1, $noreg, 0, $noreg, %6.sub_16bit:gr32_nosp :: (store (s16) into %stack.0)
ADJCALLSTACKDOWN32 4, 0, 0, implicit-def $esp, implicit-def dead $eflags, implicit-def $ssp, implicit $esp, implicit $ssp
MOV32mr $esp, 1, $noreg, 0, $noreg, killed renamable $eax :: (store (s32) into stack)
CALLpcrel32 @foo, <regmask $bh $bl $bp $bph $bpl $bx $di $dih $dil $ebp $ebx $edi $esi $hbp $hbx $hdi $hsi $si $sih $sil>, implicit $esp, implicit $ssp, implicit-def $xmm0
ADJCALLSTACKUP32 4, 0, implicit-def $esp, implicit-def dead $eflags, implicit-def $ssp, implicit $esp, implicit $ssp
dead renamable $xmm1 = COPY $xmm0
$xmm0 = PINSRWrm undef $xmm0(tied-def 0), %stack.0, 1, $noreg, 0, $noreg, 0 :: (load (s16) from %stack.0)
RET32 implicit killed $xmm0
Remaining virtual register operands
UNREACHABLE executed at /export/users2/pengfeiw/llvm-project/llvm/lib/CodeGen/MachineRegisterInfo.cpp:207!
Notice that %6:gr32_nosp is left as a VR, hence the “Remaining virtual register operands” error.
I’m not sure whether other targets have a similar case. Is there any precedent for this? Thanks for any pointers!