Allen
March 6, 2022, 9:41am
1
Dear all,
Based on the MIR after liveintervals:
# *** IR Dump After Live Interval Analysis (liveintervals) ***:
# Machine code for function foo: NoPHIs, TracksLiveness, TiedOpsRewritten
0B bb.0.entry:
16B %0:fpr64 = MOVIv2i32 79, 24
32B %1:fpr32 = COPY %0.ssub:fpr64
48B $s0 = COPY %1:fpr32
64B RET_ReallyLR implicit killed $s0
I get two different dumps after simple-register-coalescing, but I don’t know which one is better for performance.
– 1st version
0B bb.0.entry:
16B %0:fpr64 = MOVIv2i32 79, 24
48B $s0 = COPY %0.ssub:fpr64
64B RET_ReallyLR implicit killed $s0
– 2nd version
0B bb.0.entry:
48B dead $d0 = MOVIv2i32 79, 24, implicit-def $s0
64B RET_ReallyLR implicit killed $s0
Allen
March 7, 2022, 1:57am
2
More detailed output from the simple-register-coalescing pass:
– 1st version (this log ends in the rematerialized output, which the first post labels "2nd version")
********** SIMPLE REGISTER COALESCING **********
********** Function: foo
********** JOINING INTERVALS ***********
entry:
48B $s0 = COPY %1:fpr32
Considering merging %1 with $s0
Can only merge into reserved registers.
32B %1:fpr32 = COPY %0.ssub:fpr64
Considering merging to FPR64 with %1 in %0:ssub
RHS = %1 [32r,48r:0) 0@32r weight:0.000000e+00
LHS = %0 [16r,32r:0) 0@16r weight:0.000000e+00
merge %1:0@32r into %0:0@16r --> @16r
erased: 32r %1:fpr32 = COPY %0.ssub:fpr64
AllocationOrder(FPR64) = [ $d0 $d1 $d2 $d3 $d4 $d5 $d6 $d7 $d16 $d17 $d18 $d19 $d20 $d21 $d22 $d23 $d24 $d25 $d26 $d27 $d28 $d29 $d30 $d31 $d8 $d9 $d10 $d11 $d12 $d13 $d14 $d15 ]
updated: 48B $s0 = COPY %0.ssub:fpr64
Success: %1:ssub -> %0
Result = %0 [16r,48r:0) 0@16r weight:0.000000e+00
48B $s0 = COPY %0.ssub:fpr64
Considering merging %0 with $d0
Can only merge into reserved registers.
Remat: dead $d0 = MOVIv2i32 79, 24, implicit-def $s0
Shrink: %0 [16r,48r:0) 0@16r weight:0.000000e+00
All defs dead: 16r dead %0:fpr64 = MOVIv2i32 79, 24
Shrunk: %0 [16r,16d:0) 0@16r weight:0.000000e+00
Deleting dead def 16r dead %0:fpr64 = MOVIv2i32 79, 24
Trying to inflate 0 regs.
********** INTERVALS **********
RegMasks:
********** MACHINEINSTRS **********
# Machine code for function foo: NoPHIs, TracksLiveness, TiedOpsRewritten
0B bb.0.entry:
48B dead $d0 = MOVIv2i32 79, 24, implicit-def $s0
64B RET_ReallyLR implicit killed $s0
– 2nd version (this log keeps the COPY, which the first post labels "1st version")
********** SIMPLE REGISTER COALESCING **********
********** Function: foo
********** JOINING INTERVALS ***********
entry:
48B $s0 = COPY %1:fpr32
Considering merging %1 with $s0
Can only merge into reserved registers.
32B %1:fpr32 = COPY %0.ssub:fpr64
Considering merging to FPR64 with %1 in %0:ssub
RHS = %1 [32r,48r:0) 0@32r weight:0.000000e+00
LHS = %0 [16r,32r:0) 0@16r weight:0.000000e+00
merge %1:0@32r into %0:0@16r --> @16r
erased: 32r %1:fpr32 = COPY %0.ssub:fpr64
AllocationOrder(FPR64) = [ $d0 $d1 $d2 $d3 $d4 $d5 $d6 $d7 $d16 $d17 $d18 $d19 $d20 $d21 $d22 $d23 $d24 $d25 $d26 $d27 $d28 $d29 $d30 $d31 $d8 $d9 $d10 $d11 $d12 $d13 $d14 $d15 ]
updated: 48B $s0 = COPY %0.ssub:fpr64
Success: %1:ssub -> %0
Result = %0 [16r,48r:0) 0@16r weight:0.000000e+00
48B $s0 = COPY %0.ssub:fpr64
Considering merging %0 with $d0
Can only merge into reserved registers.
Trying to inflate 0 regs.
********** INTERVALS **********
%0 [16r,48r:0) 0@16r weight:0.000000e+00
RegMasks:
********** MACHINEINSTRS **********
# Machine code for function foo: NoPHIs, TracksLiveness, TiedOpsRewritten
0B bb.0.entry:
16B %0:fpr64 = MOVIv2i32 79, 24
48B $s0 = COPY %0.ssub:fpr64
64B RET_ReallyLR implicit killed $s0
# End machine code for function foo.
Can you explain the context? In your particular example, the two versions should produce identical code after register allocation.
Allen
March 25, 2022, 6:41am
4
Thanks for your attention.
Yes, they’ll produce identical code in the end, since this case is very small (it has only one instruction), but I’m curious which one is a better fit for register allocation.
The 1st version seems to use 2 registers, %0 and $s0 (also, I don’t know why %0 is not allocated a physical register), and the 2nd version seems to use 2 registers, $s0 and $d0.
I think there are a couple of presentations on YouTube about LLVM register allocation, for more general context. See also "The LLVM Target-Independent Code Generator" in the LLVM 16.0.0git documentation.
%0 is a virtual register; this is before register allocation, so the location hasn’t been decided.
$s0 and $d0 are different names for the same register.
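To make the aliasing point concrete, here is a minimal Python sketch. The `REG_VIEWS` table and both helpers are hypothetical names made up for this illustration, not LLVM's API. It only assumes the AArch64 convention that $s0 is the low 32 bits of $d0.

```python
# Toy model of AArch64 FP/SIMD register aliasing (hypothetical, not LLVM's API).
# Each name maps to (underlying vector register, width in bits). Every view
# starts at bit 0, so two views of the same underlying register always overlap.
REG_VIEWS = {
    "$s0": ("v0", 32),
    "$d0": ("v0", 64),
    "$s1": ("v1", 32),
    "$d1": ("v1", 64),
}


def overlaps(a: str, b: str) -> bool:
    """Return True if the two register names share storage."""
    return REG_VIEWS[a][0] == REG_VIEWS[b][0]


def is_subregister(small: str, big: str) -> bool:
    """Return True if `small` is a strictly narrower view of the storage behind `big`."""
    unit_small, width_small = REG_VIEWS[small]
    unit_big, width_big = REG_VIEWS[big]
    return unit_small == unit_big and width_small < width_big
```

In this model a write to $d0 also writes $s0, which is consistent with the `implicit-def $s0` on the rematerialized MOVIv2i32 in the dump: the instruction only counts as one physical register, not two.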
Basically, register coalescing is a pre-pass to assist register allocation: it removes dimensions from the problem of register allocation so the allocator is less likely to make bad decisions. But we don’t usually like to coalesce virtual registers with physical registers. If we need that specific register for something else, the result might be impossible to allocate, or cause extra spilling.
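A rough Python sketch of that idea, for illustration only. It is not how LLVM's RegisterCoalescer is implemented: the tuple encoding and the `coalesce` helper are made up for this example, sub-register indices such as `.ssub` are ignored, and it assumes each virtual register is defined once so that merging never creates an interference.

```python
# Toy copy coalescer (illustrative sketch only, not LLVM's RegisterCoalescer).
# Instructions are (dest, opcode, sources). Virtual registers start with "%",
# physical registers with "$". Immediates are omitted for brevity.

def is_virtual(reg: str) -> bool:
    return reg.startswith("%")


def coalesce(instrs):
    """Erase virtual-to-virtual COPYs by renaming the destination to the source.

    COPYs that involve a physical register are kept, mirroring the reluctance
    to tie a virtual register to one fixed physical register.
    """
    rename = {}  # coalesced virtual register -> register that replaces it
    out = []
    for dest, opcode, srcs in instrs:
        # Rewrite uses of registers that were merged away earlier.
        srcs = tuple(rename.get(s, s) for s in srcs)
        if opcode == "COPY" and is_virtual(dest) and is_virtual(srcs[0]):
            rename[dest] = srcs[0]  # merge dest into the source register
            continue                # the COPY itself disappears
        out.append((dest, opcode, srcs))
    return out


program = [
    ("%0", "MOVIv2i32", ()),
    ("%1", "COPY", ("%0",)),
    ("$s0", "COPY", ("%1",)),
    (None, "RET", ("$s0",)),
]
result = coalesce(program)
```

On the function from this thread, the sketch removes the `%1 = COPY %0` and leaves the copy into $s0 in place, which is the shape of the dump that still contains `$s0 = COPY %0.ssub`. The allocator can later assign %0 to $d0 so that the remaining COPY becomes a no-op, without the coalescer having forced that choice.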