InlineSpiller.cpp bug?

Hi,

I have encountered a test case where InlineSpiller generates bad code. A register is reloaded but never spilled, and I suspect a bug in InlineSpiller.

A register is live over a loop that also have two inner loops. It is not used or defined over the inner loops. It is split into two sibling registers, where one covers just the inner loops interval, which is then spilled.

In spill(), analyzeSiblingValues() is called, which calls traceSiblingValue(). It traces in several iterations strangely back to the same register (inside a loop), finds it marked to be spilled, and the spill is cancelled:

Inline spilling %vreg86 [1396r,2276r:0) 0@1396r

From original %vreg76

Tracing value %vreg86:0@1396r

%vreg86:0@1396r: copy of %vreg87:6@1168r kill=1

%vreg87:6@1168r: copy of %vreg87:5@1120B kill=1

%vreg87:5@1120B: split phi value, checking 1 phi-defs, and 2 non-phi/orig defs

%vreg87:7@2276r: copy of %vreg86:0@1396r kill=1

traced to: spill %vreg86:0@1396r all-reloads kill deps[ 7@2276r ]

Merged spilled regs: SS#1 [1396r,2276r:0) 0@x

I am guessing that traceSiblingValue() should have stopped at a PHI ValNo by recognizing it as ‘original’, meaning it was not inserted by splitting. It does not although it is clear that this PHI ValNo is part of OrigLI.

This is how it looked, roughly:

Original LiveInterval:

5 0 — inner loops — // last valno in interval

PHI COPY of valno 5

After splitting:

5 6 7 8

PHI COPY of valno 5 COPY of valno 0 PHI

0

COPY of valno 6

/\

OrigVNI

Tracing sibling values, valno 6 is the original valno:

Valno 0 is a copy from 6.

Valno 6 is a copy from 5.

Valno 5 is a phi. Is it OrigVNI? NO! ‘Therefore it is not an original phi.’ WRONG! The search continues past it, assuming that it is a newly inserted PHI, done during splitting.

My conlusion is that either the line

if (VNI->def == OrigVNI->def)

is wrong, because it doesn’t really check that VNI was not a phi in OrigLI, because it assumes that if VNI is a phi which was part of the original LI, then OrigVNI must be that PHI. This is not the case here.

The algorithm has iterated past OrigVNI and VNI is at this point a phi that was part of the original LI, which is the copy source of OrigVNI.

Or, it is assumed that splitting is always done at PHI ValNos somehow, and not as in this case, by a COPY of a PHI ValNo.

I would very much appreciate assistance in resolving this problem. As explained above, it is not clear to me why this error occurs, or what is the appropriate fix. I can provide more details if needed for some reason.

Best regards,

Jonas Paulsson

[+Lang]

Hi Jonas,

Could you share your test case and/or file a PR?

I’d like to see what is going on dynamically to help you.

Thanks,
-Quentin

Hi Quentin,

I have tried to find a test case for an official target, but failed. It seems to be a rare case.

To do it, I added the ‘else’ clause in the following:

if (VNI->def == OrigVNI->def) {

DEBUG(dbgs() << “orig phi value\n”);

SVI->second.DefByOrigPHI = true;

SVI->second.AllDefsAreReloads = false;

propagateSiblingValue(SVI);

continue;

}

// check if the valno is actually an orig PHI, but is not OrigVNI

else {

LiveInterval &OrigLI = LIS.getInterval(Original);

VNInfo *OrigVNI_curr = OrigLI.getVNInfoAt(VNI->def);

if (OrigVNI_curr->def == VNI->def)

assert(0 && “OrigLI contained VNI which was a PHI, but not OrigVNI!”);

}

, but the assert never triggered anywhere else than in my original case.

When I did the following, at least my test case worked correctly:

LiveInterval &OrigLI = LIS.getInterval(Original);

VNInfo *OrigVNI_curr = OrigLI.getVNInfoAt(VNI->def);

if (OrigVNI_curr->def == VNI->def) {

DEBUG(dbgs() << “orig phi value\n”);

SVI->second.DefByOrigPHI = true;

SVI->second.AllDefsAreReloads = false;

propagateSiblingValue(SVI);

continue;

}

The question is: is this sound and safe?

It seems logical to me, but I would really appreciate some expert advice,

Jonas

Hi Jonas,

Thanks for your patience.

I have looked into the problem you reported and although the fix you proposed seem correct, I am not sure yet this is the way to go. I would need the debug output of the regalloc to help you further, but here are my initial findings.

traceSiblingValue looked into the problematic phi because the previous iteration said that %vreg87:6 and %vreg87:5 are sibling.
Since they are sibling it means they share the same original vreg.
The problem is that if vreg87:5, i.e., the original phi, is not the original value then it must have been inserted by splitting.
Based on what you said this is not the case, so we end up in this weird situation.

If you cannot share the debug output, you could check the following for me:
1. how vreg76 gets split?
2. what is the second non-phi/defs for vreg87:5?
3. if vreg87:5 is not inserted by splitting, how does it gets here?

Thanks,
-Quentin

Hi Quentin,

I have rerun the test case on a recent commit, so the numbers have changed. There are also now a few more basic blocks very small basic blocks in the function, and therefore there are some slight differences. I tried to go back to earlier commits, without success for some reason… This is however very similar, except that there becomes two COPYs back to sibling value after the loop. My apologies [vregs 76->111, 87->122].

The interval for %vreg111 first covers nearly the entire function. Then it gets split into two intervals, where one covers the inner loops, which makes sense.

selectOrSplit %vreg111 [68r,400B:1)[400B,688r:6)[688r,752B:4)[752B,1264r:6)[1264r,1312r:3)[1312r,1472B:2)[1472B,1520r:5)[1520r,3488B:0) 0@1520r 1@68r 2@1312r 3@1264r 4@688r 5@1472B-phi 6@400B-phi w=3.181050e-02

queuing new interval: %vreg121 [1764r,2936r:0)[2960B,2980r:0) 0@1764r

queuing new interval: %vreg122 [68r,400B:0)[400B,688r:1)[688r,752B:2)[752B,1264r:1)[1264r,1312r:3)[1312r,1472B:4)[1472B,1520r:5)[1520r,1764r:6)[2936r,2960B:7)[2980r,2988B:8)[2988B,3024B:9)[3024B,3488B:10) 0@68r 1@400B-phi 2@688r 3@1264r 4@1312r 5@1472B-phi 6@1520r 7@2936r 8@2980r 9@2988B-phi 10@3024B-phi

Inline spilling %vreg121 [1764r,2948r:0)[2960B,2968r:0) 0@1764r

From original %vreg111

Tracing value %vreg121:0@1764r

%vreg121:0@1764r: copy of %vreg122:6@1520r kill=1

%vreg122:6@1520r: copy of %vreg122:5@1472B kill=1

%vreg122:5@1472B: split phi value, checking 2 phi-defs, and 3 non-phi/orig defs // WRONG: the phi value was part of %vreg111 interval, see above.

The non-phi/defs to check are the sibling COPYs after the inner loops from %vreg121 to %vreg122 (@2948, @2968), and the identity COPY from valno 5 to 6 @1520 for %vreg122.

Vreg122:5 @1472 is the beginning of a basic block. It was there all along, first in %vreg111, and then in %vreg122. %vreg111 gets split two blocks further down @1764.

Outer loop

{

BB2:

BB15:

BB3: //joining BB2 and BB15

@1472 // PHI value that should have been detected as part of OrigLI.

@1520 identity COPY 122 to 122

BB4:

BB5:

@1764 // This is where vreg111 is split and tracing of vreg121 begins. OrigVNI passed is @1520.

// Exits with two edges into two one-block loops, with one exit block each, which are joined in BB12.

// Succs BB6, BB8

BB6:

{} // Single block nested loop

BB7: // pred BB6, succ BB12

@2948 sibling COPY 121 to 122

BB8: // pred BB5

{} // Single block nested loop

BB9: // pred BB8, succ BB12

@2968 sibling COPY 121 to 122

BB12: // joining BB7 and BB9

phi 9

BB13:

// Latch block for outer loop, has edge from BB#4

phi 10

}

Is there anything else I can provide that might you?

/Jonas

Hi Jonas,

Thanks for the additional inputs.

I’ll have a look (trying today but no promises) and I'll get back to you.

Cheers,
Quentin

Hi Jonas,

Thanks for your patience.

After spending some time looking at the additional output you gave me, I agree that your fix is the right one.
I was worried that this problem may arise because we were spilling not real user, but in fact what I thought was the problem is an optimization we could do :).

See my comments inlined for a few nitpicks before you commit.

Thanks again,
-Quentin

Hi Jonas,

Thanks for the additional inputs.

I’ll have a look (trying today but no promises) and I'll get back to you.

Cheers,
Quentin

Hi Quentin,

I have rerun the test case on a recent commit, so the numbers have changed. There are also now a few more basic blocks very small basic blocks in the function, and therefore there are some slight differences. I tried to go back to earlier commits, without success for some reason… This is however very similar, except that there becomes two COPYs back to sibling value after the loop. My apologies [vregs 76->111, 87->122].

1.
The interval for %vreg111 first covers nearly the entire function. Then it gets split into two intervals, where one covers the inner loops, which makes sense.
selectOrSplit %vreg111 [68r,400B:1)[400B,688r:6)[688r,752B:4)[752B,1264r:6)[1264r,1312r:3)[1312r,1472B:2)[1472B,1520r:5)[1520r,3488B:0) 0@1520r 1@68r 2@1312r 3@1264r 4@688r 5@1472B-phi 6@400B-phi w=3.181050e-02

queuing new interval: %vreg121 [1764r,2936r:0)[2960B,2980r:0) 0@1764r
queuing new interval: %vreg122 [68r,400B:0)[400B,688r:1)[688r,752B:2)[752B,1264r:1)[1264r,1312r:3)[1312r,1472B:4)[1472B,1520r:5)[1520r,1764r:6)[2936r,2960B:7)[2980r,2988B:8)[2988B,3024B:9)[3024B,3488B:10) 0@68r 1@400B-phi 2@688r 3@1264r 4@1312r 5@1472B-phi 6@1520r 7@2936r 8@2980r 9@2988B-phi 10@3024B-phi

2.
Inline spilling %vreg121 [1764r,2948r:0)[2960B,2968r:0) 0@1764r
From original %vreg111
Tracing value %vreg121:0@1764r
  %vreg121:0@1764r: copy of %vreg122:6@1520r kill=1
  %vreg122:6@1520r: copy of %vreg122:5@1472B kill=1
  %vreg122:5@1472B: split phi value, checking 2 phi-defs, and 3 non-phi/orig defs // WRONG: the phi value was part of %vreg111 interval, see above.

The non-phi/defs to check are the sibling COPYs after the inner loops from %vreg121 to %vreg122 (@2948, @2968), and the identity COPY from valno 5 to 6 @1520 for %vreg122.

3.
Vreg122:5 @1472 is the beginning of a basic block. It was there all along, first in %vreg111, and then in %vreg122. %vreg111 gets split two blocks further down @1764.

Outer loop
{

BB2:

BB15:

BB3: //joining BB2 and BB15
   @1472 // PHI value that should have been detected as part of OrigLI.
   @1520 identity COPY 122 to 122

BB4:
  …
BB5:
  @1764 // This is where vreg111 is split and tracing of vreg121 begins. OrigVNI passed is @1520.

// Exits with two edges into two one-block loops, with one exit block each, which are joined in BB12.
  // Succs BB6, BB8

BB6:
  {} // Single block nested loop
BB7: // pred BB6, succ BB12
@2948 sibling COPY 121 to 122

BB8: // pred BB5
  {} // Single block nested loop

BB9: // pred BB8, succ BB12
@2968 sibling COPY 121 to 122

BB12: // joining BB7 and BB9
phi 9

BB13:
  // Latch block for outer loop, has edge from BB#4
  phi 10
}

Is there anything else I can provide that might you?

/Jonas

From: Quentin Colombet [mailto:qcolombet@apple.com]
Sent: den 2 december 2014 01:15
To: Jonas Paulsson
Cc: llvmdev@cs.uiuc.edu; stoklund@2pi.dk; Patrik Hägglund H
Subject: Re: [LLVMdev] InlineSpiller.cpp bug?

Hi Jonas,

Thanks for your patience.

I have looked into the problem you reported and although the fix you proposed seem correct, I am not sure yet this is the way to go. I would need the debug output of the regalloc to help you further, but here are my initial findings.

traceSiblingValue looked into the problematic phi because the previous iteration said that %vreg87:6 and %vreg87:5 are sibling.
Since they are sibling it means they share the same original vreg.
The problem is that if vreg87:5, i.e., the original phi, is not the original value then it must have been inserted by splitting.
Based on what you said this is not the case, so we end up in this weird situation.

If you cannot share the debug output, you could check the following for me:
1. how vreg76 gets split?
2. what is the second non-phi/defs for vreg87:5?
3. if vreg87:5 is not inserted by splitting, how does it gets here?

Thanks,
-Quentin

Hi Quentin,

I have tried to find a test case for an official target, but failed. It seems to be a rare case.

To do it, I added the ‘else’ clause in the following:


if (VNI->def == OrigVNI->def) {
   DEBUG(dbgs() << "orig phi value\n");
   SVI->second.DefByOrigPHI = true;
   SVI->second.AllDefsAreReloads = false;
   propagateSiblingValue(SVI);
   continue;
}
// check if the valno is actually an orig PHI, but is not OrigVNI
else {
  LiveInterval &OrigLI = LIS.getInterval(Original);
  VNInfo *OrigVNI_curr = OrigLI.getVNInfoAt(VNI->def);
  if (OrigVNI_curr->def == VNI->def)
    assert(0 && "OrigLI contained VNI which was a PHI, but not OrigVNI!");
}

, but the assert never triggered anywhere else than in my original case.

When I did the following, at least my test case worked correctly:

LiveInterval &OrigLI = LIS.getInterval(Original);

Hoist getInterval outside of the loop.

VNInfo *OrigVNI_curr = OrigLI.getVNInfoAt(VNI->def);

Change the variable name to match LLVM naming convention (i.e., do not use ‘_’).
Add a comment on why we are doing that. E.g., something along the lines: Make sure the VNInfo for Original cover the current definition to properly determine whether or not this phi was added by splitting.

Hi Quentin,

  1. I moved out the LIS.getInterval() call outside the loop, and removed the same call further down in the loop.

  2. To me it looks like there only needs to be the check against the looked-up VNInfo for the current definition, so I simply replaced the check against OrigVNI. Does this sound good?

  3. OrigVNI is still passed as argument, as it is used later in the method. Or could those uses be replaced to use the same VNInfo that was looked up in the if statement regarding original phi-values? In that case the OrigVNI argument could be removed.

I leave that for you to decide.

Any more suggestions before I commit?

diff --git a/lib/CodeGen/InlineSpiller.cpp b/lib/CodeGen/InlineSpiller.cpp

index 6a6e15d…34e796c 100644

— a/lib/CodeGen/InlineSpiller.cpp

+++ b/lib/CodeGen/InlineSpiller.cpp

@@ -508,6 +508,7 @@ MachineInstr *InlineSpiller::traceSiblingValue(unsigned UseReg, VNInfo *UseVNI,

SmallVector<std::pair<unsigned, VNInfo*>, 8> WorkList;

WorkList.push_back(std::make_pair(UseReg, UseVNI));

  • LiveInterval &OrigLI = LIS.getInterval(Original);

do {

unsigned Reg;

VNInfo *VNI;

@@ -521,8 +522,11 @@ MachineInstr *InlineSpiller::traceSiblingValue(unsigned UseReg, VNInfo *UseVNI,

// Trace through PHI-defs created by live range splitting.

if (VNI->isPHIDef()) {

  • // Stop at original PHIs. We don’t know the value at the predecessors.

  • if (VNI->def == OrigVNI->def) {

  • // Stop at original PHIs. We don’t know the value at the

  • // predecessors. Look up the VNInfo for the current definition

  • // in OrigLI, to properly determine whether or not this phi was

  • // added by splitting.

  • if (VNI->def == OrigLI.getVNInfoAt(VNI->def)->def) {

DEBUG(dbgs() << “orig phi value\n”);

SVI->second.DefByOrigPHI = true;

SVI->second.AllDefsAreReloads = false;

@@ -542,7 +546,6 @@ MachineInstr *InlineSpiller::traceSiblingValue(unsigned UseReg, VNInfo *UseVNI,

// Separate all values dominated by OrigVNI into PHIs and non-PHIs.

SmallVector<VNInfo*, 8> PHIs, NonPHIs;

LiveInterval &LI = LIS.getInterval(Reg);

  • LiveInterval &OrigLI = LIS.getInterval(Original);

for (LiveInterval::vni_iterator VI = LI.vni_begin(), VE = LI.vni_end();

/ Jonas

Hi Jonas,

Hi Quentin,

1. I moved out the LIS.getInterval() call outside the loop, and removed the same call further down in the loop.

Thanks.

2. To me it looks like there only needs to be the check against the looked-up VNInfo for the current definition, so I simply replaced the check against OrigVNI. Does this sound good?

Yes.

3. OrigVNI is still passed as argument, as it is used later in the method. Or could those uses be replaced to use the same VNInfo that was looked up in the if statement regarding original phi-values? In that case the OrigVNI argument could be removed.
   I leave that for you to decide.

No, we should leave as it is.

Any more suggestions before I commit?

Go ahead and commit.
One thing you could try if to produce a reduced test case by using empty inline asm with clobbered registers to trick the register pressure. I admit it may complicate, so give if you think you could produce such a test case, ignore this.
Note: This should not block the commit of this fix :).

Thanks,
-Quentin