PBQP & CalcSpillWeights

Hi All,

I finally had a chance to get back to my pbqp trials, now using the 3.0
release. I still hit the same assert : "Attempting to spill already spilled
value."

This is triggered because in RegAllocPBQP::mapPBQPToRegAlloc, a vreg node is
either :
- a physical register (problem.isPRegOption(vreg, alloc)),
- or a spill (problem.isSpillOption(vreg, alloc))

The problem is that pass CalcSpillWeights can 'hint' that it is a poor
idea to spill this specific register with :

CalcSpillWeights.cpp / VirtRegAuxInfo::CalculateWeightAndHint :
  // Mark li as unspillable if all live ranges are tiny.
  if (li.isZeroLength(LIS.getSlotIndexes())) {
    li.markNotSpillable();
    ...

This hint makes the register non spillable at all for the spiller (that's the
assert above), not just a bad-idea-to-spill-but-feasible. The pbqp allocator
does not cope with this distinction and allways attempts to spill it.

I would need some guidance on how to modify the pbqp to handle this case
properly.

Best regards,

Hi Arnaud,

LiveInterval::markNotSpillable() sets the live interval's spill weight
to infinity. For well-formed PBQP graphs (i.e. ones that have some
finite-cost solution), PBQP should never chose to spill such an
interval. The two possibilities for this crash are that the input
graph has no finite-cost solution, or that you've exposed a bug in the
PBQP solver.

From memory your target is not public, so I won't be able to reproduce

the crash myself. Is that correct?

If that's the case, I could add functionality to dump the PBQP graphs
during allocation. I think they should give me enough information to
debug the issue. Would you be able to share the PBQP graphs?

Cheers,
Lang.

Hi Lang,

From memory your target is not public, so I won't be able to reproduce
the crash myself. Is that correct?

Correct.

If that's the case, I could add functionality to dump the PBQP graphs
during allocation. I think they should give me enough information to
debug the issue. Would you be able to share the PBQP graphs?

I can share the pbqp graph if you send me a patch with the added functionality
you want. On my side, I already checked the node cost for this register, which
is correctly set to inf for spill, as well as the edge costs, and they look
ok.

I also attached my target's pbqp related file in case you want to double check
what I did. This is llvm-3.0 based. It comprises 2 passes : the
FemtoPBQPBuilder, plus a FemtoPairablepass, which undo some of the coalescer's
work. The insertRegCopy may specifically need a double check, as I am not 100%
sure to have updated correctly the LiveInterval information.

In terms of registers, the Femto target is simplistic : a single register
class GR16, for data and pointers, all i16. It has 16 registers, R0 to R15;
R15 is used as stack pointer, and R14 potentialy as framepointer. A pair is
constituted from a register + its successor, i.e. (R0, R1), (R1,R2), (R2, R3),
... are valid pairs. This is an instruction encoding constraint, as we only
have 16bits wide instructions. Pairs involving R15 are never allowed, those
with R14 may be allowed, depending on each function.

Most instructions have no pairing constraints, and do not appear here, being
handled by the PBQPBuilder default base class. A few instructions have 1 or 2
pairs of registers, and are all handled by the 2 passes above.

Thanks for your help,

FemtoRegisterAllocator.cpp (17.2 KB)

Hi Arnaud,

Thanks for attaching those files. I'll take a look at them.

Commit r153483 adds an option to the PBQP allocator,
"-pbqp-dump-graphs", to dump the PBQP graph for each round of each
function in a compilation unit. The generated files are named "<module

.<function>.<round>.pbqpgraph", and contain a simple text

representation of the PBQP graph. The PBQP allocator has been stable
for some time - I think this patch should apply cleanly to the 3.0
version.

Can you run the failing test case through the allocator with this
patch applied, passing the "-regalloc=pbqp -pbqp-dump-graphs
-debug-only=regalloc" options. I'll need to take a look at the last
graph dumped before the assertion is triggered.

Cheers,
Lang.

Hi Lang,

I have reduced the testcase as much as possible. The log of the run and the
dumped graphes are attached.

Cheers,

invalid-spill.tar.gz (6.82 KB)

Hi Arnaud,

Apologies for the delayed reply.

Thank you for the excellent test case - it exposed a subtle bug in the colorability heuristic. This has been fixed in r153958.

In case you are curious, the bug was as follows: the PBQP solver applies applies a simplification step to each matrix. When all elements of a matrix row or column are equal, the value for those elements is “pushed out” to the corresponding element of the cost vector, and the row/column is zeroed out. This is an attempt to turn the matrices into zero matrices which can be eliminated from the problem. This simplification step runs even for rows/columns that are all infinite. When an all infinite row/column is encountered, it will be zeroed out, and the infinite cost attached to some register in the cost vector. Unfortunately the infinite-cost elements of vectors were not being taken into consideration when determining colorability, only the infinities in the matrices were. In your test case this led to a node being falsely assumed to be colorable when it was not, and pushed to the coloring stack too early. By the time it was encountered in the coloring phase it had already been over-constrained, and no finite cost solutions were available.

In future I hope to make the simplification step strip infinite rows/columns from the problem altogether (along with their corresponding vector elements). This would speed the solver up further by avoiding consideration of such impossible options.

With the fix from r153958 applied the solver now finds a zero-cost solution for the test case you sent me. This should translate to a valid register allocation for your test case. Please try it out and let me know if it works for you.

Cheers,
Lang.

Hi Lang,

Thanks a lot for taking time to look into this. I will test the fix soon and
let you know the results.

Cheers,

Hi Lang,

The assert is not triggered any longer on my testcases :slight_smile:

I however still get my wrong allocation in some non trivial cases : the
pairing constraint is not fulfilled.

I have tried to modify the 'ensure pairable' pass (the pass undoing some
of the coalescer's work) to always insert register copies for
instructions with the pairable constraint, instead of being smart and
inserting the copy only when needed. This had no visible effect.
Although I am deriving from PBQPBuilder, the PBQP seems to be coalescing
some register copy, without taking into account that the source or dest
reg may have different constraints. In which part of pbqp would this be
happening ?

Cheers,

Hi Arnaud,

I’m glad to hear that your test case is working.

I however still get my wrong allocation in some non trivial cases : the
pairing constraint is not fulfilled.

I have tried to modify the ‘ensure pairable’ pass (the pass undoing some
of the coalescer’s work) to always insert register copies for
instructions with the pairable constraint, instead of being smart and
inserting the copy only when needed. This had no visible effect.
Although I am deriving from PBQPBuilder, the PBQP seems to be coalescing
some register copy, without taking into account that the source or dest
reg may have different constraints. In which part of pbqp would this be
happening ?

If you’re deriving PBQPBuilder (and not PBQPBuilderWithCoalescing) then PBQP won’t attempt any coalescing, though obviously it can “get lucky” and assign registers that are connected by a copy the same register.

Those copies shouldn’t be necessary - as long as you don’t have a circular chain of paired variables (a,b,c,…,a) with all-infinite spill costs you should be safe.

Can you send me a PBQP graph for your failing test case? I’ll see if I can figure out where the solver is going wrong.

  • Lang.

Hi Lang,

Thanks for proposing to look into this testcase.

Regarding the copies, this was done to be on the safe side. I would normally only insert copies where needed in order to express the pairable constraint. For example, in a case like ‘instr %vreg1, %vreg2, %vreg1’, I need to insert a %vreg1 copy in order to express the pairing constraint between the first and third arguments of instr.

Cheers,

test4.tar.gz (18.5 KB)