Spilling predicate registers

s/llvm-commits/llvmdev/

As Jakob pointed out to me, the core problem is that the current
register scavenger implementation will only give you one register; for
the PowerPC case, and it looks like for your case as well, we might
really need two registers. In the short term, a reasonable solution
might be to modify the register scavenger to enable it to return
multiple registers. This seems to be related to the number of emergency
spill slots the scavenger reserves (because it "cannot fail"), so there
might be some downside to this. We would have to add some
target-dependent callback to tell the register scavenger how many slots
to reserve (the default would be 1). Do you think that this is worth
doing?
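
To make the callback idea concrete, here is a minimal sketch. It assumes a hypothetical hook named getNumScavengingSlots() on a stand-in frame-info class; neither the class nor the hook is an existing LLVM interface, and frame indexes are modeled as plain integers.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for a target frame-info class (not an LLVM type).
struct TargetFrameInfoSketch {
  virtual ~TargetFrameInfoSketch() = default;
  // Hypothetical hook: number of emergency spill slots the scavenger
  // should reserve. The default of 1 matches today's single-register case.
  virtual unsigned getNumScavengingSlots() const { return 1; }
};

// A target that may need two scavenged registers at the same time.
struct TwoSlotTarget : TargetFrameInfoSketch {
  unsigned getNumScavengingSlots() const override { return 2; }
};

// Stand-in for frame layout: hand out one frame index per emergency slot.
std::vector<int> reserveScavengingSlots(const TargetFrameInfoSketch &TFI,
                                        int &NextFrameIndex) {
  assert(NextFrameIndex >= 0 && "frame indexes are non-negative in this sketch");
  std::vector<int> Slots;
  for (unsigned I = 0, E = TFI.getNumScavengingSlots(); I != E; ++I)
    Slots.push_back(NextFrameIndex++);
  return Slots;
}
```

The downside mentioned above shows up directly: every extra slot is stack space reserved in each function that might need scavenging, whether or not it is ever used.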

I started modifying the backend to support 2 scavenged registers, only to
realize halfway through that in our case we don't need two.

On Hexagon we can get around the requirement for the second register (the
first for storing the predicate, the second for a large offset that does not
fit into the spill instruction) by using a constant-extended value. A
constant-extended value is encoded in the instruction stream, taking
up one instruction slot.

{
r10 = memd(fp+##LARGEOFFSET)
}

For us this is the better solution as we want as little spill code as
possible in hot regions.
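
The trade-off can be sketched as a range check: if the frame offset fits the instruction's immediate field, the access takes one slot; otherwise a constant extender takes a second slot but no scratch register. The helper names below are hypothetical, and the field width and scale are left as parameters because the exact encoding is an assumption here, not something stated in this thread.

```cpp
#include <cassert>
#include <cstdint>

// True if Offset is a multiple of Scale and Offset / Scale fits in a
// signed immediate field that is Bits wide.
bool fitsScaledSignedImm(int64_t Offset, unsigned Bits, int64_t Scale) {
  assert(Bits > 0 && Bits < 63 && Scale > 0 && "unsupported field shape");
  if (Offset % Scale != 0)
    return false;
  int64_t V = Offset / Scale;
  int64_t Min = -(int64_t(1) << (Bits - 1));
  int64_t Max = (int64_t(1) << (Bits - 1)) - 1;
  return V >= Min && V <= Max;
}

// Instruction slots used by one frame access: one for the memory
// instruction itself, plus one for the constant extender if the offset
// does not fit the immediate field.
unsigned slotsForFrameAccess(int64_t Offset, unsigned Bits, int64_t Scale) {
  return fitsScaledSignedImm(Offset, Bits, Scale) ? 1u : 2u;
}
```

For example, with an assumed 11-bit field scaled by 8, slotsForFrameAccess(8184, 11, 8) is 1 and slotsForFrameAccess(100000, 11, 8) is 2.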

Neat ISA!

To support 2 scavenged registers, as you already said, we would need:
* a target hook to indicate the maximum number of simultaneously live
scavenged registers needed
* support in RegScavenger for more than one register: instead of state
for one register (ScavengedReg, ScavengedRC, ScavengedRestore) we
need an array of such states that are handled correctly
* PEI::scavengeFrameVirtualRegs must be adapted to handle more than one
register
* PEI::calculateFrameObjectOffsets needs to handle the reserved scavenger
spill slot frame indexes
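
As a rough illustration of the "array of states" item, here is a toy model. All names are hypothetical and this is not RegScavenger's real interface; it only shows one emergency slot per simultaneously scavenged register, with the slots claimed and released independently.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One entry per emergency spill slot (toy model, not LLVM's data layout).
struct ScavengedState {
  int FrameIndex; // emergency spill slot backing this entry
  unsigned Reg;   // physical register currently scavenged, 0 if free
};

class MultiScavengerSketch {
  std::vector<ScavengedState> Slots;

public:
  explicit MultiScavengerSketch(const std::vector<int> &FrameIndexes) {
    for (int FI : FrameIndexes)
      Slots.push_back(ScavengedState{FI, 0});
  }

  // Claim a free emergency slot for Reg. Returns its frame index, or -1
  // if every slot is already backing a live scavenged register.
  int scavenge(unsigned Reg) {
    assert(Reg != 0 && "register 0 means 'no register' in this sketch");
    for (ScavengedState &S : Slots)
      if (S.Reg == 0) {
        S.Reg = Reg;
        return S.FrameIndex;
      }
    return -1;
  }

  // Release the slot once the scavenged register has been restored.
  void restore(unsigned Reg) {
    for (ScavengedState &S : Slots)
      if (S.Reg == Reg)
        S.Reg = 0;
  }

  // Number of scavenged registers that are currently live.
  std::size_t numLive() const {
    std::size_t N = 0;
    for (const ScavengedState &S : Slots)
      N += (S.Reg != 0);
    return N;
  }
};
```

The -1 return is where the "cannot fail" guarantee would be broken, which is why the number of slots has to cover the target's worst case.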

I am not sure extending the scavenger is the right way to go about this.

There are two different situations where we might need extra registers to spill something:

1. When spilling a weird register class like predicate registers, we already know during register allocation that we will need a scratch GPR to assist with the spill.

2. When spilling to a stack slot that may be out of reach of the offset encoding.

The scavenger is really meant to handle the second case, although there is nothing wrong with using it for the first case as well.

However, in the first case we know immediately that a scratch register is necessary, so why not just ask the register allocator for one? Basically, I think storeRegToStackSlot should be allowed to call MRI->createVirtualRegister() when it needs a scratch register.
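
A toy sketch of that idea, with the caveat that everything here is a stand-in: ToyMRI only borrows the name createVirtualRegister, instructions are plain strings, and the "copy" and "store" pseudo-ops are invented for illustration.

```cpp
#include <cassert>
#include <string>
#include <vector>

enum class RegClass { GPR, Pred };

// Toy stand-in for MachineRegisterInfo: virtual registers are just
// indexes into a vector of register classes.
struct ToyMRI {
  std::vector<RegClass> VRegClasses;
  unsigned createVirtualRegister(RegClass RC) {
    VRegClasses.push_back(RC);
    return static_cast<unsigned>(VRegClasses.size() - 1);
  }
};

// Spill SrcVReg to a stack slot. A predicate register cannot be stored
// directly in this model, so it is first copied into a fresh GPR virtual
// register that the register allocator would have to assign later.
std::vector<std::string> storeRegToStackSlot(ToyMRI &MRI, unsigned SrcVReg,
                                             int FrameIndex) {
  assert(SrcVReg < MRI.VRegClasses.size() && "unknown virtual register");
  std::vector<std::string> Code;
  const std::string Src = "%v" + std::to_string(SrcVReg);
  const std::string Slot = "fi#" + std::to_string(FrameIndex);
  if (MRI.VRegClasses[SrcVReg] == RegClass::Pred) {
    unsigned Tmp = MRI.createVirtualRegister(RegClass::GPR);
    const std::string Scratch = "%v" + std::to_string(Tmp);
    Code.push_back(Scratch + " = copy " + Src);
    Code.push_back("store " + Scratch + ", " + Slot);
  } else {
    Code.push_back("store " + Src + ", " + Slot);
  }
  return Code;
}
```

The new virtual register is the part the allocators do not expect today: it appears in the middle of allocation and still needs a live range and an assignment.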

That doesn't work today because the register allocators don't expect it. I don't see any fundamental problems preventing it, though.

We would need to make sure that all 4 register allocators can handle it. RAFast is the most difficult to fix.

/jakob

That was an alternative I was looking at (extending the scavenger seemed
the quicker fix). The greedy allocator would just add the new live range
to the queue? I have not looked at the implications for fast.

We would definitely benefit from support like this in the allocator (which
can make better decisions than the scavenger, which may end up spilling
something).

- Arnold

More or less. Handling this in LiveRangeEdit would give you basic, greedy, and pbqp in one go. It would need to:

- Notice that a new virtual register has been created.
- Compute the live range from scratch.
- Add the live range to the queue.

These allocators already understand that spilling can create new live ranges, and LRE keeps track of them. This could actually simplify the spiller as well, since it wouldn't have to manually compute live ranges.
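
The three steps can be sketched on a toy model. This assumes straight-line code and a single [Start, End] interval per virtual register, which is far simpler than real live ranges over a CFG; all type and function names are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <deque>
#include <vector>

// Toy instruction: at most one defined vreg (-1 if none) and some uses.
struct ToyInstr {
  int Def;
  std::vector<int> Uses;
};

// Toy live range: first and last instruction index touching the vreg.
struct LiveRange {
  int VReg, Start, End;
};

// Step 2: compute the live range from scratch by scanning the code.
LiveRange computeLiveRange(const std::vector<ToyInstr> &Code, int VReg) {
  LiveRange LR{VReg, -1, -1};
  for (int I = 0; I < static_cast<int>(Code.size()); ++I) {
    const ToyInstr &MI = Code[I];
    bool Touches = MI.Def == VReg ||
                   std::find(MI.Uses.begin(), MI.Uses.end(), VReg) != MI.Uses.end();
    if (!Touches)
      continue;
    if (LR.Start < 0)
      LR.Start = I;
    LR.End = I;
  }
  return LR;
}

// Steps 1 and 3: any vreg numbered at or above NumVRegsBefore was created
// while spilling, so compute its range and add it to the allocator queue.
void enqueueNewVRegs(const std::vector<ToyInstr> &Code, int NumVRegsBefore,
                     int NumVRegsAfter, std::deque<LiveRange> &Queue) {
  assert(NumVRegsBefore <= NumVRegsAfter && "vregs are never deleted here");
  for (int V = NumVRegsBefore; V < NumVRegsAfter; ++V)
    Queue.push_back(computeLiveRange(Code, V));
}
```

Comparing the vreg count before and after the spiller runs is just one way to "notice" new registers; a callback from register creation would do as well.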

LiveRangeCalc::calculate() is meant to compute a live range from scratch; the function just needs to be implemented.

I have not looked at the implications for fast.

Thinking about it, I don't think this will be easy, and it is perhaps not worthwhile.

When compiling at -O0, you can just reserve an extra register. There won't be significant register pressure anyway.

/jakob