Spilling multi-word virtual registers

Does anybody have any tips for generating spills/reloads for large
non-vector registers?

I'm working on a back end for a DSP architecture that has accumulator
registers that are too large to be spilled or reloaded with a single
instruction. All of their bits can be accessed in word-size chunks via
three sub-registers (low, high, and ext). So loading or storing one
requires three instructions: one for each sub-register.

For quite a while now, my implementation of loadRegFromStackSlot() and
storeRegToStackSlot() has assumed that it would only receive physical
registers, which makes it fairly straightforward. They generate three
memory instructions, calling TargetRegisterInfo::getSubReg() to get the
sub-register operand for each of them.

So it was a rude awakening when a test program resulted in a _virtual_
register being passed into loadRegFromStackSlot() (via
LiveIntervals::tryFoldMemoryOperand() if it matters). Obviously I need
to make some changes. But what?

A couple of options immediately come to mind:

1. Generate INSERT_SUBREG/EXTRACT_SUBREG machine instructions in
loadRegFromStackSlot() and storeRegToStackSlot() to handle virtual
registers. Will this work? Is it safe to create additional virtual
registers from these methods?

2. Emit a single pseudo-instruction for large loads and stores and use a
custom pass to expand it to multiple instructions after register
allocation.

Any other suggestions?

-Ken

This is quite simple to handle. A register MachineOperand has a subreg field for this purpose. It is used to pick out subregisters of a virtual register.

For a physical register:

  MO.setReg(TRI.getSubReg(Reg, SubIdx));

For a virtual register:

  MO.setReg(Reg);
  MO.setSubReg(SubIdx);

If you are using BuildMI, the subreg is passed as the third argument to addReg().

The register allocator (rewriter to be exact) will clear the subreg field when substituting the allocated physical register.

Note that a physical register operand must not carry a subreg index. Its subreg field must be 0.
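
To make the physical/virtual distinction above concrete, here is a minimal, self-contained sketch. It deliberately does not use the LLVM headers: Operand, SubRegTable, FirstVirtualReg, makeSubRegOperand() and rewrite() are hypothetical stand-ins invented for illustration, and the register numbers are arbitrary. It only models the rule described above, under the assumption that the rewriter substitutes the matching sub-register of the allocated physical register.

```cpp
#include <cassert>
#include <map>
#include <utility>

// Toy stand-ins for illustration only; these are NOT the LLVM classes.
enum SubIdx : unsigned { NoSubReg = 0, SubLo = 1, SubHi = 2, SubExt = 3 };

// Assumption: register numbers at or above this value are virtual.
constexpr unsigned FirstVirtualReg = 1000;

inline bool isVirtual(unsigned Reg) { return Reg >= FirstVirtualReg; }

// A register operand: a register number plus a sub-register index field.
struct Operand {
  unsigned Reg = 0;
  unsigned SubReg = NoSubReg;
};

// Hypothetical table mapping (physical super-register, index) to the
// physical sub-register. Returns 0 when no such sub-register exists.
struct SubRegTable {
  std::map<std::pair<unsigned, unsigned>, unsigned> Map;

  void add(unsigned Reg, unsigned Idx, unsigned Sub) {
    Map[std::make_pair(Reg, Idx)] = Sub;
  }
  unsigned getSubReg(unsigned Reg, unsigned Idx) const {
    auto It = Map.find(std::make_pair(Reg, Idx));
    return It == Map.end() ? 0 : It->second;
  }
};

// Build the operand that names sub-register Idx of Reg.
inline Operand makeSubRegOperand(const SubRegTable &TRI, unsigned Reg,
                                 unsigned Idx) {
  Operand MO;
  if (isVirtual(Reg)) {
    // Virtual register: keep the register and record the index.
    MO.Reg = Reg;
    MO.SubReg = Idx;
  } else {
    // Physical register: resolve the sub-register now; the index stays 0.
    MO.Reg = TRI.getSubReg(Reg, Idx);
    MO.SubReg = NoSubReg;
  }
  return MO;
}

// Model of the rewriter step: replace a virtual register operand with the
// allocated physical register and clear the sub-register index.
inline Operand rewrite(const SubRegTable &TRI, Operand MO, unsigned PhysReg) {
  assert(isVirtual(MO.Reg) && !isVirtual(PhysReg));
  Operand Out;
  Out.Reg = MO.SubReg ? TRI.getSubReg(PhysReg, MO.SubReg) : PhysReg;
  Out.SubReg = NoSubReg;
  return Out;
}
```

In a real back end the same branch would sit in loadRegFromStackSlot() and storeRegToStackSlot(), once for each of the three sub-register operands.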

/jakob

On Tuesday, July 20, 2010 4:04 PM, Jakob Stoklund Olesen wrote:

> Does anybody have any tips for generating spills/reloads for large
> non-vector registers?
> [snip]

> This is quite simple to handle. A register
> MachineOperand has a subreg field for this
> purpose. It is used to pick out subregisters
> of a virtual register.

Thanks, Jakob. That indeed was a simple fix.

> The register allocator (rewriter to be exact)
> will clear the subreg field when substituting
> the allocated physical register.

Speaking of the rewriter, I've had some problems recently where the
rewriter replaces the last of the three load instructions with a COPY
instruction because isLoadFromStackSlot() returns the same frame index
for all three loads. For example,

  load a.l, <fi#n>, 0          load a.l, <fi#n>, 0
  load a.h, <fi#n>, 1   ===>   load a.h, <fi#n>, 1
  load a.e, <fi#n>, 3          move a.e, a.l

I quickly hacked around the problem by returning a frame index only for
the loads of the low sub-register (returning 0 for the rest), but I'm
sure this isn't the best solution. Is there a simple way to avoid the
replacement while still reporting the actual frame index for all of the
load instructions?

-Ken

Yeah, the target hooks are not really prepared for dealing with subregisters, and the rewriter doesn't really expect multiple instructions to be inserted by the hooks.

To be safe, you should probably only report a stack slot load from isLoadFromStackSlot() (that is, return the destination register and set the frame index) when the instruction loads the whole stack slot. That is, the offset is zero, and the stack slot size matches the register size.
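
As a rough illustration of that rule, here is a self-contained sketch. The types LoadInstr and FrameInfo and the function wholeSlotLoadDest() are hypothetical and only mimic the shape of the hook (return the destination register and set the frame index, or return 0); sizes are counted in words purely for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy stand-ins for illustration only; these are NOT the LLVM classes.
struct LoadInstr {
  unsigned DestReg;       // register written by the load (0 is reserved)
  unsigned DestSizeWords; // width of the destination register in words
  int FrameIndex;         // stack slot being read
  unsigned OffsetWords;   // word offset into that slot
};

// Hypothetical frame info: slot sizes in words, indexed by frame index.
struct FrameInfo {
  std::vector<unsigned> SlotSizeWords;
};

// Report a stack slot load only when the instruction reads the whole slot:
// zero offset, and slot size equal to the destination register size.
// Returns the destination register and sets FrameIndex on a match;
// otherwise returns 0 and leaves FrameIndex untouched.
inline unsigned wholeSlotLoadDest(const LoadInstr &MI, const FrameInfo &MFI,
                                  int &FrameIndex) {
  assert(MI.DestReg != 0 && "0 is reserved for 'no match'");
  if (MI.FrameIndex < 0 ||
      static_cast<std::size_t>(MI.FrameIndex) >= MFI.SlotSizeWords.size())
    return 0;
  if (MI.OffsetWords != 0)
    return 0;
  if (MFI.SlotSizeWords[MI.FrameIndex] != MI.DestSizeWords)
    return 0;
  FrameIndex = MI.FrameIndex;
  return MI.DestReg;
}
```

With a three-word accumulator slot and one-word sub-register loads, none of the three loads matches, so the rewriter never sees them as reloads of the slot.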

If you need the rewriter to be able to undo a stack slot load/store, you will have to create pseudo-instructions for accumulator loads and stores. Compare the ARMExpandPseudoInsts pass.
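
For what the pseudo-instruction route might look like, here is a toy expansion step. Everything in it is an assumption made for illustration: the opcode names, the Instr struct, the subReg() numbering and the consecutive word offsets 0, 1, 2 are invented, and none of it is taken from the ARM pass or the LLVM API.

```cpp
#include <cassert>
#include <vector>

// Toy stand-ins for illustration only; these are NOT LLVM opcodes or classes.
enum Opcode { LOAD_WORD, STORE_WORD, LOAD_ACC_PSEUDO, STORE_ACC_PSEUDO };

struct Instr {
  Opcode Op;
  unsigned Reg;         // physical register (expansion runs after allocation)
  int FrameIndex;       // stack slot
  unsigned OffsetWords; // word offset into the slot
};

// Hypothetical numbering: accumulator A has sub-registers A+1 (low),
// A+2 (high) and A+3 (ext).
inline unsigned subReg(unsigned Acc, unsigned Idx) { return Acc + 1 + Idx; }

// Replace each accumulator pseudo with three word-sized memory instructions,
// one per sub-register; all other instructions are copied unchanged.
inline std::vector<Instr> expandPseudos(const std::vector<Instr> &In) {
  std::vector<Instr> Out;
  for (const Instr &I : In) {
    if (I.Op != LOAD_ACC_PSEUDO && I.Op != STORE_ACC_PSEUDO) {
      Out.push_back(I);
      continue;
    }
    assert(I.OffsetWords == 0 && "pseudo addresses the whole slot");
    Opcode WordOp = (I.Op == LOAD_ACC_PSEUDO) ? LOAD_WORD : STORE_WORD;
    for (unsigned Idx = 0; Idx < 3; ++Idx)
      Out.push_back(Instr{WordOp, subReg(I.Reg, Idx), I.FrameIndex, Idx});
  }
  return Out;
}
```

Because the rewriter only ever sees the single pseudo, it can treat it as one whole-slot load or store, and the three real instructions appear only after it has finished.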

We are working towards a design where we don't need these rewriter shenanigans. In fact, the trivial rewriter will be used instead.

This seems to result in dead frame indices being passed to
eliminateFrameIndex(). I'm currently handling these by removing the
load/store instructions in which they appear. I haven't found any errors
in the code that gets generated so far, but I also notice that none of
the other back ends seem to have any special logic to handle dead slots
in eliminateFrameIndex(). Should I be concerned?

-Ken

I am not sure. Why are there instructions referencing dead spill slots?

/jakob