Patch - big stackframes on SPU

Hello all,

Currently the SPU backend does not handle big stack frames (>16*511
bytes) nicely: llc asserts on malformed machine instructions.
(Assertion `MI->getOperand(OpNo).isImm() && "printDFormAddr first
operand is not immediate"')

E.g. the function:
define i32 @foo() nounwind {
entry:
    %retval = alloca i32
    %big_data = alloca [1000 x i32]
    store i32 3840, i32* %retval, align 4
    br label %return

return:
    %retval2 = load i32* %retval
    ret i32 %retval2
}
demonstrates this issue.
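
For readers wondering where the 16*511 figure comes from, here is a minimal sketch. It assumes (to my understanding of the SPU ISA, not taken from the patch) that a D-form address carries a signed 10-bit immediate counted in 16-byte quadwords. The names fitsDForm and encodeDFormImm are hypothetical helpers for illustration only, and treating unaligned offsets as "not directly encodable" is a simplification.

```cpp
#include <cassert>
#include <cstdint>

// Assumption: the D-form immediate is a signed 10-bit quadword count,
// so the byte offset seen by the hardware is (imm10 * 16).
constexpr int64_t kQuadwordBytes = 16;
constexpr int64_t kMinImm10 = -512;
constexpr int64_t kMaxImm10 = 511;

// True if byteOffset can be encoded directly as a D-form immediate.
// Simplification: offsets that are not a multiple of 16 are rejected.
constexpr bool fitsDForm(int64_t byteOffset) {
  return byteOffset % kQuadwordBytes == 0 &&
         byteOffset / kQuadwordBytes >= kMinImm10 &&
         byteOffset / kQuadwordBytes <= kMaxImm10;
}

// Returns the immediate field for an offset that is known to fit.
inline int64_t encodeDFormImm(int64_t byteOffset) {
  assert(fitsDForm(byteOffset) && "offset out of D-form range");
  return byteOffset / kQuadwordBytes;
}
```

Under these assumptions the largest directly reachable positive offset is 16*511 = 8176 bytes; anything beyond that needs some other way of forming the address.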

Attached is a patch to fix this. It
- fixes a few small errors in function prologue and epilogue insertion,
- enables register scavenging for frame index elimination, and
- implements the frame index elimination for big stacks.
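
To illustrate why frame index elimination for big stacks needs a scratch register, here is a toy model. It is a sketch of one common approach (materialize the offset in a scavenged register and use a register-indexed address), not necessarily what the attached patch does. AddrForm, FrameAccess and planFrameAccess are hypothetical names, and the 10-bit quadword immediate range is an assumption.

```cpp
#include <cstdint>

// Hypothetical: the two ways a frame slot might be addressed.
enum class AddrForm {
  DForm,             // base register + small immediate
  IndexedViaScratch  // base register + scratch register holding the offset
};

struct FrameAccess {
  AddrForm form;
  int64_t immQuadwords;  // meaningful only for DForm
  int64_t scratchValue;  // meaningful only for IndexedViaScratch
};

// Assumption: the immediate is a signed 10-bit count of 16-byte quadwords.
constexpr bool fitsImm10Quadwords(int64_t byteOffset) {
  return byteOffset % 16 == 0 && byteOffset / 16 >= -512 &&
         byteOffset / 16 <= 511;
}

// Picks an addressing form for a frame slot at byteOffset from the
// stack pointer. Offsets that do not fit the immediate (too large, or
// not 16-byte aligned in this simplified model) fall back to a scratch
// register, which is where the register scavenger would come in.
inline FrameAccess planFrameAccess(int64_t byteOffset) {
  if (fitsImm10Quadwords(byteOffset))
    return {AddrForm::DForm, byteOffset / 16, 0};
  return {AddrForm::IndexedViaScratch, 0, byteOffset};
}
```

In this model only the second case consumes a register, so small frames never touch the scavenger at all.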

Patch is made against latest svn head.

best regards,
Kalle Raiskila

spu_stackframes.patch (5.21 KB)

> Hello all,
>
> currently the SPU backend does not handle big stack frames (>16*511
> bytes) nicely. llc asserts on malformed machine instructions.
> (Assertion `MI->getOperand(OpNo).isImm() && "printDFormAddr first
> operand is not immediate"')

Sounds fine to me in general. Please write a testcase for this though. Also, this patch causes the CodeGen/CellSPU/call.ll regression test to fail. Please investigate and send an updated patch (with a testcase), thanks!

-Chris

Chris Lattner wrote:

>> currently the SPU backend does not handle big stack frames (>16*511
>> bytes) nicely. llc asserts on malformed machine instructions.
>> (Assertion `MI->getOperand(OpNo).isImm() && "printDFormAddr first
>> operand is not immediate"')
>
> Sounds fine to me in general. Please write a testcase for this
> though.

Attached.

> Also, this patch causes the CodeGen/CellSPU/call.ll
> regression test to fail.

Oops. Sorry about that. I promise to inspect the output of 'make check' more thoroughly in the future.
This is now discussed in this thread:
http://lists.cs.uiuc.edu/pipermail/llvmdev/2010-February/029706.html

But CodeGen/CellSPU/call.ll is not the only test that fails. Three other tests from CodeGen/CellSPU fail as well. The cause is the enabling of the register scavenger. It allocates its emergency spill slot on the stack, and as a result a function prologue and epilogue get injected into some functions that do not use the stack at all and thus wouldn't have had a prologue or epilogue in the first place. (It seems that all the test case failures are due to this, not to bad code or asserts.) But inserting redundant code of course bloats the generated code...

Would it be possible to conditionally enable the register scavenger only if the function has a big stack? It now gets unconditionally enabled in SPURegisterInfo::requiresRegisterScavenging(const MachineFunction &MF).
Just checking MF.getFrameInfo()->getStackSize() here doesn't seem to be the solution...

kalle

bigstack.ll (471 Bytes)
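
The trade-off described above can be put into a small toy model. This is only a sketch under stated assumptions: the 16-byte emergency slot size, the rule that a prologue is needed exactly when the frame is non-empty, and the names wantsScavenger, frameBytes and needsPrologue are all hypothetical and are not LLVM API. Whether a usable size estimate exists at the point where requiresRegisterScavenging() is queried is exactly the open question in this thread.

```cpp
#include <cstdint>

// From the thread: frames above 16*511 bytes are the problematic ones.
constexpr int64_t kMaxDFormReachBytes = 16 * 511;
// Assumption: the emergency spill slot holds one 16-byte register.
constexpr int64_t kEmergencySlotBytes = 16;

// Hypothetical policy: only ask for the scavenger when the (estimated)
// frame is too large for immediate offsets.
constexpr bool wantsScavenger(int64_t estimatedFrameBytes) {
  return estimatedFrameBytes > kMaxDFormReachBytes;
}

// Total frame size once the optional emergency slot is added.
constexpr int64_t frameBytes(int64_t localsBytes, bool reserveSlot) {
  return localsBytes + (reserveSlot ? kEmergencySlotBytes : 0);
}

// Simplification: a prologue/epilogue is emitted iff the frame is non-empty.
constexpr bool needsPrologue(int64_t totalFrameBytes) {
  return totalFrameBytes > 0;
}
```

With the slot reserved unconditionally, a function with no locals ends up with a 16-byte frame and therefore a prologue; with the conditional policy it keeps an empty frame, while a 16000-byte frame still gets the scavenger.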

Hello

> Would it be possible to conditionally enable the register scavenger only if
> the function has a big stack? It now gets unconditionally enabled in
> SPURegisterInfo::requiresRegisterScavenging(const MachineFunction &MF).
> Just checking MF.getFrameInfo()->getStackSize() here doesn't seem to be the
> solution...

Well, I think not. The register scavenger should work well regardless of
any settings. It is currently heavily used for ARM, so you might want to
look at how this is solved there.

Here's today's status:

Before I commit the patches for big stacks, I need to get the register scavenging assert patch committed. That means making the existing test cases work before the regscav patch goes in, which in turn means looking into why varargs doesn't cause a frame to be set up when prolog/epilog code is emitted (the spills to the stack happen just fine, but $sp is not adjusted).

"The Big Flick": need to look at varargs handling and why a frame isn't being emitted.

-scooter

As Anton said, "It should just work."

-scooter