Stack alignment problem

Hi,
I have a problem getting a properly aligned stack for my LLVM backend. I've
asked about this previously, but unfortunately only now could try the
suggested solution.

For reference, here's the original message from me:
http://mail.cs.uiuc.edu/pipermail/llvmdev/2004-July/001388.html

And here's reply from Chris:
http://mail.cs.uiuc.edu/pipermail/llvmdev/2004-July/001390.html

The PrologEpilogInserter.cpp file aligns the stack only if
MachineFrameInfo::hasCalls returns true, which happens only if the function
has "call frame setup instruction" which my backend does not generate.
Chris suggested adding explicit MachineFrameInfo::setHasCalls call, which I've
tried, but it does not help. The PrologEpilogInsert.cpp file always
explicitly calls setHasCalls with the value it computes, so whatever value I
set manually is just overwritten.

So, what's the right approach? As usual, I'll be willing to code a patch.

TIA,
Volodya

Can you explain the problem in more detail? Specifically the LLVM code gneerator assumes that there is some alignment that the stack is required to have as part of its ABI. For example, in X86 target machine, the stack is 8-byte aligned on entry to function calls.

What this means is that the frame info can assume that the stack pointer is 8-byte aligned on entry to every function, but that it has to preserve this alignment for any functions that call another function (oh and it also has to remember that the return address gets pushed as well).

This is why frame info works the way it does: it assumes that there is no reason to keep the outgoing stack pointer aligned unless there is a call.

How is your target different here? Can you give an example of why this causes a problem?

-Chris

Here's the code which computes the hasCalls flag:

   bool HasCalls = false;

  for (MachineFunction::iterator BB = Fn.begin(), E = Fn.end(); BB != E; ++BB)
    for (MachineBasicBlock::iterator I = BB->begin(); I != BB->end(); )
      if (I->getOpcode() == FrameSetupOpcode ||
          I->getOpcode() == FrameDestroyOpcode) {
        .........
               HasCalls = true;
          ........

So, stack is aligned only if there is instruction with FrameSetupOpcode or
FrameDestroyOpcode. In X86, it's defined as

   def ADJCALLSTACKDOWN : I<0, Pseudo, (ops), "#ADJCALLSTACKDOWN">;
   def ADJCALLSTACKUP : I<0, Pseudo, (ops), "#ADJCALLSTACKUP">;

And I'm not quite sure why I need to define and insert something similar. Why
can't the above code just check for call instructions, as opposed to
"FrameSetup" instructions?

- Volodya

yOn Wed, 23 Mar 2005, Vladimir Prus wrote:

How is your target different here? Can you give an example of why this
causes a problem?

Here's the code which computes the hasCalls flag:

  bool HasCalls = false;

for (MachineFunction::iterator BB = Fn.begin(), E = Fn.end(); BB != E; ++BB)
   for (MachineBasicBlock::iterator I = BB->begin(); I != BB->end(); )
     if (I->getOpcode() == FrameSetupOpcode ||
         I->getOpcode() == FrameDestroyOpcode) {
       .........
              HasCalls = true;
         ........

So, stack is aligned only if there is instruction with FrameSetupOpcode or
FrameDestroyOpcode. In X86, it's defined as

  def ADJCALLSTACKDOWN : I<0, Pseudo, (ops), "#ADJCALLSTACKDOWN">;
  def ADJCALLSTACKUP : I<0, Pseudo, (ops), "#ADJCALLSTACKUP">;

And I'm not quite sure why I need to define and insert something similar. Why
can't the above code just check for call instructions, as opposed to
"FrameSetup" instructions?

Every target (not just x86) uses these. Basically they are pseudo instructions that mark the begin/end of the call sequence. On targets that support frame pointer elimination, they expand different ways (e.g. into a noop if fp-eliminating, or into [e.g.] "sub ESP, 40" if not).

I think it's a good idea for your target to do similar things, for consistency if nothing else, but if you really want to, I wouldn't be opposed to making that also check for instructions that isCall.

-Chris

I though about it, and decided that modifying the above code is too much work.
Especially given that the most reasonable idea to me would be to check only
for call instructions, and that would mean modifying all targets.
So, I've added those pseudoinstructions and it seems to work.

- Volodya