Stack alignment on X86 AVX seems incorrect

Two responses inline…

My current thinking is that an emergency spill slot could be set aside to
hold the original, ABI-conforming frame pointer. Not an ideal solution,
but in my situation where I must cover any code a user throws at me,
breaking the ABI and playing with the stack is preferred.

Ah, this is not a good idea. I examined this a while back. The issue is that spilling the base pointer causes two levels of indirection to access arguments.

Figure 3.3 on page 16 of the ABI document is not
normative. See footnote 7 on the same page. Figure 3.4 on page 21
confirms that the use of a frame-pointer is optional.

So, if one doesn’t use ENTER in the prologue and uses RSP to access local
variables, RBP may be used as a callee-saved GPR.

I am not sure if I am completely following. The issue that required
aligning the frame to 32 bytes is when there are variable-sized objects on
the stack (e.g. alloca). In that case, the RBP frame pointer is required to
access the spill slots. If I’m not mistaken, calculating the address of
spill slots off of RSP would be costly in this case.

No, stack realignment needs to happen if there are auto variables on the
stack of types that need a larger alignment than the default. This
currently means AVX vectors for x86-64 and SSE/AVX vectors for x86-32
following the original SysV ABI. In that case %rbp/%ebp is used to
reference the original arguments on the stack and %rsp/%esp is used to
reference the auto variables.
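
To make the split between the two registers concrete, here is a rough model of a realigning prologue in plain C. It only does the address arithmetic; the struct, the function names, and the "8 bytes for the saved frame pointer" layout are assumptions for illustration, not what any particular compiler emits.

```c
#include <assert.h>
#include <stdint.h>

/* Round an address down to a power-of-two boundary; this is the effect
   of an instruction such as `and $-32, %rsp`. */
static uint64_t align_down(uint64_t addr, uint64_t align) {
    assert(align != 0 && (align & (align - 1)) == 0);
    return addr & ~(align - 1);
}

/* Hypothetical model of the two registers after the prologue. */
struct frame {
    uint64_t rbp; /* unaligned: references the incoming stack arguments */
    uint64_t rsp; /* 32-byte aligned: references the auto variables */
};

static struct frame realigned_prologue(uint64_t rsp_at_entry,
                                       uint64_t locals_size) {
    struct frame f;
    /* push %rbp; mov %rsp, %rbp */
    f.rbp = rsp_at_entry - 8;
    /* Reserve room for the locals and round down to 32 bytes, so every
       %rsp-relative slot at a multiple-of-32 offset is aligned. */
    f.rsp = align_down(f.rbp - locals_size, 32);
    return f;
}
```

With an entry stack pointer of 0x7fffffffe008 and 40 bytes of locals, the model leaves %rbp at 0x7fffffffe000 and %rsp at 0x7fffffffdfc0. The distance between the two is only known at run time, which is why the arguments and the auto variables need different base registers.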

That sounds about right; my mistake. When I realign the frame in the presence of variable-sized objects and AVX spills, I have three pointers sitting around: the real, unaligned frame pointer (let’s say RBX, used as the ‘base pointer’); the aligned frame pointer (RBP); and the stack pointer (RSP). The arguments are addressed off the unaligned frame pointer. Besides the change to make RBX the base pointer in the Emit[Prologue|Epilogue] routines, everything else stayed the same.
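
For what it's worth, the three-pointer arrangement can be sketched as address arithmetic in C. This is a simplified model rather than the actual Emit[Prologue|Epilogue] code: the struct, the helper names, and the assumption that 16 bytes of callee-saved registers (old RBX and RBP) are pushed before RBX is set are all hypothetical.

```c
#include <assert.h>
#include <stdint.h>

static uint64_t align_down(uint64_t addr, uint64_t align) {
    assert(align != 0 && (align & (align - 1)) == 0);
    return addr & ~(align - 1);
}

/* Hypothetical register model: RBX is the unaligned base pointer,
   RBP the 32-byte aligned frame pointer, RSP the stack pointer. */
struct regs { uint64_t rbx, rbp, rsp; };

static struct regs enter_frame(uint64_t rsp_at_entry, uint64_t fixed_size) {
    struct regs r;
    assert(fixed_size % 32 == 0);
    r.rbx = rsp_at_entry - 16;      /* assumed: old RBX and RBP saved */
    r.rbp = align_down(r.rbx, 32);  /* aligned base for the AVX spills */
    r.rsp = r.rbp - fixed_size;     /* fixed-size part of the frame */
    return r;
}

/* A variable-sized allocation (e.g. alloca) moves only RSP. */
static uint64_t dynamic_alloca(struct regs *r, uint64_t size) {
    r->rsp = align_down(r->rsp - size, 32);
    return r->rsp;
}

/* Spill slots sit at fixed negative offsets from the aligned RBP. */
static uint64_t spill_slot(const struct regs *r, uint64_t offset) {
    return r->rbp - offset;
}

/* Stack arguments sit above the saved registers (16 bytes) and the
   return address (8 bytes), at fixed offsets from the unaligned RBX. */
static uint64_t incoming_arg(const struct regs *r, uint64_t index) {
    return r->rbx + 16 + 8 + 8 * index;
}
```

The point of the model is that after any number of dynamic_alloca calls, spill_slot and incoming_arg still return the same addresses, each reached with a single fixed offset from its own register, so neither the spills nor the arguments need an extra level of indirection.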