conceiving of a frontend feature which requires LLVM support: builtin to determine stack size upper bound

Here is a feature I would like to introduce to my frontend:

A builtin function to calculate the upper bound of the stack size of any given function.

The idea here is that, if we do not have any:

  • external library calls
  • alloca
  • indirect function calls
  • non-tail-recursion
  • inline assembly that modifies the stack pointer

Then, we can look at a given function and determine precisely how much stack space is needed to spawn a thread to run that function. This provides two benefits:

  • Statically guarantee that stack overflow will not occur
  • Threads can have smaller stack sizes, lowering the cost of creating a thread

This feature requires tight coupling with LLVM, because:

  • We need to know the stack size post-optimizations
  • Frontend needs to make a compile error if non-tail-recursion occurs, but LLVM might introduce (or remove?) tail recursion, and so we would need to capture this error from LLVM and report it back in the frontend.

What do you think? Is this feature reasonable to achieve with LLVM and in scope of the LLVM project?

I would propose an LLVM builtin function to perform this calculation, and I can work on a proof-of-concept if llvm-dev thinks this idea is worth pursuing.

Upstream issue: https://github.com/zig-lang/zig/issues/157

[+CC John Regehr]

Hi Andrew,

Good to know that someone is working on a feature like this. Many languages - even those used for embedded applications - are happily ignoring the issue of an overflowing stack.

Some comments inline...

Here is a feature I would like to introduce to my frontend:

A builtin function to calculate the upper bound of the stack size of any
given function.

The idea here is that, if we do not have any:
* external library calls

As was discussed in the upstream issue, this can be solved by providing the upper stack size bound for the external function. LLVM's function attributes can be used for this.

* alloca

I was going to write that this shouldn't be problem. But of course there are variably-sized allocas, which are hard to support.

* indirect function calls

These can be supported by passing the upper bound along with the function pointer.

* non-tail-recursion

In some cases, a tight upper bound can be calculated even for recursive functions, although this is non-trivial and impossible to do for the general case.

* inline assembly that modifies the stack pointer

Then, we can look at a given function and determine precisely how much
stack space is needed to spawn a thread to run that function. This provides
two benefits:
* Statically guarantee that stack overflow will not occur
* Threads can have smaller stack sizes, lowering the cost of creating a
thread

This feature requires tight coupling with LLVM, because:
* We need to know the stack size post-optimizations
* Frontend needs to make a compile error if non-tail-recursion occurs, but
LLVM might introduce (or remove?) tail recursion, and so we would need to
capture this error from LLVM and report it back in the frontend.

What do you think? Is this feature reasonable to achieve with LLVM and in
scope of the LLVM project?

Writing this analysis on the MIR level shouldn't be too difficult. The basic pieces (for a single function) are already there. Search LLVM's source code for "FrameLowering" and "MachineFrameInfo". Your analysis would have to scan the machine code for function calls to find the worst case call sequence.

The machine code can contain calls to additional functions, like memcpy and functions from compiler-rt, whose stack usage information has to be provided externally.

For embedded applications you have to keep in mind interrupts. A possible solution is described in this paper: https://www.cs.utah.edu/~regehr/papers/p751-regehr.pdf

I would propose an LLVM builtin function to perform this calculation, and I
can work on a proof-of-concept if llvm-dev thinks this idea is worth
pursuing.

What do you mean by "LLVM builtin function"?

-Manuel

Would a builtin function be the best way to expose this information? Maybe generating a side table would be best? As I recall, we already produce some stack-size information as part of our StackMap table (see ). Another option could be some special form of prefix/prologue data ( / ). The “@stack_size(function)” you’ve discussed on the zig issue may not be implementable in general (because we’d need some way of making sure that we always generate code for “@foo” before any other function containing a call to “@stack_size(@foo)”, and that’s probably not possible in general). -Hal