[RFC] Stack overflow and optimizations

During the review of https://reviews.llvm.org/D59978 we got to the
question whether we can optimize based on the assumption that stack
overflow is undefined behavior. While I think it is according to the
C++ standard, Windows Structured Exception Handling and signal
handlers might allow programs to recover/exit gracefully.

Concretely, the patch D59978 wants to add the noreturn attribute to
functions that recurse infinitely. Also, a function that
unconditionally calls a function with a noreturn attribute can be
deduced to be noreturn itself.
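
For illustration (a hypothetical example, not taken from the patch):

[[noreturn]] void never_returns();  // hypothetical function already known
                                    // to be noreturn

void wrapper() {
  never_returns();  // wrapper() unconditionally calls a noreturn function,
}                   // so noreturn could be deduced for wrapper() as well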

Consider the following program:

#include <cstdlib>
#include <cstdio>

void overflow(); // Secret function that triggers a stack overflow

int catchoverflow() {
  __try  {
    overflow();
    printf("Exception NOT caught\n");
    return 0;
  } __except(true) {
    printf("Exception caught\n");
    return 1;
  }
  return 2;
}

int main() {
  auto result = catchoverflow();
  printf("Done execution result=%i\n", result);
  return EXIT_SUCCESS;
}

An implementation of the overflow function could be:

void overflow() {
  overflow();
  printf("x");
}

Compiled with msvc or clang-cl, this prints:

Exception caught
Done execution result=1

That is, even though overflow() does not return normally, catchoverflow() does.

The documentation for the "noreturn" attribute currently says
(https://llvm.org/docs/LangRef.html#function-attributes):

This function attribute indicates that the function never returns normally. This produces undefined behavior at runtime if the function ever does dynamically return.

The motivation for marking catchoverflow() as noreturn is that a program
cannot reasonably expect to continue normal execution after a stack overflow
has happened, and that this case shouldn't inhibit optimizations.

In this RFC, I would like to clarify the following questions:
1. Does the "noreturn" attribute imply that the function does not
return via the unwind edge of an invoke instruction?
2. Can/should overflow() be marked noreturn?
3. Can/should catchoverflow() be marked noreturn?
4a. More generally, for the purpose of optimizations, can we assume
that functions do not overflow the stack? That is, is stack overflow
undefined behavior?
4b. If it is not undefined behavior, can we assume that if a stack overflow
occurs, the program will be terminated? This would, e.g., prevent code with
side effects from being moved before the overflowing call; otherwise the
side effects would already have been executed when the overflow terminates
the program. How would we check whether a function can overflow the stack?

Whatever the answers to these questions are, I think we should clarify
what the function attributes noreturn, nounwind, and willreturn mean. The
most explicit way would be to list all the possible outcomes of
calling a function and to exclude possibilities per attribute:

1. Call returns to next instruction / invoke branches to "to" block.
2a. Call throws synchronous exception and stack is unwound to parent caller
2b. Invoke throws synchronous exception and branches to "unwind to" block
3a. Call throws asynchronous exception and stack is unwound to parent caller
3b. Invoke throws asynchronous exception and branches to "unwind to" block
4a. Call/invoke triggers thread/program termination and dtors/atexit
funcs are called (e.g. by calling exit())
4b. Call/invoke triggers thread/program immediate termination (e.g. by
calling abort())
5. A signal handler is executed which kills the program
6. Immediate thread/program termination by external cause (e.g.
`TerminateThread`, `kill -9`, kernel panic, power switch)

Michael

Two aspects that I forgot to mention in the RFC mail:

First: In some situations, we already assume that stack overflows are
unobservable behaviours. For instance, optimizations of stack
variables cause the stack frames to be smaller, such that the stack
overflow does not occur where it would have in an unoptimized program.
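
A hypothetical sketch of that first point:

void deep(long n) {
  char scratch[4096];   // dead local: optimizations may remove it entirely,
  (void)scratch;        // shrinking the frame so that the overflow happens
  if (n)                // much later than in the unoptimized program (or not
    deep(n - 1);        // at all, if the recursion becomes a tail call)
}
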
Also, we optimize the function

 void overflow() {
  overflow();
}

to

; Function Attrs: nounwind readnone sspstrong uwtable
define dso_local void @"?overflow@@YAXXZ"() local_unnamed_addr #0 {
entry:
  ret void
}

Using the assumption that a call to overflow() does not return, we could
optimize overflow() as defined in

 void overflow() {
  overflow();
  printf("x");
}

to

 void overflow() {
  overflow();
}

i.e. to the same function as above (which, as shown, does return). We
currently do not apply this optimization, but optimizations based on
noreturn might change this.

Second: Does the C++ "noexcept" specifier have an influence on whether
the compiler can assume that a function does not overflow its stack?
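
To make the question concrete (hypothetical sketch):

void g() noexcept {
  overflow();    // if a stack overflow could unwind out of this call, then
  printf("x");   // (for C++ exceptions) escaping the noexcept function would
}                // call std::terminate -- does that let the compiler assume
                 // the call never unwinds, or even never overflows?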

Michael

Hi Michael,

Please keep in mind non-C/C++ frontends. For example, in Rust, we promise to
avoid all undefined behavior in safe code. There is no reasonable compositional
analysis that can statically detect stack overflows (I know safety-critical
systems are subjected to such analyses, but those could not reasonably be
enforced on all Rust code -- most of them just forbid recursion, for example),
so we defer to a dynamic check instead: we have a guard page for every thread,
and we make sure that when big stack frames are pushed, every page covered by
the frame is touched at least once (so that we can never skip over the guard page).
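
A rough C++ illustration of what the probing prevents (hypothetical example):

void big_frame() {
  char buf[1 << 20];   // a 1 MiB frame: without stack probing, accesses into
  buf[0] = 1;          // this buffer can land beyond the guard page and hit
}                      // unrelated memory, so the overflow goes undetected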

If LLVM considers stack overflows as UB, the guard page is not enough. Dynamic
checks cannot guard against UB exploited by the compiler. So, in that case it
would become basically impossible to compile a safe language to LLVM, as far as
I can see.

Two aspects that I forgot to mention in the RFC mail:

First: In some situations, we already assume that stack overflows are
unobservable behaviours. For instance, optimizations of stack
variables cause the stack frames to be smaller, such that the stack
overflow does not occur where it would have in an unoptimized program.

That's fine. Basically, the abstract machine can be specified to
*non-deterministically* trigger a stack overflow any time the stack grows. So
essentially the running program has to be prepared that the stack might overflow
any time, but at the same time cannot rely on *when* it overflows. This is a
standard "trick" to formally handle resource exhaustion without having to
specify exactly how much of that resource is available.

This gives LLVM the freedom to grow or shrink stack frames. But this does *not*
mean that stack overflows are UB!

Also, we optimize the function

 void overflow() {
  overflow();
}

to

; Function Attrs: nounwind readnone sspstrong uwtable
define dso_local void @"?overflow@@YAXXZ"() local_unnamed_addr #0 {
entry:
  ret void
}

This looks like an instance of the C++ forward progress assumption?
(C AFAIK only has that assumption for [some] loops, not recursive functions, so
arguably this is a miscompilation for C.)

Kind regards,
Ralf

Please keep in mind non-C/C++ frontends. For example, in Rust, we promise to
avoid all undefined behavior in safe code. There is no reasonable compositional
analysis that can statically detect stack overflows (I know safety-critical
systems are subjected to such analyses, but those could not reasonably be enforced
on all Rust code -- most of them just forbid recursion, for example), so we
defer to a dynamic check instead: we have a guard page for every thread, and we
make sure that for big stack frames being pushed, every page covered by that
frame is touched at least once (so that we will never skip the guard page).

If LLVM considers stack overflows as UB, the guard page is not enough. Dynamic
checks cannot guard against UB exploited by the compiler. So, in that case it
would become basically impossible to compile a safe language to LLVM, as far as
I can see.

I think this is more target-platform specific than input-language specific. My
motivation was to preserve Windows SEH behavior, where one can
(reliably?) catch stack overflows using a __try/__except handler. I
think its ABI even specifies that there must be stack probing. The
original version of https://reviews.llvm.org/D59978 would not have considered
such an asynchronous exception as a possible cause for jumping to
the unwind handler, with the argument that "LLVM does not model
asynchronous exceptions" (quoted from
https://clang.llvm.org/docs/MSVCCompatibility.html) anyway.

As Nevin pointed out, other platforms may not require stack probing and do
not have Structured Exception Handling, but may call a signal
handler (which we also do not model).

So I wrote this RFC to clarify which guarantees we are giving in case
of stack overflow, especially in situations where the compilation
target does not give such guarantees.

> First: In some situations, we already assume that stack overflows are
> unobservable behaviours. For instance, optimizations of stack
> variables cause the stack frames to be smaller, such that the stack
> overflow does not occur where it would have in an unoptimized program.

That's fine. Basically, the abstract machine can be specified to
*non-deterministically* trigger a stack overflow any time the stack grows. So
essentially the running program has to be prepared that the stack might overflow
any time, but at the same time cannot rely on *when* it overflows. This is a
standard "trick" to formally handle resource exhaustion without having to
specify exactly how much of that resource is available.

This gives LLVM the freedom to grow or shrink stack frames. But this does *not*
mean that stack overflows are UB!

What is it then? What reference says it is/is not UB?

> Also, we optimize the function
> ```
> void overflow() {
> overflow();
> }
> ```
> to
> ```
> ; Function Attrs: nounwind readnone sspstrong uwtable
> define dso_local void @"?overflow@@YAXXZ"() local_unnamed_addr #0 {
> entry:
> ret void
> }
> ```

This looks like an instance of the C++ forward progress assumption?
(C AFAIK only has that assumption for [some] loops, not recursive functions, so
arguably this is a miscompilation for C.)

If stack overflow is UB, then this is a valid optimization.
If the compiler is allowed to optimize the stack frame size to zero
(=> tail call), we would effectively get an infinite loop, also a
valid outcome of UB.

If it is not UB, what is the defined behaviour?

If this is an instance of the forward-progress assumption, we should
maybe discuss it here:
https://reviews.llvm.org/D65718

Should we have a similar document clarifying the behavior of stack overflows?

Michael

Hi,

I think this is more target-platform specific than input-language specific. My
motivation was to preserve Windows SEH behavior, where one can
(reliably?) catch stack overflows using a __try/__except handler. I
think its ABI even specifies that there must be stack probing. The
original version of https://reviews.llvm.org/D59978 would not have considered
such an asynchronous exception as a possible cause for jumping to
the unwind handler, with the argument that "LLVM does not model
asynchronous exceptions" (quoted from
https://clang.llvm.org/docs/MSVCCompatibility.html) anyway.

As Nevin pointed out, other platforms may not require stack probing and do
not have Structured Exception Handling, but may call a signal
handler (which we also do not model).

So I wrote this RFC to clarify which guarantees we are giving in case
of stack overflow, especially in situations where the compilation
target does not give such guarantees.

My point is that right now, it is possible to reliably catch Stack Overflows
even on non-Windows systems, e.g. by setting up the guard page correctly. Doing
this properly surely requires target-specific actions, which will most likely
not be modeled by LLVM. Declaring stack overflow as UB would remove this
ability, which would be a serious problem for languages that *have to* reliably
catch Stack Overflows.
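
For instance, a rough POSIX sketch (hypothetical, not from this thread) of
catching an overflow that faults on the guard page:

#include <signal.h>
#include <unistd.h>

static char altstack[64 * 1024];   // handler must run on an alternate stack,
                                   // since the normal stack is exhausted

static void on_segv(int) {
  const char msg[] = "stack overflow caught\n";
  write(2, msg, sizeof(msg) - 1);  // async-signal-safe reporting only
  _exit(1);
}

int main() {
  stack_t ss = {};
  ss.ss_sp = altstack;
  ss.ss_size = sizeof(altstack);
  sigaltstack(&ss, nullptr);

  struct sigaction sa = {};
  sa.sa_handler = on_segv;
  sa.sa_flags = SA_ONSTACK;        // deliver SIGSEGV on the alternate stack
  sigaction(SIGSEGV, &sa, nullptr);

  // ... run code that may overflow the stack ...
}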

If Rust fails to catch a Stack Overflow, that's a serious soundness bug in the
compiler. It would be as bad as Rust failing to catch a dereferenced NULL
pointer. The difference is that Rust's type system can rule out the NULL
pointer case statically (so making them UB in LLVM is fine), while ruling out
stack overflows is done dynamically (and it is fundamentally impossible to
reliably do dynamic UB checks *after* passing through an IR that exploited this UB).

That's fine. Basically, the abstract machine can be specified to
*non-deterministically* trigger a stack overflow any time the stack grows. So
essentially the running program has to be prepared that the stack might overflow
any time, but at the same time cannot rely on *when* it overflows. This is a
standard "trick" to formally handle resource exhaustion without having to
specify exactly how much of that resource is available.

This gives LLVM the freedom to grow or shrink stack frames. But this does *not*
mean that stack overflows are UB!

What is it then? What reference says it is/is not UB?

I don't know what C/C++ says, and AFAIK LLVM doesn't document anything about
this. All I am saying is that, if LLVM is to be used as the backend language in
a compiler for a safe language (one that protects its users from UB), it is a
*hard requirement* that LLVM does not make Stack Overflows UB.

Then I proposed a scheme that permits growing and shrinking stack frames in a
semantics that does not make Stack Overflow UB. This was in response to a
concern you raised about growing or shrinking stack frames being a problem if
LLVM declared stack overflow as not being UB.

Also, we optimize the function

 void overflow() {
  overflow();
}

to

; Function Attrs: nounwind readnone sspstrong uwtable
define dso_local void @"?overflow@@YAXXZ"() local_unnamed_addr #0 {
entry:
  ret void
}

This looks like an instance of the C++ forward progress assumption?
(C AFAIK only has that assumption for [some] loops, not recursive functions, so
arguably this is a miscompilation for C.)

If stack overflow is UB, then this is a valid optimization.
If the compiler is allowed to optimize the stack frame size to zero
(=> tail call), we would effectively get an infinite loop, also a
valid outcome of UB.

If it is not UB, what is the defined behaviour?

The defined behavior is to loop forever, or abort at some point due to a stack
overflow. So, there are two possible defined behaviors. The compiler has to
make sure that any time the program runs, it exhibits one of these two behaviors.

Even programs with defined behavior can have different outcomes when run /
compiled multiple times. That's just a normal consequence of non-determinism.
This is similar to e.g. a program that starts two threads and has both of them
print to stdout: sometimes one thread will come first, sometimes another thread
will come first. The compiler is allowed to optimize this program in a way that
the order becomes fixed (e.g. by inlining both of them into the main thread).
The compiler can also keep the non-determinism so that it gets resolved at run-time.
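
A minimal sketch of that analogy (hypothetical example):

#include <cstdio>
#include <thread>

int main() {
  std::thread a([] { std::puts("A"); });  // both orders, "A" then "B" and
  std::thread b([] { std::puts("B"); });  // "B" then "A", are defined; the
  a.join();                               // compiler may fix one order or
  b.join();                               // leave the choice to the runtime
}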

If this is an instance of the forward-progress assumption, we should
maybe discuss it here:
https://reviews.llvm.org/D65718

Should we have a similar document clarifying the behavior of stack overflows?

That seems like a good idea.

Kind regards,
Ralf