Exceptions and performance

Hello,

I have long been taught that exceptions are one of the bad parts of the C++ language's design, and that one should avoid using them, since it's hard to write exception-safe code (I have no disagreement on this part) and they impede compiler optimization.

However, when I recently dug into the implementation of the exception handling mechanism in LLVM, my impression is that using exceptions should instead improve performance (in the common case that no exception is thrown), compared with the traditional approach of returning an error code from every function that can fail: no error-code-checking logic is executed on the fast path, and error-handling code is moved from the main binary into the exception tables, so the CPU is doing less work, and instruction cache locality should also improve. Is my understanding correct?
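To illustrate the comparison I have in mind, here is a minimal sketch (all names hypothetical):

// Error-code style: the success path must test the return value.
int parse_ec(int x, int *out);  // hypothetical: returns nonzero on failure

int use_ec(int x) {
    int v;
    if (int err = parse_ec(x, &v))  // this branch runs even when nothing fails
        return -1;
    return v + 1;
}

// Exception style: the success path has no check; the rare failure is
// handled out of line by the unwinder, driven by the exception tables.
int parse_ex(int x);  // hypothetical: throws on failure

int use_ex(int x) {
    return parse_ex(x) + 1;  // no error check on the fast path
}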

So my question is:
(1) Is the argument that 'exceptions hurt compiler optimization, and should not be used in perf-sensitive code' outdated now? Or more generally, what are the pros and cons of using exceptions vs. error codes, from an LLVM backend developer's perspective?

(2) In the current LLVM design, is there still any optimization that is prevented by using exceptions? Or is current LLVM already able to provide the same optimization quality for code using exceptions as for code that does not?

Thanks!
Haoran

> Hello,
>
> I have long been taught that exceptions are one of the bad parts of the C++ language's design, and that one should avoid using them, since it's hard to write exception-safe code (I have no disagreement on this part) and they impede compiler optimization.
>
> However, when I recently dug into the implementation of the exception handling mechanism in LLVM, my impression is that using exceptions should instead improve performance (in the common case that no exception is thrown), compared with the traditional approach of returning an error code from every function that can fail: no error-code-checking logic is executed on the fast path, and error-handling code is moved from the main binary into the exception tables, so the CPU is doing less work, and instruction cache locality should also improve. Is my understanding correct?

(This is probably a bit too broad a topic for this forum and tends to
end up in heated discussions, etc., but let's see how we go.)

I believe this is one of the main reasons your understanding might be
incorrect: not every function returns an error code, but essentially
every function can throw. (Yes, you can annotate functions as
non-throwing - but it's one of those cases where C++ may have got the
defaults wrong (like const, etc.), and it's impractical/unlikely that
someone will effectively maintain noexcept across most of their
codebase. There are also performance problems with noexcept itself:
since throwing from such a function is not UB but is guaranteed to
call std::terminate, the compiler has to generate exception handling
code in every noexcept function.)
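
To make that noexcept cost concrete, a minimal sketch (hypothetical
function names):

void may_throw();  // not annotated, so the compiler must assume it can throw

void f() noexcept {
    // noexcept doesn't remove this call from the exception-handling model:
    // if may_throw() does throw, the standard guarantees std::terminate()
    // is called, so the compiler still emits unwind/terminate machinery
    // for this frame instead of simply assuming nothing can be thrown.
    may_throw();
}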

> So my question is:
> (1) Is the argument that 'exceptions hurt compiler optimization, and should not be used in perf-sensitive code' outdated now?

I don't believe so - having every function call be a potential change
of control flow inhibits certain optimizations that would otherwise
be possible (certain code motion, reordering, etc. can't be done),
compared to a relatively smaller number of explicit error-handling
calls.

> Or more generally, what are the pros and cons of using exceptions vs. error codes, from an LLVM backend developer's perspective?

Exceptions inhibit code motion, but allow for smaller code and keep
the cold error-handling path further away. I think that's basically
the mechanical tradeoff (setting aside issues of readability,
maintainability, etc. - having an explicit keyword/notation on calls
that can throw could help a lot from a maintainability perspective,
for instance).

> (2) In the current LLVM design, is there still any optimization that is prevented by using exceptions? Or is current LLVM already able to provide the same optimization quality for code using exceptions as for code that does not?

I believe that exceptions do still hinder optimizations when compared
to code that doesn't throw or use a return code*. While that may seem
like an unfair comparison, it's my understanding that it's part of
the excessive cost of exceptions. Not to mention how expensive they
can be when they are actually thrown, compared to explicit error
handling.

* Note this isn't a case of missed optimization opportunities (well,
perhaps in LTO it's a missed opportunity - if we did whole program
analysis to prove certain call trees never throw in practice, then we
could propagate that information and improve optimizations) - but a
limitation of exceptions.

- Dave

There is a fair amount of dispute and detail here, and real benchmarks can be difficult to write, because you often end up in arguments about whether or not the two styles of coding are equivalent.

But I agree with Dave: exceptions generally inhibit optimization.

One way to think about this is that, generally speaking, the less the compiler can prove about a program, the less aggressive it can be with optimizations. If it doesn't know exactly what the control flow will look like, it can't do certain transformations.

Exceptions add a conditional goto from every single function call (without good noexcept hygiene, which practically no program has) to the exception handler or cleanup landing pad. Whether the exception handler is written in the code explicitly or not, something has to run the destructors of objects that are going out of scope.

Worse, inlining doesn't really save you, because any function call that gets inlined will also have an implied conditional goto for any function it calls, now to two possible landing pads. Is it safe to move code across this goto? It's comparatively hard to prove things about that.
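Concretely, a hand-written sketch of the control flow the compiler has to model (hypothetical names):

struct Lock { ~Lock(); };  // any object with a non-trivial destructor
void g();                  // not marked noexcept, so it may throw

int caller() {
    Lock lock;
    g();   // implicit edge: if g() throws, control jumps to a cleanup
           // landing pad that runs lock's destructor, then resumes unwinding
    g();   // a second implicit edge; after inlining, every call inside the
           // inlined body drags its own implied conditional goto along
    return 0;
}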

Worse, a compiler generally doesn't know whether a function call throws, or even the type of exception that will get thrown. It definitely knows what the return type and value of a function are.

It is very easy for a compiler to reason about the error checking around a function call: the code is all there and explicit. Exceptions, by contrast, restrict what the compiler can prove about the program. And the less it can prove, the less aggressive it can be.

Hello David and Sterling, thanks for the reply!

I have no intention of heating up the discussion; please pardon me if I ask questions that might sound silly to you - it's just that I truly didn't understand, since I'm not an expert in LLVM, optimizers, or exception handling at all.

> I believe this is one of the main reasons your understanding might be incorrect: not every function returns an error code, but essentially every function can throw.

I think if a function could throw, then it *should* return an error code in the no-exceptions approach; otherwise there would be no way to convey the error to the caller.
So I didn't get the point that "essentially every function can throw". Indeed, by default most C++ functions may throw (and some of them may actually be non-throwing), but LLVM should (at least in theory) be able to correctly figure out which functions may actually throw, as long as it has the definition - is that correct?

So if a function does not call external functions, the whole control flow graph (including the throw parts) should be available to LLVM even if there are exceptions, so I think there should be no limitation in moving code around? In the case that the function calls external functions, it should already be hard to move code across that external function call, so I thought having exceptions does not pose an additional limitation. Maybe I misunderstood some part?

> Exceptions add a conditional goto from every single function call (without good noexcept hygiene, which practically no program has) to the exception handler or cleanup landing pad. Whether the exception handler is written in the code explicitly or not, something has to run the destructors of objects that are going out of scope.

I didn't quite understand what is different between this and an error-checking 'if'. Let's say you have a function call f(). Then with an error code, the logic would look like:

error_code = f()
if (error_code) {
  (call all destructors) ← generated by llvm
  return error_code;
}

and with exceptions, the code would (conceptually) look like:

exceptions = f()
if (exceptions) { ← generated by llvm
  call all destructors; ← generated by llvm
  resume exceptions; ← generated by llvm
} ← generated by llvm

The only difference is that the code for the 'if' branch would physically reside in the exception table, but I couldn't see why this would make a difference to the optimizer. The optimizer could in theory just optimize the code as if it were a physically-existing 'if' branch (and achieve the same optimized code as the error-code approach), and then finally put the 'if' branch into the exception table. Am I missing some piece?

Thanks,
Haoran


> I think if a function could throw, then it *should* return an error code in the no-exceptions approach; otherwise there would be no way to convey the error to the caller.

And here is the rub. What a function should do, and what it is required to do by the standard (and therefore what the compiler can rely on), are very different things.

> Am I missing some piece?

I think the thing you are missing is what the compiler actually knows and can prove, versus what it might know if the code is well structured. (Most code is not well structured.)

What does the standard say the following function could possibly return? What type will the error be?

void Foo();

Now, according to the standard, what exceptions can it throw, and what are the possible types of those exceptions?

Remember, we are talking about what the compiler can actually prove.

Anyway, I won’t get into this further.

> Hello David and Sterling, thanks for the reply!
>
> I have no intention of heating up the discussion; please pardon me if I ask questions that might sound silly to you - it's just that I truly didn't understand, since I'm not an expert in LLVM, optimizers, or exception handling at all.

>> I believe this is one of the main reasons your understanding might be incorrect: not every function returns an error code, but essentially every function can throw.
>
> I think if a function could throw, then it *should* return an error code in the no-exceptions approach; otherwise there would be no way to convey the error to the caller.

I don't know of any codebase that uses "nothrow" ubiquitously enough
to represent this situation. In general, in C++ code there will be
many more functions that can in theory throw (as far as the
compiler's concerned - i.e., the functions are not marked "nothrow")
but never do in practice, than there are functions that actually need
error handling/failure paths.

> So I didn't get the point that "essentially every function can throw". Indeed, by default most C++ functions may throw (and some of them may actually be non-throwing), but LLVM should (at least in theory) be able to correctly figure out which functions may actually throw, as long as it has the definition - is that correct?

As long as it has the definition of the entire call graph - which is
unlikely without whole program analysis.

> So if a function does not call external functions, the whole control flow graph (including the throw parts) should be available to LLVM even if there are exceptions, so I think there should be no limitation in moving code around?

That's a fairly uncommon property to hold - most code calls other
external functions (certainly external to the file, and in many cases
external even to whole program optimization, depending on how you
define "whole program" and how you build your code - whether you have
access to the source of every library to rebuild them, etc.).

> In the case that the function calls external functions, it should already be hard to move code across that external function call, so I thought having exceptions does not pose an additional limitation. Maybe I misunderstood some part?

There are still things that could be done moving code across a call
to external code - e.g. int f1() { static int i = 3; ++i; f2();
return i; } - a compiler could move the increment back and forth over
f2() depending on its needs. If f2() can throw, then it's unsafe to
move the increment.
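
Spelled out as a sketch (the transformed version is only legal under
the stated assumption):

void f2();  // external; without an annotation, assumed to potentially throw

int f1() {
    static int i = 3;
    ++i;
    f2();
    return i;
}

// If f2() were known not to throw, the compiler could legally sink the
// increment past the call, since nothing in between observes i:
//
//   int f1() {
//       static int i = 3;
//       f2();
//       return ++i;
//   }
//
// If f2() can throw, that transform is invalid: an unwinding f2() would
// leave i un-incremented, and a later call to f1() (after the exception
// was caught) would observe the difference.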

>> Exceptions add a conditional goto from every single function call (without good noexcept hygiene, which practically no program has) to the exception handler or cleanup landing pad. Whether the exception handler is written in the code explicitly or not, something has to run the destructors of objects that are going out of scope.

> I didn't quite understand what is different between this and an error-checking 'if'. Let's say you have a function call f(). Then with an error code, the logic would look like:
>
> error_code = f()
> if (error_code) {
>   (call all destructors) <-- generated by llvm
>   return error_code;
> }
>
> and with exceptions, the code would (conceptually) look like:
>
> exceptions = f()
> if (exceptions) { <-- generated by llvm
>   call all destructors; <-- generated by llvm
>   resume exceptions; <-- generated by llvm
> } <-- generated by llvm
>
> The only difference is that the code for the 'if' branch would physically reside in the exception table, but I couldn't see why this would make a difference to the optimizer. The optimizer could in theory just optimize the code as if it were a physically-existing 'if' branch (and achieve the same optimized code as the error-code approach), and then finally put the 'if' branch into the exception table. Am I missing some piece?

The ABI - the agreement between compilation of different
files/objects - limits what the compiler can do to change the
mechanism for exception handling. The compiler building 'f()' and the
compiler building f()'s caller must agree on how the exception will
be modeled - so the compiler can't readily make a decision about
tradeoffs of different schemes for exception propagation depending on
profiles or other things. Yes, with a full ABI break you could have a
completely different exception handling scheme that doesn't use
tables at all - add an extra implicit return value to every function
call which is the "did this throw" bit, with some other side table
for the actual data that was thrown, etc. - I believe this is the
sort of model Swift uses, for instance. But that has a different set
of performance tradeoffs (and again, the defaults are problematic -
C++ compilers must assume all functions throw unless annotated
otherwise, and C++ writers don't annotate them enough to help much
there).
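
In C++ terms, that alternative scheme would look roughly like the
following (a hypothetical model, not any real ABI):

// Every potentially-throwing function returns an implicit "did this
// throw" bit alongside its value; the thrown object lives in a side table.
template <typename T>
struct Outcome {
    T    value;
    bool threw;
};

Outcome<int> f();  // hypothetical callee compiled to this scheme

int caller() {
    Outcome<int> r = f();
    if (r.threw)    // the check is back on the fast path of every call -
        return -1;  // the opposite tradeoff from table-based unwinding
    return r.value + 1;
}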

- Dave

Thanks for the insights David!

For your first 3 points, is it correct to understand them as follows: the external function prototypes are missing reliable information on whether the function throws and what exceptions it may throw (due to C++'s design failures, and because it is impractical to maintain such information in a large codebase), and this is the main reason that code using exceptions cannot be optimized as effectively as code that does not?

> The ABI - the agreement between compilation of different files/objects - limits what the compiler can do to change the mechanism for exception handling. The compiler building 'f()' and the compiler building f()'s caller must agree on how the exception will be modeled - so the

I don't have much expertise in optimizers, but I think the compiler doesn't need to break the ABI to optimize that code...?

The two code snippets in my example have effectively the same structure, so any optimization (e.g., moving code around) that would work in the error-code case should also work in the exception case?
For example, if the compiler decides to move some code after f() to before f(), it would do the following in the error-code case (just to demonstrate that the same transformation could work for both exception and no-exception code):
[moved logic...]
if (err = f()) {
  [undo side effects of moved logic...]
  call destructors
  return err
}
and for code that uses exceptions:

[moved logic...]
if (exception = f()) {
  [undo side effects of moved logic...]
  call destructors
  resume exception
}
and this is not breaking the ABI at all, right?


> Thanks for the insights David!
>
> For your first 3 points, is it correct to understand them as follows: the external function prototypes are missing reliable information on whether the function throws and what exceptions it may throw (due to C++'s design failures, and because it is impractical to maintain such information in a large codebase), and this is the main reason that code using exceptions cannot be optimized as effectively as code that does not?

That (lack of precise throw information for external functions due to
requiring manual annotation of nothrow), I believe, is a significant
issue, yes.

>> The ABI - the agreement between compilation of different files/objects - limits what the compiler can do to change the mechanism for exception handling. The compiler building 'f()' and the compiler building f()'s caller must agree on how the exception will be modeled - so the

> I don't have much expertise in optimizers, but I think the compiler doesn't need to break the ABI to optimize that code...?
> The two code snippets in my example have effectively the same structure, so any optimization (e.g., moving code around) that would work in the error-code case should also work in the exception case?
> For example, if the compiler decides to move some code after f() to before f(), it would do the following in the error-code case (just to demonstrate that the same transformation could work for both exception and no-exception code):
> [moved logic...]
> if (err = f()) {
>   [undo side effects of moved logic...]
>   call destructors
>   return err
> }
> and for code that uses exceptions:
> [moved logic...]
> if (exception = f()) {
>   [undo side effects of moved logic...]
>   call destructors
>   resume exception
> }
> and this is not breaking the ABI at all, right?

Sorry, yes. There's no difference there between explicit error
handling and exceptions. The difference is in all the functions that
don't have explicit error handling but (in the exception-using
equivalent code) aren't marked nothrow (where, without exceptions,
the compiler could move the increment across the call).

- Dave

Thanks for the reply.

> Sorry, yes. There's no difference there between explicit error handling and exceptions. The difference is in all the functions that don't have explicit error handling but (in the exception-using equivalent code) aren't marked nothrow (where, without exceptions, the compiler could move the increment across the call).

According to your argument, assuming that correctly annotating throw specifications is impractical: if one wants to mitigate the performance implications of exceptions, not using exceptions oneself is not going to be helpful (since, as you said, explicit error handling and exceptions are no different there), and instead one must completely disable exceptions with -fno-exceptions?

On the other hand, if the project is not built with -fno-exceptions, using exceptions to replace explicit error handling would not degrade performance. In other words, compare the 3 settings:
(1) A project built with -fno-exceptions, using explicit error-code handling.
(2) A project built without -fno-exceptions, which does not use exceptions either, using explicit error-code handling.
(3) A project that uses exceptions for error handling.
Then setting (1) would be fastest, and settings (2) and (3) would have similar performance (maybe (3) would be even faster than (2) due to better code locality?). Do you agree with this?


> Thanks for the reply.

>> Sorry, yes. There's no difference there between explicit error handling and exceptions. The difference is in all the functions that don't have explicit error handling but (in the exception-using equivalent code) aren't marked nothrow (where, without exceptions, the compiler could move the increment across the call).

> According to your argument, assuming that correctly annotating throw specifications is impractical: if one wants to mitigate the performance implications of exceptions, not using exceptions oneself is not going to be helpful (since, as you said, explicit error handling and exceptions are no different there), and instead one must completely disable exceptions with -fno-exceptions?
>
> On the other hand, if the project is not built with -fno-exceptions, using exceptions to replace explicit error handling would not degrade performance.

Once you get past the nothrow default problems, you then probably
have to deal with the performance tradeoffs of the current strategy
for exception implementations (table based, etc.) compared to the
tradeoffs of explicit error handling. You'd probably find that using
exceptions for /every/ error return would not be the right perf
tradeoff for many use cases where errors are at least somewhat
common. So you'd probably end up with a hybrid solution - some things
using exceptions where the failure is rare/going to be costly anyway
(stopping the process, showing an error to a user/asking for user
input about retrying, etc.), and still using explicit error handling
for other things.
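
A sketch of what such a hybrid might look like (hypothetical names;
std::optional standing in for whatever result type fits):

#include <optional>
#include <string>

// Common, expected failure: lookup misses happen all the time, so the
// caller checks an explicit result on the fast path.
std::optional<int> findUserId(const std::string& name);  // hypothetical

// Rare, costly failure: the operation is going to stop (or prompt the
// user) anyway, so an exception keeps the cold path out of line.
void loadConfig(const std::string& path);  // hypothetical; throws on failure

int lookup(const std::string& name) {
    if (std::optional<int> id = findUserId(name))  // explicit check inline
        return *id;
    return -1;  // the common failure is handled right here
}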

Yes, all my discussion is based on the assumption that exceptions are really exceptional, so I'm only concerned about the path where no exception actually happens. If the error is rather common, of course you are right that the cost of handling an exception is going to be much higher than using an error code.


There's a second-order effect here too. Everyone knows exceptions are slow, so everyone writes fast-path code with explicit error returns. As a result, modern branch predictors are *really* good at predicting the no-error case. A lot of the arguments about checked return values being slow date back to in-order processors, where adding any instructions on the fast path was always a slowdown, even if they were not-taken branches.

I did some experiments about a year ago with different hand-written assembly implementations of error-handling control-flow patterns, and it was really hard to beat the checked return on recent Intel hardware. This was a microbenchmark, so it didn't see effects from increased cache usage.

There's also the fact that explicit error checking lets the programmer simplify the CFG explicitly. If you're calling two functions that don't depend on any data flow between them, you can do something like:

auto a = someFn();
auto b = someOtherFn();
if (!a || !b)
  doSomeErrorHandlingStuffs();

Transforming exception-driven code into this structure would be an invalid transform unless the compiler could prove that someOtherFn() has no observable side effects.
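
For contrast, the exception-driven version of that snippet would have to look something like this (hypothetical throwing counterparts):

int someFnOrThrow();       // hypothetical throwing counterparts of the
int someOtherFnOrThrow();  // value-returning functions above
void doSomeErrorHandlingStuffs();

int caller() {
    // If someFnOrThrow() throws, someOtherFnOrThrow() must not run. The
    // merged check above would execute the second call even when the first
    // has already failed, so a compiler could only produce that shape if it
    // proved someOtherFnOrThrow() has no observable side effects.
    try {
        int a = someFnOrThrow();
        int b = someOtherFnOrThrow();
        return a + b;
    } catch (...) {
        doSomeErrorHandlingStuffs();
        return -1;
    }
}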

David

> I did some experiments about a year ago with different hand-written assembly implementations of error-handling control-flow patterns, and it was really hard to beat the checked return on recent Intel hardware. This was a microbenchmark, so it didn't see effects from increased cache usage.

That's a very interesting result. Thanks for the insights!
Just out of curiosity, do you remember how much slowdown you got between the checked return and no check at all (simply ignoring the error)?

> There's also the fact that explicit error checking lets the programmer simplify the CFG explicitly. If you're calling two functions that don't depend on any data flow between them, you can do something like:
>
> auto a = someFn();
> auto b = someOtherFn();
> if (!a || !b)
>   doSomeErrorHandlingStuffs();
>
> Transforming exception-driven code into this structure would be an invalid transform unless the compiler could prove that someOtherFn() has no observable side effects.

I didn't fully understand this part. If someOtherFn() has observable side effects, you cannot convert

auto a = someFn(); if (!a) { ... }
auto b = someOtherFn(); if (!b) { ... }

to your code snippet either...


>> I did some experiments about a year ago with different hand-written assembly implementations of error-handling control-flow patterns, and it was really hard to beat the checked return on recent Intel hardware. This was a microbenchmark, so it didn't see effects from increased cache usage.

> That's a very interesting result. Thanks for the insights!
> Just out of curiosity, do you remember how much slowdown you got between the checked return and no check at all (simply ignoring the error)?

>> There's also the fact that explicit error checking lets the programmer simplify the CFG explicitly. If you're calling two functions that don't depend on any data flow between them, you can do something like:
>>
>> auto a = someFn();
>> auto b = someOtherFn();
>> if (!a || !b)
>>   doSomeErrorHandlingStuffs();
>>
>> Transforming exception-driven code into this structure would be an invalid transform unless the compiler could prove that someOtherFn() has no observable side effects.

> I didn't fully understand this part. If someOtherFn() has observable side effects, you cannot convert
>
> auto a = someFn(); if (!a) { ... }
> auto b = someOtherFn(); if (!b) { ... }
>
> to your code snippet either...

David was specifically talking about the case where someFn() and
someOtherFn() "don't depend on any data flow between them".

E.g., you are performing lookups on two containers - the compiler
doesn't know these lookups have no side effects, so it can't collapse
the two checks, but you as the author can see that readily.

> David was specifically talking about the case where someFn() and someOtherFn() "don't depend on any data flow between them".
>
> E.g., you are performing lookups on two containers - the compiler doesn't know these lookups have no side effects, so it can't collapse the two checks, but you as the author can see that readily.

I see, thanks for the clarification.
