Unwind, exception handling, debuggers and profilers

Folks,

I'm sorry for bringing this up again, but this will not be the last
discussion on the topic, so let's just get to business. We're about to
merge the last critical patch to make EHABI compatible with other EH
mechanisms in LLVM (D3079), and that has unearthed a few issues with
the function attributes.

Logan's blog post [1] contains a proposal to split unwinding from
exceptional logic, which I think we should try. I'm trying to
make it as backward compatible as possible, but the point is to make
the IR authoritative on how and when to emit what kind of unwind
information.

** Unwinding **

AFAIK, there are three ways to unwind the stack:

1. Checking the return information on the stack (LR). This might be
optimized away, so debuggers and profilers can't rely on it. Binaries
with no debug or EH information at all have to resort to this when
back-tracing during a segfault or similar.

2. Dwarf unwinding. This is a carefully constructed stack unwinding
Dwarf information for each frame that debuggers and profilers can use
to gather precise information on the stack, typically stored on
.eh_frame sections.

3. Itanium-compatible exception handling. This is another format of
unwind tables using personality routines and similar unwinding logic
to Dwarf CFI directives, also on .eh_frame. This is why you can use
CFI to build the EH tables, and why debuggers can also use this table
to unwind the stack.

** Exception Handling **

LLVM has four types of EH: Dwarf, ARM, SjLj and Win64. Of them, only
SjLj doesn't need unwind tables, and each of the others, despite being
Itanium-compatible, needs slightly different logic.

In LLVM, all targets that use DwarfCFIException are using the same
tables as the debugger and profilers would, but ARM and Win64 have
separate unwind logic. This is where the problem begins.

We're left with three EH types:

1. No tables (SjLj)
2. Dwarf tables (DwarfCFI)
3. Specific EH tables (ARM, Win64?)

** Debug & Profiling **

In debug/profile mode (-g, -pg), none of the optimizations that prune
unwind information should be allowed to run. I believe currently this
is informed via the uwtable/nothrow function attributes, but since
their use is controversial, we can reach situations where information
is indeed removed when it shouldn't happen. (see Logan's post).

** Function Attributes Proposal **

I still don't have a clear idea of what we need, but being overly
conservative, we should plan for every behaviour to be expressed by a
flag, and during the discussion, merge similar behaviour and possibly
remove unused flags along the way.

Today we have uwtable and nothrow, which are interchangeably being
used to mean debug and EH unwinding, which is just wrong. Logan's
proposal is to split uwtable and ehtable, nothrow and nounwind and to
understand which trumps which if combined.

In my view, nounwind has no purpose. I may be wrong, but I can't think
of a case where you want to generate debug unwind tables and NOT want
to unwind a particular function on a purely non-EH context. Supposing
nounwind has meaning, I'm happy with the semantics Logan proposes in
his post.

A function without any of the attributes below should emit *no* tables at all.

The remaining possibilities are:

* uwtable
  - Generated only when -g or -pg are specified
  - Option -fno-unwind-tables loses meaning (unless nounwind has meaning)
  - Generate full EH/Debug tables on all archs

* ehtable
  - Generated for EH only (front-end/arch/lang specific)
  - Could be forced enabled/disabled via -feh-tables/-fno-eh-tables
  - Only emits EH directives on ARM, full debug on others, nothing on SjLj

* nothrow
  - Leaf functions, throw(), languages without EH, etc.
  - On its own, do nothing (no tables are emitted anyway)
  - +uwtable, do nothing (we don't want to break debug)
  - +ehtable, emit CantUnwind (the whole purpose)

* nounwind
  - No idea why, but assuming there is a reason
  - On its own, do nothing (no tables are emitted anyway)
  - +uwtable, emit CantUnwind (given Logan's semantics)
  - +ehtable, emit CantUnwind (given Logan's semantics)
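
The combinations listed above can be read as a small decision table. The Python sketch below is only one possible reading of the proposal, not LLVM code: the function name proposed_emission, the eh_model labels and the returned strings are invented for illustration. Two points are assumptions rather than something the proposal states: that uwtable wins when it is combined with both nothrow and ehtable, and that nothing at all is emitted for ehtable on SjLj.

```python
def proposed_emission(attrs, eh_model="dwarf"):
    """Return what the proposal would emit for one function.

    attrs:    set of attribute names from the proposal
              ("uwtable", "ehtable", "nothrow", "nounwind").
    eh_model: hypothetical label, one of "dwarf", "arm", "sjlj".
    """
    attrs = set(attrs)
    has_uw = "uwtable" in attrs
    has_eh = "ehtable" in attrs

    # Without a table-requesting attribute nothing is emitted,
    # whatever else is set.
    if not has_uw and not has_eh:
        return "none"

    # Assumption: ehtable alone emits nothing on SjLj targets.
    if not has_uw and eh_model == "sjlj":
        return "none"

    # nounwind marks the function CantUnwind with either table attribute.
    if "nounwind" in attrs:
        return "cantunwind"

    # nothrow only matters for EH tables; with uwtable the tables are kept
    # so that debuggers and profilers are not broken (assumed precedence).
    if "nothrow" in attrs and not has_uw:
        return "cantunwind"

    if has_uw:
        return "full tables"

    # ehtable only: EH directives on ARM, full tables elsewhere.
    if eh_model == "arm":
        return "eh directives only"
    return "full tables"
```

For example, proposed_emission({"ehtable", "nothrow"}) gives "cantunwind", while proposed_emission({"uwtable", "nothrow"}) keeps "full tables".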

The primary reason for adding the ehtable attribute is to be able to
control the nothrow flag correctly. Other reasons are to limit
emitting tables to archs that support it (ie. no SjLj) and to focus
-fno-eh-table on EH info only, not Debug, so you don't get broken
debug info when you just add "-g" to a build that already has
-fno-eh-tables.

The option -fno-unwind-tables should either be removed (if nounwind
has no meaning apart from EH context), or carefully merged with
-fno-eh-tables. Logan can expand on that.

The reason why uwtable emits both EH and debug is backwards
compatibility. ARM currently emits both directives for binutils' sake,
and others have fused Debug/EH directives anyway. This attribute
should also prevent any optimization related to stack unwinding, even
on recursive or tail calls.

LangRef would have to change, but I don't think old IR would stop working.

Does that sound like a reasonable plan? Anything I haven't mentioned
that needs mentioning? Any conflict that this will generate on any
optimization pass?

cheers,
--renato

[1] http://loganchien.github.io/llvm/nounwind.html

Just a couple things, I haven't really been following all of the
discussion.

** Debug & Profiling **

In debug/profile mode (-g, -pg), none of the optimizations that prune
unwind information should be allowed to run. I believe currently this
is informed via the uwtable/nothrow function attributes, but since
their use is controversial, we can reach situations where information
is indeed removed when it shouldn't happen. (see Logan's post).

It is an article of faith among debugger and debug-info people that
adding -g must not affect the generated code in any way. Although I
have never had to investigate it closely, DWARF unwind info should be
able to describe whatever the compiler is willing to produce. And if
the program itself will not do unwinding, it's okay for the debug
unwind info to be imperfect (see my next comment).

Please don't require/assume/imply that -g should disturb optimizations,
in particular function attributes that affect optimization should not
be present or absent based on -g.

** Function Attributes Proposal **

I still don't have a clear idea of what we need, but being overly
conservative, we should plan for every behaviour to be expressed by a
flag, and during the discussion, merge similar behaviour and possibly
remove unused flags along the way.

Today we have uwtable and nothrow, which are interchangeably being
used to mean debug and EH unwinding, which is just wrong. Logan's
proposal is to split uwtable and ehtable, nothrow and nounwind and to
understand which trumps which if combined.

In my view, nounwind has no purpose. I may be wrong, but I can't think
of a case where you want to generate debug unwind tables and NOT want
to unwind a particular function on a purely non-EH context. Supposing
nounwind has meaning, I'm happy with the semantics Logan proposes in
his post.

Um, does your concept of "unwind" include "debugger displays a backtrace"?
A debugger likes to be able to display the chain of subprogram activations
that led to the current stopping point (backtrace), but this is different
from manipulating the process state to imitate the effect of a sequence of
subprogram de-activations (unwind).
A debugger also likes to be able to virtually present the process state
during some not-the-most-recent activation, which is like a virtual unwind,
but this is not a reason to require full EH tables in all cases; this
particular debugger feature is well understood by users to be lossy.

--paulr

** Unwinding **

AFAIK, there are three ways to unwind the stack:

1. Checking the return information on the stack (LR). This might be
optimized away, so debuggers and profilers can't rely on it. Binaries
with no debug or EH information at all have to resort to this when
back-tracing during a segfault or similar.

2. Dwarf unwinding. This is a carefully constructed stack unwinding
Dwarf information for each frame that debuggers and profilers can use
to gather precise information on the stack, typically stored on
.eh_frame sections.

3. Itanium-compatible exception handling. This is another format of
unwind tables using personality routines and similar unwinding logic
to Dwarf CFI directives, also on .eh_frame. This is why you can use
CFI to build the EH tables, and why debuggers can also use this table
to unwind the stack.

I think this is just 2. It uses .eh_frame for unwinding proper. The
only difference in .eh_frame is that there is a personality function
defined.

** Debug & Profiling **

In debug/profile mode (-g, -pg), none of the optimizations that prune
unwind information should be allowed to run.

No. The -g option should never change the set of optimizations that
are run. We can add a -Og that means only debug info friendly
optimizations, but it should be independent.

A function without any of the attributes below should emit *no* tables at all.

The remaining possibilities are:

* uwtable
  - Generated only when -g or -pg are specified

No. See above note about -g.

  - Option -fno-unwind-tables loses meaning (unless nounwind has meaning)
  - Generate full EH/Debug tables on all archs

* ehtable
  - Generated for EH only (front-end/arch/lang specific)
  - Could be forced enabled/disabled via -feh-tables/-fno-eh-tables
  - Only emits EH directives on ARM, full debug on others, nothing on SjLj

So, I am not really familiar with how we do exception handling, so my
only opinion is about the x86-64 ABI and LTOing files compiled with
and without -fno-asynchronous-unwind-tables. Having a table entry is
mandated by the ABI, and -fno-asynchronous-unwind-tables is a
non-ABI-conformant option for a user who really knows what they are
doing.

What we do today then is that on x86-64 "clang -S" adds uwtable to all
functions and "clang -S -fno-asynchronous-unwind-tables" doesn't. The
net result is that we get tables for the same functions with and
without LTO. I would really like to keep this property and this simple
logic in the x86 backend.

From previous discussions it seems that the idea was for other
backends to read it as "make it possible to unwind past this
function". If that is not too specific, we could add other attributes,
for example:

* frame-pointer. Added by the FE when building with -fno-omit-frame-pointer.
* arm-unwind-table. The other unwind table format.

Cheers,
Rafael

In debug/profile mode (-g, -pg), none of the optimizations that prune
unwind information should be allowed to run.

Sigh... Sorry folks, I made the same mistake *again*. I don't mean to
change how passes are run, just to reduce the amount of disturbance in
the debug information. I should have said something like: "When -g is
set, changes to the unwind information should be preserved more
energetically". However, I'm only following Logan's post's ideas, I
don't know how much of that is actually possible.

And if
the program itself will not do unwinding, it's okay for the debug
unwind info to be imperfect.

Yes, but it has to have at least one way of implying the location of
the previous frame, which was the point of my comment.

Um, does your concept of "unwind" include "debugger displays a backtrace"?

Yes.

A debugger likes to be able to display the chain of subprogram activations
that led to the current stopping point (backtrace), but this is different
from manipulating the process state to imitate the effect of a sequence of
subprogram de-activations (unwind).
A debugger also likes to be able to virtually present the process state
during some not-the-most-recent activation, which is like a virtual unwind,
but this is not a reason to require full EH tables in all cases; this
particular debugger feature is well understood by users to be lossy.

The unwind information in Dwarf / ELF is on the same place EH unwind
info is: .eh_frame. The only difference I know is that EH needs the
personality routine to find the frame that is catching the exception,
while debuggers and profilers only do forced unwind, which only need
to know the position of the previous frame, as well as the values of
the stack variables on those frames (or in which registers they were,
etc).

I don't know about the Win64 EH style, but ARM EHABI is not that different!

The main differences are:

1. ARM specific unwind directives that GNU implemented almost
verbatim, and is what GAS uses to create EH unwind tables on ARM.

2. Short tables, which is a condensed set of stack unwind instructions
(1-2 bytes long each) on the word that the pointer to the code in
Dwarf would be.

None of which actually demand any difference in the IR level. The fact
that GNU implemented the ARM directives is also not a reason why we
can't use Dwarf CFI directives to encode EHABI logic in the ASM or obj
outputs. In a way, we should be moving to use CFI-only for all
back-ends.

So, in essence, EH tables and debug tables are the same thing. The
decision to create a new function attribute is mainly to control how
we emit CantUnwind information.

cheers,
--renato

I think this is just 2. It uses .eh_frame for unwinding proper. The
only difference in .eh_frame is that there is a personality function
defined.

If there is no debug information, it should still be possible to
unwind the stack via the saved LR on the stack, no?

If there is only line info, you could even print the function names
with just the LR.

But yes, Debug and EH unwinding should be identical (modulo the PR).

No. The -g option should never change the set of optimizations that
are run.

Bad wording. I meant fix the edge cases Logan reported on the LR
removal, which might have some effect (bigger frames by one word), but
that's discussion for another thread.

* uwtable
  - Generated only when -g or -pg are specified

No. See above note about -g.

I don't see how not having unwinding information would change the
binary execution.

What we do today then is that on x86-64 "clang -S" adds uwtable to all
functions and "clang -S -fno-asynchronous-unwind-tables" doesn't.

This is remarkably similar to the behaviour I want to create. But that
can't be encoded in IR right now wrt. nothrow.

These are the options:

1. no attr: don't emit tables
2. nounwind: emit CantUnwind
3. uwtable: emit table
4. uwtable + nounwind: emit table

This is because uwtable means *also* debug/profiler use, and emitting
CantUnwind could stop them from unwinding, since there is no
information on how to continue unwinding the stack.
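
For comparison, the four options just listed can be written as a tiny lookup. This is only a sketch of the list above in Python, not the actual backend logic; the function name current_emission and the returned strings are made up for illustration.

```python
def current_emission(attrs):
    """Model of the four options listed above (illustrative only)."""
    attrs = set(attrs)
    # uwtable always wins: the table is also used by debuggers and
    # profilers, so CantUnwind is never emitted when it is present.
    if "uwtable" in attrs:
        return "table"
    # nounwind on its own is the only case that yields CantUnwind.
    if "nounwind" in attrs:
        return "cantunwind"
    return "none"
```

In this model there is no combination of attributes that yields both a table request and CantUnwind, which is the gap the ehtable attribute is meant to fill.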

The semantics I want is to be able to separate between EH unwinding
and Debug/Profiler unwinding (even though they map to the same
physical tables), so that we DO emit CantUnwind in those cases.

* arm-unwind-table. The other unwind table format.

No. This is not an ARM specific issue. The table format is back-end
specific, and the IR has no business in interfering with it.

cheers,
--renato

Hi,

I wrote that article because I encountered an issue[1] when I was throwing and catching exceptions in a C++ program (which is not related to -g, -pg, or -Og). I am not familiar with the debug info issue, so I have no comments on the impact on the unwind table and debugging information.

I think we can focus on this:

We would like to add a flag to Clang (or reuse -fno-unwind-tables) so that the user can disable the generation of the unwind table for each function.

To address this issue, we have to consider several aspects:

  1. What is the meaning of -funwind-tables?

  2. What LLVM assembly will be generated by Clang? We have to compare the difference between C and C++ programs.

  3. What are the meanings of the nounwind and uwtable attributes? What do they guarantee? How do they work?

  4. Possible solution.

Let’s discuss these aspects one-by-one.

  1. Meaning of -funwind-tables

Logan,

Based on the current behaviour, you only need one flag: nounwind,
which should only be emitted if -fno-unwind-tables is chosen AND the
function can't unwind.

Also, do not assume that EHABI behaviour in LLVM is currently correct,
especially related to uwtable and nounwind. Those were made with the
x86_64's ABI in mind, and have only interoperated seriously with
DwarfCFIException until very recently.

There has to be a way to disable unwind tables, so either the "no
attribute" behaviour above is wrong or we need a new attribute
"noehtable".

There has to be a way to emit CantUnwind, so if the behaviour above is
right, the "uwtable" attribute is only related to forced unwind
(debug, profiler), not exception handling.

There has to be a way to map "throw()" into IR, and "nounwind" seems
to be the one to use. The fact that CantUnwind is only emitted without
"uwtable" reinforces the idea that "uwtable" is forced unwind.

cheers,
--renato

Hi Renato,

Based on the current behaviour, you only need one flag: nounwind,
which should only be emitted if -fno-unwind-tables is chosen AND the
function can’t unwind.

I don’t quite understand what you mean here.

I was trying to say: due to the design of .ARM.exidx, it is not possible to emit both the cantunwind directive and the stack unwinding information at the same time (they share the same word). Thus, if -funwind-tables is given, we should ignore the nounwind attribute; otherwise, forced unwind simply won’t work. That’s the reason why I said that if the function has the uwtable attribute, we should not emit the cantunwind directive.
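
To illustrate why the two cannot coexist, here is a minimal Python sketch of how the second word of an .ARM.exidx entry is interpreted, as I understand the EHABI: the value 0x1 means EXIDX_CANTUNWIND, a word with bit 31 set holds inline unwind instructions, and anything else is a prel31 offset to an entry in .ARM.extab. The function name classify_exidx_word and the returned strings are made up for illustration.

```python
EXIDX_CANTUNWIND = 0x1  # special value defined by the ARM EHABI


def classify_exidx_word(word):
    """Classify the second 32-bit word of an .ARM.exidx entry."""
    word &= 0xFFFFFFFF
    # The single word is either the CantUnwind marker...
    if word == EXIDX_CANTUNWIND:
        return "cantunwind"
    # ...or inline (compact) unwind instructions, flagged by bit 31...
    if word & 0x80000000:
        return "inline unwind instructions"
    # ...or a prel31 offset to the full table entry in .ARM.extab.
    return "offset to .ARM.extab entry"
```

Since one word can only hold one of these three meanings, a function marked CantUnwind has no room left for unwind instructions, inline or otherwise.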

There has to be a way to disable unwind tables, so either the “no
attribute” behaviour above is wrong or we need a new attribute
“noehtable”.

IMO, a noehtable attribute won’t solve the issue. In my article, I was using ehtable to guarantee the information needed to throw and catch the exception, but you are using it in the converse way. In fact, uwtable and ehtable mean almost the SAME table in ARM EHABI (except for the LSDA).

Furthermore, no exception handling table (LSDA) will be generated for functions without the landingpad instruction, even if -funwind-tables is given. IMO, adding a noehtable attribute won’t achieve your goal of removing the unnecessary unwind table.

Also, do not assume that EHABI behaviour in LLVM is currently correct,
especially related to uwtable and nounwind. Those were made with the
x86_64’s ABI in mind, and have only interoperated seriously with
DwarfCFIException until very recently.

If you don’t care about backward compatibility at the LLVM assembly level at all, then the simplest solution is to decide whether to generate the unwind table by the existence of the uwtable function attribute. The result should be:

  • no attribute => no table generated

  • with nounwind attribute => no table generated

  • with uwtable attribute => generate unwind table (with LSDA)

  • with uwtable+nounwind attribute => generate unwind table (without cantunwind)

This combination will work (including the interleaving of C and C++ programs). There will be only a small difference when the function really throws an exception and the front-end did not generate the landingpad instruction, since clang will transform “throw ()” or “noexcept(true)” to:

define void @_Z3foov() nounwind {
  invoke void @_Z9may_throwv() to label %1 unwind label %2

; <label>:1
  ret void

; <label>:2
  %3 = landingpad { i8*, i32 } personality i8* bitcast (i32 (...)* @__gxx_personality_v0 to i8*)
      filter [0 x i8*] zeroinitializer
  %4 = extractvalue { i8*, i32 } %3, 0
  tail call void @__cxa_call_unexpected(i8* %4)
  unreachable
}

The landingpad instruction will be emitted to catch every exception and call __cxa_call_unexpected() to stop the unwinding procedure.

BUT, this means that all of the front-ends would have to be updated if they were not emitting the uwtable attribute. I am afraid that this won’t be a viable solution.

Best regards,

Logan

Footnote

[1] To be precise, one additional word will be emitted to mark the absence of the LSDA, but this word should be emitted anyway.

Based on the current behaviour, you only need one flag: nounwind,
which should only be emitted if -fno-unwind-tables is chosen AND the
function can't unwind.

I don't quite understand what you mean here.

Current behaviour can be simplified by:

* just nounwind: emit can't unwind
* anything else: emit unwind tables

Since you have only two states, a single boolean flag is enough to
represent them.

That's the reason why I said that if the
function has uwtable attribute, then we should not emit cantunwind
directive.

This is not about the existence of different tables (thus why I
rejected Rafael's -arm-eh-tables idea), but the fact that we should
emit cant-unwind when we have throw() in C++ and we won't need forced
unwind, such as release binaries with exception handling.

If you don't care about the backward compatibility at LLVM assembly level at
all, then the simplest solution is to determine whether we should generate
unwind table by the existence of the uwtable function attribute.

I never said I don't care about backward compatibility, what I said is
that what's there now is not necessarily correct. This is the
behaviour that the x86_64 ABI demands, but not the ARM ABI, and we
can't implement the ARM ABI as if it was the x86_64 one.

- no attribute => no table generated
- with nounwind attribute => no table generated

Wouldn't the lack of table in a function just break the EH unwind? It
might be an implementation detail, but I wouldn't be surprised if the
personality routine would just bail and crash the program if it
couldn't find the previous frame.

I.e. I don't think "no tables" == "can't unwind table".

BUT, this means that all of the front-ends should be updated if they were not
emitting uwtable attribute. I am afraid that this won't be a viable
solution.

The whole point was to change the semantics of the function attributes
uwtable and nounwind.

If this is out of the question, then this whole discussion is moot and
we should just force the x86_64 ABI on all EH targets, ie. always emit
the tables regardless of the attributes. And this is exactly how it is
now, so there's nothing to change.

cheers,
--renato

Hi Renato,

Current behaviour can be simplified by…

I am slightly confused by your goal. What are you trying to achieve? It will be good to fill in the following table so that we can have some common basis for discussion.

  • no attribute => _____

  • with nounwind attribute => _____

  • with uwtable attribute => _____

  • with uwtable+nounwind attribute => _____

  • just nounwind: emit can’t unwind
  • anything else: emit unwind tables

From your reply, it seems to me that you are suggesting the following mapping (correct me if I am wrong):

  • no attribute => emit unwind table
  • with nounwind attribute => emit cant unwind
  • with uwtable attribute => emit unwind table
  • with uwtable+nounwind attribute => emit unwind table

But how does this differ from the existing behavior?

That’s the reason why I said that if the
function has uwtable attribute, then we should not emit cantunwind
directive.

This is not about the existence of different tables (thus why I
rejected Rafael’s -arm-eh-tables idea), but the fact that we should
emit cant-unwind when we have throw() in C++ and we won’t need forced
unwind, such as release binaries with exception handling.

IIRC, the C++ “throw ()” does not imply “we won’t need forced unwind”. Is there any document (or existing implementation) mandating this? Usually, Clang stops the stack unwinder by catching all exceptions with the landingpad instruction, which is not directly related to the nounwind attribute. Thus, IMO, there won’t be any problem with ignoring cantunwind in the “uwtable+nounwind” case.

Furthermore, it would be incorrect to emit “cantunwind” in the “uwtable+nounwind” case, since all C functions get the nounwind attribute by default. To get the program to work with exceptions, we add the -funwind-tables option to Clang, which emits uwtable as well. If you emit cantunwind in the “uwtable+nounwind” case, then the exception unwinder will always stop at the C function.

I never said I don’t care about backward compatibility, what I said is
that what’s there now is not necessarily correct. This is the
behaviour that the x86_64 ABI demands, but not the ARM ABI, and we
can’t implement the ARM ABI as if it was the x86_64 one.

Sorry for my wording. I was only trying to propose a possible solution which may break backward compatibility. I should have used a sentence like “If backward compatibility is not a main concern, …”. I apologize if you felt offended.

  • no attribute => no table generated
  • with nounwind attribute => no table generated
    Wouldn’t the lack of table in a function just break the EH unwind? It
    might be an implementation detail, but I wouldn’t be surprised if the
    personality routine would just bail and crash the program if it
    couldn’t find the previous frame.
    Ie. I don’t think “no tables” == “can’t unwind table”.

IIRC, they will be the same in the libgcc implementation, and it seems to be specified in EHABI, although I can’t find the source code at the moment.

Sincerely,
Logan