Here is our proposal to extend/enhance the x86-64 compact unwind
descriptors to fully describe the prologue/epilogue for asynchronous
unwinding. I believe there are missing/lacking CFI directives as well,
but I'll save that for another thread.
Asynchronous Compact Unwind Descriptors
Ron Brender, VMS Software, Inc.
Revised January 25, 2018
1 Introduction
This document proposes means to extend so-called compact unwind
descriptors to support fully asynchronous exception handling. This will
make extended compact unwind descriptors an alternative to DWARF CFI
(call frame information) for achieving asynchronous exception support.

Compact unwind descriptors can be, and have been, used in both 64- and
32-bit environments. However, this proposal addresses only 64-bit
environments. While the ideas presented here can be readily adapted for
use in a 32-bit environment, for simplicity this document makes no
attempt to do so.
There are generally three kinds of information that together form the
heart of modern software exception handling systems:
1. Information that is used to divide the remaining unwind information
   into groups that are specific to particular regions of memory (often,
   but not necessarily, associated with a single function), and to
   provide a way to efficiently search for and identify the group that
   is associated with a particular address in memory.
2. Information that can be used to virtually or actually unwind from
   the call frame of an executing function to the call frame of its
   caller (at the point of the call).
3. Identification of an associated personality routine that is invoked
   by the general exception handling mechanism to guide how processing
   should proceed for a given function, as well as additional “language
   specific data” needed for the personality routine to do its job.
   Note that the personality routine and its related data are specified
   as an adjunct to compiled code and are totally opaque to the general
   mechanism (other than the specified interface).
Note in particular that C++ exception handling is built on top of a
personality routine and language specific data area ABI that itself can
be implemented using either DWARF CFI or extended compact unwind
information as described here. The choice between the two is
transparent to C++.
2 Compact Unwind Overview
This section provides a brief overview of key features of the LLVM
compact unwind design. It does not attempt a comprehensive re-statement
of all aspects of the design except to the extent necessary to motivate
and understand later proposed changes and enhancements.
2.1 Compact Unwind Group Description
A compact unwind group consists of five fields, as follows:
63                             32 31                             0 |
-------------------------------------------------------------------|
                         STARTING-ADDRESS                          |
             LENGTH              |   COMPACT-UNWIND-DESCRIPTION    |
                   PERSONALITY-FUNCTION-POINTER                    |
                  LANGUAGE-SPECIFIC-DATA-ADDRESS                   |
-------------------------------------------------------------------|
STARTING-ADDRESS (64 bits) is the lowest address of a region of memory
occupied by some code, typically the entry point of a function.
LENGTH (32 bits) is the number of bytes included in this group,
typically including all and only the code of a function.
COMPACT-UNWIND-DESCRIPTION (32 bits) is a description of the fully
formed frame of a function and how to unwind it. This is described
further in Section 2.2.
PERSONALITY-FUNCTION-POINTER (64 bits) is a pointer to the personality
routine.
LANGUAGE-SPECIFIC-DATA-ADDRESS (64 bits, sometimes abbreviated LSDA) is
a pointer to some data to be passed to the personality routine when it
is called.
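
As a rough illustration of the five fields, the group can be pictured as
a C structure. This is only a sketch: the type and field names are
invented here, and placing LENGTH before COMPACT-UNWIND-DESCRIPTION in
memory simply follows the order in which the fields are listed above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical in-memory view of one compact unwind group.
   Names are illustrative; field sizes follow the text above. */
struct unwind_group {
    uint64_t starting_address;  /* lowest code address covered */
    uint32_t length;            /* number of code bytes covered */
    uint32_t compact_unwind;    /* 32-bit frame description */
    uint64_t personality;       /* pointer to the personality routine */
    uint64_t lsda;              /* language-specific data address */
};

/* Returns 1 if pc lies in [starting_address, starting_address + length). */
static int group_contains(const struct unwind_group *g, uint64_t pc)
{
    assert(g != 0);
    return pc >= g->starting_address &&
           pc - g->starting_address < g->length;
}
```

With these field sizes the structure occupies 32 bytes on a typical
x86-64 ABI, with no padding.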
A key observation is that describing a group by its starting address
plus a length means that the set of groups for a compilation unit need
not describe all of the code in that unit. In particular, it appears to
be expected that no unwind information need be generated for leaf
functions.

On the other hand, it is reasonable to expect that the groups that are
emitted are ordered by the starting address. This means that a simple
and fast binary search can be used to map an address to the group that
applies to that address, if any.
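
The lookup described above can be sketched as follows. The structure
and function names are hypothetical, and the sketch assumes the groups
are sorted by ascending starting address and do not overlap. An address
that falls in a gap (for example, inside a leaf function with no unwind
information) yields -1.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical minimal view of a group: only what the search needs. */
struct group_range {
    uint64_t start;   /* STARTING-ADDRESS */
    uint32_t length;  /* LENGTH */
};

/* Binary search over groups sorted by ascending start address.
   Returns the index of the group containing pc, or -1 if none does. */
static long find_group(const struct group_range *groups, size_t count,
                       uint64_t pc)
{
    size_t lo = 0, hi = count;
    assert(groups != NULL || count == 0);
    /* Find the first group whose start is greater than pc. */
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (groups[mid].start <= pc)
            lo = mid + 1;
        else
            hi = mid;
    }
    if (lo == 0)
        return -1;  /* pc precedes every group */
    if (pc - groups[lo - 1].start >= groups[lo - 1].length)
        return -1;  /* pc lies in the gap after the candidate group */
    return (long)(lo - 1);
}
```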
It is useful to note that the run-time representation of unwind
information can vary from little more than a simple concatenation of
the compile-time information to a substantial rewriting of unwind
information by the linker. This proposal favors simple concatenation
while maintaining the same ordering of groups as their associated code.
2.2 Compact Unwind Frame Description
A compact unwind frame description describes a frame in sufficient
detail to be able to unwind that frame to the frame of its caller.
31      28 27      24 23                                         0 |
-------------------------------------------------------------------|
  FLAGS   |   MODE   |                                             |
-------------------------------------------------------------------|
At the topmost level, there are four flag bits that are not of further
interest here. The interpretation of these bits is neither used nor
changed by this proposal.

Also at the top level is a 4-bit mode field. This is the tag of a
discriminated (tagged, variant) union that selects the interpretation
of the remaining 24 bits.
Of the 16 possible modes, only 5 are defined:
Code  Meaning                Description
0     Old                    “Old” is presumed to refer to some
                             historical design that is no longer of
                             interest. It is treated here as Reserved.
1     RBP-based frame        The frame uses the RBP register as a frame
                             pointer. The size of the frame can vary
                             during execution.
2     RSP-based frame        The frame uses RSP as the frame pointer.
                             The size of the frame is fixed (at
                             compilation time).
3     Large RSP-based frame  The frame uses RSP as the frame pointer.
                             The size of the frame is fixed (at
                             compilation time); however, that size is
                             too large to express within this 32-bit
                             descriptor encoding.
4     DWARF escape           The frame, for whatever reason, cannot be
                             adequately described using the compact
                             unwind frame description. The remaining
                             24 bits are an index into what the DWARF
                             standard calls the .debug_frame section
                             (__eh_frame in LLVM).
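
To make the top-level layout concrete, here is a minimal sketch of
splitting a 32-bit description into its FLAGS, MODE and 24-bit
mode-specific parts. The enumerator and function names are invented for
illustration; only the bit positions and mode codes come from the
description above.

```c
#include <assert.h>
#include <stdint.h>

/* Mode codes from the table above; the names are illustrative. */
enum unwind_mode {
    MODE_OLD = 0,
    MODE_RBP_FRAME = 1,
    MODE_RSP_FRAME = 2,
    MODE_LARGE_RSP_FRAME = 3,
    MODE_DWARF_ESCAPE = 4
};

/* Bits 31..28: flags, not interpreted further here. */
static unsigned desc_flags(uint32_t desc) { return (desc >> 28) & 0xFu; }

/* Bits 27..24: the tag selecting how the low 24 bits are read. */
static unsigned desc_mode(uint32_t desc) { return (desc >> 24) & 0xFu; }

/* Bits 23..0: the mode-specific payload. */
static uint32_t desc_payload(uint32_t desc) { return desc & 0x00FFFFFFu; }

/* Only codes 0 through 4 are defined; the rest are unassigned. */
static int mode_is_defined(unsigned mode)
{
    assert(mode <= 15);
    return mode <= MODE_DWARF_ESCAPE;
}
```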
2.2.1 RBP-based Frame (MODE=1)
For an RBP-based frame, the remaining 24 bits are encoded as follows:
23           16 | 15 | 14                                        0 |
-------------------------------------------------------------------|
     OFFSET     |  0 |                    REGS                     |
-------------------------------------------------------------------|
In an RBP-based frame the RBP register is pushed on the stack
immediately after the return address, then RSP is moved to RBP. To
unwind, RSP is restored with the current RBP value, then RBP is
restored by popping off the stack, and the return is done by popping
the stack once more into the instruction pointer.

All preserved registers are saved in a small range of the stack that
extends from RBP-8 to RBP-2040. The offset/8 relative to RBP is encoded
in the 8-bit OFFSET field. The registers saved are encoded in the
15-bit REGS field as five 3-bit entries.
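
The following sketch decodes the OFFSET and REGS fields and replays the
unwind steps described above against a small simulated stack. All names
are hypothetical. Reading entry 0 of REGS from the least-significant
bits is an assumption of this sketch, since the text above does not
state the entry order, and restoring the saved registers themselves is
omitted.

```c
#include <assert.h>
#include <stdint.h>

/* Byte distance below RBP of the register save area (OFFSET * 8). */
static unsigned rbp_save_area_offset(uint32_t desc)
{
    return ((desc >> 16) & 0xFFu) * 8u;
}

/* i-th 3-bit register entry, 0 <= i < 5. Taking entry 0 from the
   least-significant bits is an assumption of this sketch. */
static unsigned rbp_reg_entry(uint32_t desc, unsigned i)
{
    assert(i < 5);
    return (desc >> (3u * i)) & 0x7u;
}

struct frame_regs { uint64_t rsp, rbp, rip; };

/* Simulated memory: 64-bit words starting at address `base`. */
static uint64_t load64(const uint64_t *mem, uint64_t base, uint64_t addr)
{
    assert(addr >= base && (addr - base) % 8 == 0);
    return mem[(addr - base) / 8];
}

/* Unwind one fully formed RBP-based frame, following the steps above. */
static struct frame_regs unwind_rbp_frame(const uint64_t *mem,
                                          uint64_t base,
                                          struct frame_regs cur)
{
    struct frame_regs caller;
    uint64_t sp = cur.rbp;               /* RSP is restored from RBP */
    caller.rbp = load64(mem, base, sp);  /* pop the saved RBP */
    sp += 8;
    caller.rip = load64(mem, base, sp);  /* pop the return address */
    sp += 8;
    caller.rsp = sp;                     /* caller's RSP after the return */
    return caller;
}
```

Note that this only works once the frame is fully formed, which is
exactly the limitation that motivates the asynchronous extensions.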
2.2.2 RSP-Based Frame (MODE=2)
For an RSP-based frame, the remaining 24 bits are encoded as follows:
23           16 | 15  13 | 12  10 | 9                            0 |
-------------------------------------------------------------------|
      SIZE      |        |  CNT   |            REG_PERM            |
-------------------------------------------------------------------|
In an RSP-based frame the stack pointer serves directly as the frame
pointer and RBP is available for use as a general register. Upon entry,
the stack pointer is decremented by 8*SIZE bytes (the maximum stack
allocation is thus 2040 bytes). To unwind, the stack size is added to
the stack pointer, and the unwind is completed by popping the stack
once more into the instruction pointer.

All preserved registers are saved on the stack immediately after the
return address. The number of registers saved (up to 6) is encoded in
the 3-bit CNT field. The 10-bit REG_PERM field encodes which registers
were saved and in what order.
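
A minimal sketch of extracting the RSP-based fields follows. The names
are invented for illustration, the location of the return address is
computed exactly as the text above describes it (RSP plus the stack
size), and REG_PERM is returned undecoded because its permutation
encoding is not described here.

```c
#include <assert.h>
#include <stdint.h>

/* Decoded view of a MODE=2 payload; names are illustrative. */
struct rsp_frame {
    unsigned size_bytes;  /* 8 * SIZE, at most 2040 */
    unsigned reg_count;   /* CNT */
    unsigned reg_perm;    /* REG_PERM, left undecoded in this sketch */
};

static struct rsp_frame decode_rsp_frame(uint32_t desc)
{
    struct rsp_frame f;
    f.size_bytes = ((desc >> 16) & 0xFFu) * 8u;  /* bits 23..16 */
    f.reg_count  = (desc >> 10) & 0x7u;          /* bits 12..10 */
    f.reg_perm   = desc & 0x3FFu;                /* bits 9..0 */
    assert(f.reg_count <= 6);  /* the text allows at most 6 registers */
    return f;
}

/* Address holding the return address, per the description above:
   the stack size is added to RSP, then one more pop yields the RIP. */
static uint64_t rsp_frame_return_slot(uint64_t rsp,
                                      const struct rsp_frame *f)
{
    return rsp + f->size_bytes;
}
```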
2.2.3 Large RSP-Based Frame (MODE=3)
For a large RSP-based frame, the remaining 24 bits are encoded as follows:
23           16 | 15  13 | 12  10 | 9                            0 |
-------------------------------------------------------------------|
      SIZE      |  ADJ   |  CNT   |            REG_PERM            |
-------------------------------------------------------------------|
This case is like the previous, except the stack size is too large to
encode in the compact unwind encoding. Instead, the function must
include a "subq $nnnnnnnn, RSP" instruction in its prologue to allocate
the stack. The offset from the entry point of the function to the
nnnnnnnn value in the function is given in the SIZE field.

Depending on the exact instructions used to save registers (PUSH versus
MOV), the nnnnnnnn value in the instruction stream may not be quite the
full stack size. ADJ * 8 is the additional adjustment needed to get the
actual size.
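
The size recovery for this mode can be sketched as below. The function
name and the bounds-checking parameter are hypothetical, and the sketch
assumes the nnnnnnnn value is a 32-bit little-endian immediate, as it is
in the usual x86-64 encoding of subq with a 32-bit immediate.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Recover the stack size of a MODE=3 frame from the function's code.
   func points at the function entry; func_len bounds the read. */
static uint32_t large_rsp_stack_size(const uint8_t *func, size_t func_len,
                                     uint32_t desc)
{
    uint32_t off = (desc >> 16) & 0xFFu;  /* SIZE: offset of nnnnnnnn */
    uint32_t adj = (desc >> 13) & 0x7u;   /* ADJ: extra 8-byte units */
    uint32_t imm;
    assert(func != NULL && (size_t)off + 4 <= func_len);
    /* Read the 32-bit little-endian immediate from the instruction. */
    imm = (uint32_t)func[off]
        | ((uint32_t)func[off + 1] << 8)
        | ((uint32_t)func[off + 2] << 16)
        | ((uint32_t)func[off + 3] << 24);
    return imm + adj * 8u;
}
```

For example, if the prologue begins with the bytes 48 81 EC 00 10 00 00
(subq $0x1000, %rsp), the immediate starts at offset 3, so SIZE is 3;
with ADJ equal to 2 the recovered stack size is 0x1000 + 16 bytes.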
2.2.4 DWARF Escape (MODE=4)
The frame, for whatever reason, cannot be adequately described using a
compact unwind frame description. The remaining 24 bits are an index
into what the DWARF standard calls the .debug_frame section (called
__eh_frame in LLVM).
3 Asynchronous Changes and Enhancements
It is immediately obvious that omission of unwind information for leaf
functions (with any kind of frame) precludes handling an exception that
might occur during their execution. It follows that unwind information
must cover all of the code of a module (with one exception discussed
below). But if successive unwind groups are ordered (as previously
assumed) and also leave no gaps, then the LENGTH field is redundant and
can be omitted. The beginning address of a following group is always
one byte past the end of the predecessor group. There remains only the
question of how to identify the last group of a set.

It should also be clear that the unwind representation described in the
prior section is not sufficient to unwind from an asynchronous
exception that might occur in either the prologue or epilogue of a
function. To see this, consider what would happen if an exception
occurred at either the entry point or the return instruction of either
an RBP- or RSP-frame function. To be able to handle asynchronous
exceptions at any point during function execution, it is necessary to
add additional information to each unwind group.

These two considerations can be combined. The result is simply to
repurpose the LENGTH field to encode prologue and epilogue information.
3.1 Extended MODEs
To preserve backward compatibility and to allow intermixing of
traditional and extended compact unwind groups, new MODEs are defined
as follows:
Code Meaning Description