LLDB’s expression evaluator uses Clang to compile user-provided C++ expressions. It does so by reconstructing an AST from DWARF and passing it to Clang for parsing and codegen. The codegen phase, particularly when we lower the AST into LLVM types (in CGRecordLowering
), needs to know about type layouts. Clang allows an ExternalASTSource
to provide the layout for a type (LLDB implements this interface in ClangASTImporter::LayoutRecordType
). This suffices for a large number of cases but occasionally Clang still relies on the structure of the AST to derive layout information. Clang is sensitive to incorrect layout calculations, and violating any invariants in this phase manifest in assertions and crashes. Because LLDB doesn’t always accurately reconstruct ASTs from DWARF, there are currently several C++ idioms that cause LLDB to crash in Clang’s AST lowering phase. This post focuses on the known failure modes when debugging code with [[no_unique_address]]
(NUA) annotations, discusses the solutions previously attempted and outlines a way forward.
This issue has been brought up multiple times over the years [1][2] but hasn’t been resolved yet. Now this topic is being revived because it’s going to be a blocker for [libc++] Replace __compressed_pair
with [[no_unique_address]]
.
Let me know if you have any questions/suggestions/concerns!
Failure Modes
Scenario 1: basic [[no_unique_address]] usage
Details
This is a reduced example of the crashes we would see in the LLDB test-suite after applying [libc++] Replace __compressed_pair
with [[no_unique_address]]
. It’s the very basic use-case for NUA.
struct Empty {};
struct Foo {
int* m_begin = 0;
[[no_unique_address]] Empty m_alloc;
};
Clang produces following DWARF for this:
DW_TAG_structure_type
DW_AT_name ("Foo")
DW_AT_byte_size (0x08)
DW_TAG_member
DW_AT_name ("m_begin")
DW_AT_type (0x0000000000000087 "int *")
DW_AT_data_member_location (0x00)
DW_TAG_member
DW_AT_name ("m_alloc")
DW_AT_type (0x0000000000000093 "Empty")
DW_AT_data_member_location (0x00)
Observe how m_begin
and m_alloc
were folded into the same DW_AT_member_location
.
This triggers following assertion when debugging with LLDB:
(lldb) run
Process 45607 launched: 'nua-libcxx-repro.out' (arm64)
Process 45607 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 1.1
frame #0: 0x0000000100003f4c nua-libcxx-repro.out`main at libcxx-repro.cpp:16:5
13
14 int main() {
15 Foo f;
-> 16 return 0;
17 }
(lldb) expr f
Assertion failed: (Prior->FD->hasAttr<NoUniqueAddressAttr>() && "should not have reused this field's tail padding"), function clipTailPadding, file CGRecordLayoutBuilder.cpp, line 760.
Here we are at the point in clipTailPadding
after we’ve placed m_begin
at offset 0 and expect the next member to be placed at an offset of 8 bytes. But Clang placed it at 0, which can only happen with a NoUniqueAddressAttr
on the field, which we don’t have in LLDB’s AST.
Scenario 2: [[no_unique_address]] for empty base-class optimisation
Details
struct Empty {};
struct A
{
long c,d;
};
struct B
{
[[no_unique_address]] Empty x;
};
struct C
{
[[no_unique_address]] Empty x;
};
struct Foo : B, A, C {};
This generates following DWARF:
DW_TAG_structure_type
DW_AT_name ("A")
DW_AT_byte_size (0x10)
DW_TAG_member
DW_AT_name ("c")
DW_AT_type (0x0000000000000069 "long")
DW_AT_data_member_location (0x00)
DW_TAG_member
DW_AT_name ("d")
DW_AT_type (0x0000000000000069 "long")
DW_AT_data_member_location (0x08)
NULL
DW_TAG_structure_type
DW_AT_name ("B")
DW_AT_byte_size (0x01)
DW_TAG_member
DW_AT_name ("x")
DW_AT_type (0x000000000000009b "Empty")
DW_AT_data_member_location (0x00)
NULL
DW_TAG_structure_type
DW_AT_name ("C")
DW_AT_byte_size (0x01)
DW_TAG_member
DW_AT_name ("x")
DW_AT_type (0x000000000000009b "Empty")
DW_AT_data_member_location (0x00)
NULL
DW_TAG_structure_type
DW_AT_name ("Empty")
DW_AT_byte_size (0x01)
DW_TAG_structure_type
DW_AT_name ("Foo")
DW_AT_byte_size (0x10)
DW_TAG_inheritance
DW_AT_type (0x0000000000000085 "B")
DW_AT_data_member_location (0x00)
DW_TAG_inheritance
DW_AT_type (0x0000000000000047 "A")
DW_AT_data_member_location (0x00)
DW_TAG_inheritance
DW_AT_type (0x00000000000000b9 "C")
DW_AT_data_member_location (0x10)
NULL
Few things to note:
- The
DW_AT_byte_size
ofB
andC
are0x1
. I.e., the fact their use may change the size to zero when used as bases is not reflected here (this “maybe zero” property of aRecordDecl
is query-able asisZeroSize
in the AST) - The
DW_TAG_inheritance
members forA
andB
ofFoo
share aDW_AT_data_member_location
, here0x0
, andC
is placed at the0x10
, i.e., aftersizeof(A)
. This makes sense since[[no_unique_address]]
will only allow overlap of distinct types (i.e., if we changeC
to contain a different empty type, all ofA
,B
,C
would fold into a single location. However, this placement depends on the order of the inheritance. If we declareFoo
as:
struct Foo : B, C, A {};
…we get following DWARF instead:
DW_TAG_inheritance
DW_AT_type (0x0000000000000085 "B")
DW_AT_data_member_location (0x00)
DW_TAG_inheritance
DW_AT_type (0x00000000000000b9 "C")
DW_AT_data_member_location (0x01)
DW_TAG_inheritance
DW_AT_type (0x0000000000000047 "A")
DW_AT_data_member_location (0x00)
When we then try to lay out this type in LLDB, we run into following clang assertion:
(lldb) run
Process 59690 launched: 'nua-bases.out' (arm64)
Process 59690 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 1.1
frame #0: 0x0000000100003fa0 nua-bases.out`main at nua-bases.cpp:27:5
24 Foo f;
25
26 int main() {
→ 27 return 0;
28 }
29
(lldb) expr f
Assertion failed: (Prior→Kind == MemberInfo::Field && "Only storage fields have tail padding!"), function clipTailPadding, file CGRecordLayoutBuilder.cpp, line 754.
This crash originates all the way in Decl::addedMember
when called on the field x
of B
and C
, where we set isZeroSize to false (because NoUniqueAdress
isn’t attached to x
). This means we set the Empty
bit on the fields definition to false
too. Then in CGRecordLowering::accumulateBases
, we incorrectly add empty base classes to the list of Members we iterateduring the clipTailPadding
phase. This confuses the layout builder when it sees that multiple bases have been folded into the same offset, resulting in the assertion above.
Scenario 3: tail padding re-use
Details
The [[no_unique_address]]
on non-empty fields can also affect codegen. Here’s what cppreference has to say:
If the member is not empty, any tail padding in it may be also reused to store other data members.
Clang doesn’t currently perform such optimization (see GH#50766), but GCC does (see Godbolt). Because LLDB gets used to debug non-Clang binaries, and because Clang may implement this in the future, LLDB should be able to handle this case too.
Take this example:
struct A {
[[no_unique_address]] double x;
char y;
};
struct B {
[[no_unique_address]] A a;
int z;
};
struct C {
A a1;
B b1;
};
A a;
B b;
C c;
static_assert(sizeof(a) == 16);
static_assert(sizeof(b) == 16);
GCC will generate following DWARF:
DW_TAG_structure_type
DW_AT_name ("A")
DW_AT_byte_size (0x10)
DW_TAG_member
DW_AT_name ("x")
DW_AT_type (0x0000000000000051 "double")
DW_AT_data_member_location (0x00)
DW_TAG_member
DW_AT_name ("y")
DW_AT_type (0x0000000000000058 "char")
DW_AT_data_member_location (0x08)
NULL
DW_TAG_structure_type
DW_AT_name ("B")
DW_AT_byte_size (0x10)
DW_TAG_member
DW_AT_name ("a")
DW_AT_type (0x000000000000002d "A")
DW_AT_data_member_location (0x00)
DW_TAG_member
DW_AT_name ("z")
DW_AT_type (0x0000000000000083 "int")
DW_AT_data_member_location (0x0c)
NULL
DW_TAG_structure_type
DW_AT_name ("C")
DW_AT_byte_size (0x20)
DW_TAG_member
DW_AT_name ("a1")
DW_AT_type (0x000000000000002d "A")
DW_AT_data_member_location (0x00)
DW_TAG_member
DW_AT_name ("b1")
DW_AT_type (0x000000000000005f "B")
DW_AT_data_member_location (0x10)
NULL
Observe the following:
- The
DW_AT_byte_size
ofA
andB
are both0x10
. This is because GCC was able to foldz
into 4 bytes ofA
‘s tail padding, placing thez
at0x0c
.
This crashes LLDB as follows:
(lldb) run
Process 78192 launched: 'nua-tail-padding-gcc.out' (arm64)
Process 78192 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 1.1
frame #0: 0x0000000100003fb4 nua-tail-padding-gcc.out`main at nua-tail-padding-gcc.cpp:26:12
23
24 int main() {
25 C c;
→ 26 return 0;
27 }
28
(lldb) expr c
Assertion failed: (Prior→FD→hasAttr<NoUniqueAddressAttr>() && "should not have reused this field's tail padding"), function clipTailPadding, file CGRecordLayoutBuilder.cpp, line 760.
We are at the point in clipTailPadding
where we see that z
was placed at a 12 byte offset, within Tail
, and Clang knows that the only way this can happen is with a NoUniqueAddressAttr
on the field, which we don’t have in LLDB’s AST (this is pretty much the same issue as in Scenario 1 just based on different kinds of offset problems).
Previous Solutions
In my opinion any solution to this should make sure we attach the NoUniqueAddressAttr
to decls that LLDB creates, since we assert on its existence during CGRecordLowering
and it’s currently checked in various places around Clang (and can possible decide more things in the future).
Solution 1: Emit 0 DW_AT_byte_size
for empty [[no_unique_address]]
fields
This was originally raised as a possible solution in D101237 and further discussed on the DWARF mailing list. The idea here is that we can signal to LLDB whether a field is marked NUA by setting its DW_AT_byte_size
to 0. This is allowed by the DWARF spec Section 5.7.6 Data Member Entries
:
If the size of a data member is not the same as the size of the type given for the data member, the data member has either a DW_AT_byte_size or a DW_AT_bit_size attribute whose integer constant value is the amount of storage needed to hold a value of the type.
This would also give us an already standard compliant way to support GCC-produced binaries (assuming GCC implements this).
Implementation
The prototype in D101237 added an isZeroSize
flag to the DIDerivedType
, which is set to the value of FieldDecl::isZeroSize
API. Then during DWARF emission we would emit a DW_AT_byte_size 0x0
if the flag was set.
Drawbacks
This only works for NUA on fields with empty types. If a field’s type is not empty, isZeroSize
doesn’t apply. Meaning this would fix scenarios (1) and (2) but doesn’t handle (3).
Solution 2: Derive attribute during LLDB DWARF parsing
Initially attempted in ⚙ D143347 [lldb][DWARF] Infer no_unique_address attribute. The implementation quickly gets unwieldy. Take the example in Scenario 3 :
struct A {
[[no_unique_address]] double x;
char y;
};
struct B {
[[no_unique_address]] A a;
int z;
};
struct C {
A a1;
B b1;
};
C c;
The order in which LLDB parses c
is:
c
→A
→x
→y
→b1
→B
→a
→z
Here, we only realise that there has been any overlap by the time we parse z
. We could deduce that since z
‘s offset is within sizeof(a)
from a
’s offset, that a
has a [[no_unique_address]]
attribute. But it’s unclear to me how we would backtrack and deduce the existence of [[no_unique_address]]
on x
. But maybe that is enough for LLDB to avoid any critical invariant violations? Are there cases where these types of optimisations kick in without [[no_unique_address]]
?
Solution 3: Introduce a new DWARF attribute
Discussed as a viable alternative in ⚙ D101237 [lldb] Fix [[no_unique_address]] and libstdc++ 11's std::unique_ptr. This would be the most straightforward solution at the cost of increasing debug-info and having to standardise a potentially overly specific attribute. However, there could be room for a more general type of attribute that could aid with following related layout-related LLDB crash: deriving tail-padding re-use from DWARF.
Solution 4: Mark ALL decls LLDB creates as [[no_unique_address]]
Discussed as alternatives in ⚙ D101237 [lldb] Fix [[no_unique_address]] and libstdc++ 11's std::unique_ptr and ⚙ D143347 [lldb][DWARF] Infer no_unique_address attribute, but both times it seemed like a rather drastic solution. Given the attribute is check throughout various parts of Clang and CodeGen this has the potential of introducing hard-to-diagnose problems in the future. In my opinion we should try and keep the LLDB-constructed AST as closely aligned to the source AST as possible.
Prior Art
As far as I can tell, GCC generates the same kind of DWARF that Clang does for no_unique_address
annotations. I haven’t checked how GDB handles the above examples. Since it doesn’t try to reconstruct the AST like LLDB does, it probably isn’t as much of an issue for them. I found one bug report around NUA handling in GDB here, the description of which suggests it tries to handle overlapping of fields when it parses DWARF (which LLDB currently doesn’t).
Summary
LLDB provides Clang with layout information for record types based on the offsets and sizes parsed out of DWARF. In some cases, however, Clang will derive additional layout information from the structure of the AST. This discrepancy can result in layout invariant violations if LLDB doesn’t create the necessary AST constructs. This RFC discussed how we could create [[no_unique_address]]
attributes for ASTs constructed by LLDB.
I see three potential ways forward:
- A mix of Solution 1 and 3. Emit
DW_AT_byte_size 0x0
for fields that are empty. This won’t address all NUA usages, but does address the ones that Clang actually supports. We can add a new attribute for the potential tail padding re-use case, which could help address another non-NUA layout issue. - Implement Solution 3 only.
- Implement Solution 2 only. Fixes this purely in LLDB by deriving the attribute when parsing fields and base classes. As far as I can tell, this would have to be a best-effort algorithm because some cases aren’t derivable purely based on the DWARF offsets. Meaning the LLDB AST wouldn’t necessarily always match what we had in the source.
My preference would be (1) or (2) above, but happy to hear other suggestions/concerns.
(CCing people who might be interested @dblaikie @adrian.prantl @pogo59 @clayborg)