Union initialization, and aliasing (clang 18 seems to miscompile musl?)

The following code in musl seems to be miscompiled in the latest clang.

I think the author’s intention here is to zero initialize the entire union consisting of sockaddr_in and sockaddr_in6 when the union is initialized with “= {0}” However stepping through this code with GDB, it seems like only the sa.sin part of the union is zero-initialized, and the latter parts of sa.sin6 are garbage.

Later in this code (musl/src/network/res_msend.c at v1.2.4 · bminor/musl · GitHub), sa.sin.sin_family is set, then the union is passed into bind(). bind() then fails because the IPv6 address is garbage.

I’m not sure what the C standard says here. But I would expect specifying {0} as initializer should initialize the whole union, not just the first member? Also, rest of the code never refers to sa.sin6, only sa.sin.family is set. So one could make an argument due to strict aliasing rules, as soon sa.sin is not supposed to alias sa.sin6, so as soon as sa.sin is accessed, sa.sin6 is allowed to be garbage.

What’s the right explanation? Is there a compiler flag to restore the old behavior? I would think that these kinds of things are pretty common in C code bases.

Is this the same bug: c23 aggregate initialization · Issue #78034 · llvm/llvm-project · GitHub

Yeah. Sounds like the same bug.

This part of the C standard is extremely convoluted, but it is my understanding that = {0} only initializes the first union member. The whole union would be initialized by = {} in C23. Prior to that, the correct way to fully initialize a union is to memset it.

This is related to @tstellar’s bug in that clang currently incorrectly treats = {} the same as = {0} and only initializes the first union member in that case as well. But your code as written is (to the best of my knowledge) a bug in musl, not in clang.

1 Like

However, it’s worth noting that the previous behavior should be restored once [clang][CodeGen] Allow memcpy replace with trivial auto var init by antoniofrighetto · Pull Request #84230 · llvm/llvm-project · GitHub lands, in the sense that clang will still treat the union as partially uninitialized, but this doesn’t end up being exploited for optimization in practice. It’s not something you should rely on, as it may “break” again with future optimization changes.

@AaronBallman

I believe it should be fully initialized even with = {0}.

C23 6.7.11p20: The initialization shall occur in initializer list order, each initializer provided for a particular subobject overriding any previously listed initializer for the same subobject; all subobjects that are not initialized explicitly are subject to default initialization.

p22: If there are fewer initializers in a brace-enclosed list than there are elements or members of an aggregate, or fewer characters in a string literal used to initialize an array of known size than there are elements in the array, the remainder of the aggregate is subject to default initialization.

p11: If an object that has automatic storage duration is not initialized explicitly, its representation is
indeterminate. If an object that has static or thread storage duration is not initialized explicitly, or
any object is initialized with an empty initializer, then it is subject to default initialization, which
initializes an object as follows:
— if it has pointer type, it is initialized to a null pointer;
— if it has decimal floating type, it is initialized to positive zero, and the quantum exponent is implementation-defined;
— if it has arithmetic type, and it does not have decimal floating type, it is initialized to (positive or unsigned) zero;
— if it is an aggregate, every member is initialized (recursively) according to these rules, and any padding is initialized to zero bits;
— if it is a union, the first named member is initialized (recursively) according to these rules, and any padding is initialized to zero bits.

So the uninitialized parts are supposed to be initialized to zero in this case. The same was true in C17, just with different wording.

C17 6.7.9p19: The initialization shall occur in initializer list order, each initializer provided for a particular subobject overriding any previously listed initializer for the same subobject; all subobjects that are not
initialized explicitly shall be initialized implicitly the same as objects that have static storage duration.

p21: If there are fewer initializers in a brace-enclosed list than there are elements or members of an aggregate, or fewer characters in a string literal used to initialize an array of known size than there are elements in the array, the remainder of the aggregate shall be initialized implicitly the same as objects that have static storage duration.

p10: If an object that has automatic storage duration is not initialized explicitly, its value is indeterminate. If an object that has static or thread storage duration is not initialized explicitly, then:
— if it has pointer type, it is initialized to a null pointer;
— if it has arithmetic type, it is initialized to (positive or unsigned) zero;
— if it is an aggregate, every member is initialized (recursively) according to these rules, and any padding is initialized to zero bits;
— if it is a union, the first named member is initialized (recursively) according to these rules, and any padding is initialized to zero bits;

I think the recursive initialization rules ensure that the rest of the subject for the first member is initialized.

I’m not convinced this is correct. (References below are to C23, but I think the same conclusion holds for earlier standards too.)

First, unions are not aggregates (see definition in 6.2.5 p26), so 6.7.11 p22 does not apply.

The language in 6.7.11 p20 saying “all subobjects that are not initialized explicitly are subject to default initialization” might seem to apply, but I think it should be read in light of the language in the previous two paragraphs that defines which subobjects are relevant.

In particular, 6.7.11 p18 says “When no designations are present, subobjects of the current object are initialized in order according to the type of the current object: array elements in increasing subscript order, structure members in declaration order, and the first named member of a union.” So I believe only the first named member of the union is initialized (explicitly or otherwise) in this situation.

The alternative interpretation that all other members of a union are subject to default initialization also doesn’t make much sense, because in general it would conflict with the explicit initialization of the first named member (or other member specified by a designation, if one was present).

So I think = {0} on a union with automatic storage duration only needs to initialize the first named member.

Requiring it to zero out the rest of the union could be useful, so I could see an argument for doing that (either in a future standard, or as something guaranteed by Clang outside of the standards), but it could also require a lot of extra work if the union as a whole is much larger than the explicitly initialized member. Perhaps it would be better to just tell people to use = {} to get this behavior.

2 Likes

Wow, it never dawned on me to check that definition, but you’re absolutely correct (they’re aggregates in C++ but not in C).

So yes, I agree with you (and @nikic) that only the first member of the union is initialized (but it is fully initialized as an aggregate due to recursive init in p21 causing default initialization, that just doesn’t help the additional storage needed for the other member of the union).

1 Like

What about there bit that says “any padding is initialized to zero bits”? Is the additional storage that’s needed for other member(s) of the union not padding?

In the case in question, the union is explicitly initialized, so the whole paragraph describing “default initialization” doesn’t apply.

2 Likes

some-union = {0} does seem rather likely to want the whole thing zero. In particular, it suggests you want whichever field you’re going to read from first to be zero, and as the initialiser doesn’t say which field that is, setting the whole thing to zero seems sensible.

If we aren’t willing to zero all of it, we could usefully add a warning to sema for when this construct only zeros the first part of the union as the odds on that being user expected behaviour seem slim.