Language lawyer question

Looking through the gcc testsuite turned up an interesting edge case. Let’s assume our target leaves a hole for alignment in struct x, as do x86 and powerpc. Do you think the following code can validly abort?

struct x { char c; short s; };
int i; char *p;

memset(&X, 0, sizeof(struct x));
memset(&Y, 22, sizeof(struct x));
X = Y;
for (i=0, p=(char *)&X; i<sizeof(struct x); i++, p++)
if (*p != 22)
abort();

The memset’s and char-by-char comparison are clearly valid references; the questionable bit is the struct copy, which llvm-gcc currently does field-by-field, skipping the hole. C99 says

In simple assignment (=), the value of the right operand is converted to the type of the
assignment expression and replaces the value stored in the object designated by the left
operand.

I would take “replaces the value” to mean “replaces the entire value”, but it could be read otherwise, I suppose.

The current code seems to me to assume holes in structs can’t ever be validly accessed, which isn’t right, as we see here. They are often nondeterministic (this is explicit for initialization in C99) but not always.

I thought pointer referencing like this was only valid for arrays. I could be wrong, but it might be that looping over the struct like that
is invalid, making it undefined behavior (and then the hole doesn’t matter because there is no valid way to access it). That said, I’ve definitely
seen a lot of code that uses pointers to reference struct contents.

Does the test case indicate the why it was added?

More of an implementation observation than a language interpretation, but if you add a “long l[6];” field, llvm-gcc continues to do field-by-field copies, but at l[7] it turns into machine-word copies, then at some point it turns into a rep/movsl (on Intel), and then at another threshold it turns into a memcpy(3) callout.

What part of LLVM’s codegen for copying “struct x { char c; short s; long l[6] };” considers a movb + movw + 6 movl’s to efficient in either time or space (I was using -Os)? What changes when the overall structure gets to 64 bytes such that it decides its more efficient to copy a word at a time?

I think the test case is bogus in terms of language correctness, but it might be indicative of a missed optimization for doing structure copies. Is that what GCC’s test case is actually trying to validate? If so, it probably falls under a “gcc test case” and not a “C test case”, if one can differentiate them.

Maybe it would be reasonable for llvm-gcc to NOT copy the padding at -O0 and do explicit field copies, but to copy the padding as a side effect of an inlined memcpy() implementation for copying sizeof(struct x) when optimization is used. Copying using the largest appropriate registers/instructions given the structure size and alignment seems like it would always be faster than field copies, even for small structures.

Shantonu

I thought pointer referencing like this was only valid for arrays. I could be wrong, but it might be that looping over the struct like that
is invalid, making it undefined behavior (and then the hole doesn’t matter because there is no valid way to access it). That said, I’ve definitely
seen a lot of code that uses pointers to reference struct contents.

Anything can be addressed as characters. C99 6.5, see last line:

An object shall have its stored value accessed only by an lvalue expression that has one of
the following types:73)
—atype compatible with the effective type of the object,
—aqualified version of a type compatible with the effective type of the object,
—atype that is the signed or unsigned type corresponding to the effective type of the
object,
—atype that is the signed or unsigned type corresponding to a qualified version of the
effective type of the object,
—anaggregate or union type that includes one of the aforementioned types among its
members (including, recursively,amember of a subaggregate or contained union), or
—a character type.

C89 and C++ have similar language.

Does the test case indicate the why it was added?

Actually it’s testing something else entirely and tripped over this before it got to what it’s supposed to be testing:(
Just assumed it would work, as it does with gcc’s codegen, of course.

More of an implementation observation than a language interpretation, but if you add a “long l[6];” field, llvm-gcc continues to do field-by-field copies, but at l[7] it turns into machine-word copies, then at some point it turns into a rep/movsl (on Intel), and then at another threshold it turns into a memcpy(3) callout.

What part of LLVM’s codegen for copying “struct x { char c; short s; long l[6] };” considers a movb + movw + 6 movl’s to efficient in either time or space (I was using -Os)? What changes when the overall structure gets to 64 bytes such that it decides its more efficient to copy a word at a time?

Yeah, it’s not efficient either. I didn’t want to get into that since fixing the correctness issue, if there is one, will automatically fix this too.
I think the justification is that breaking the struct into fields early makes it easier to do other optimizations.

I think the test case is bogus in terms of language correctness,

Why?

but it might be indicative of a missed optimization for doing structure copies. Is that what GCC’s test case is actually trying to validate? If so, it probably falls under a “gcc test case” and not a “C test case”, if one can differentiate them.

There certainly are “gcc test cases”, but so far I don’t think this is one.

I think the test case is bogus in terms of language correctness,

Why?

My gut. I listen to my gut. More seriously, C99 section 6.2.6.1 paragraph 6 has:

When a value is stored in an object of structure or union type, including in a member
object, the bytes of the object representation that correspond to anypadding bytes take
unspecified values.42)

and the footnote says:

  1. Thus, for example, structure assignment need not copyany padding bits.

I think that covers this case for at least C99. The test case should not assume the padding bytes are copied.

If you assume “need not copy” semantics, the test case doesn’t have much value for correctness. If you want LLVM to assume an implementation-specific “must copy” or “must not copy” behavior, you could tweak the test case as needed, I suppose. But under aggressive optimization, you rapidly approach “must copy” semantics, and then the question is why you don’t do that for even the degenerate cases. So again, it turns into an optimization test case.

Shantonu

Hi,

Looking through the gcc testsuite turned up an interesting edge case.
Let's assume our target leaves a hole for alignment in struct x, as do
x86 and powerpc. Do you think the following code can validly abort?

I think you should ask this on the gcc mailing list. For what it's
worth I'm sure that Ada does not require the values in padding to
be preserved.

The current code seems to me to assume holes in structs can't ever be
validly accessed, which isn't right, as we see here.

Of course they can be accessed, but that doesn't in itself mean that
what they contain needs to be preserved by structure copies.

Ciao,

Duncan.

I think the test case is bogus in terms of language correctness,

Why?

My gut. I listen to my gut. More seriously, C99 section 6.2.6.1 paragraph 6 has:

When a value is stored in an object of structure or union type, including in a member
object, the bytes of the object representation that correspond to anypadding bytes take
unspecified values.42)

and the footnote says:

  1. Thus, for example, structure assignment need not copyany padding bits.

Ah, thanks. I thought that was probably the intent but couldn’t find it. That is definitive.

Padding is allowed to be skipped in structures in this case.
See 6.2.6.1.
Even further, all padding is allowed to take any value no matter how
you try to set it (IE it always allowed to have an undefined value,
even if you memset it).

We happen to allow memset to clear padding bits, but we don't have to, AFAIK.

--Dan

Yes, 6.2.6.1 is clear in C99, thanks. (C90 does not seem to have a corresponding section, though.)

This is derived from gcc.dg/vmx/varargs-4.c from the gcc testsuite, if anybody feels like fixing it.

No. The value of the object referred to be the left hand side must be replaced by the object on the right. The objects are defined to be a sequence of sizeof(x) bytes (I'm assuming the testcase does struct x X, Y;). If a byte isn't so updated, trivially, the object has not been replaced.

Doh, you know, I was trying to find the quoted wording... What the big print giveth, the small print takes away. Ignore me... I was wrong.

Except that the standard specifically says the value of the padding
bytes are undefined, and that you can do structure copies by member.

Sadly, a lot of these would be easier to read if they were written
with exceptions first, then "except as above, blah blah blah"

You're assuming the structure you're copying both from and to exist in memory. What happens if the compiler decided to place one of them in registers, such that a structure like:

struct example {
   short x;
   double y;
}

is stored in two registers, with no padding?

Brian