I thought pointer referencing like this was only valid for arrays. I could be wrong, but it might be that looping over the struct like that
is invalid, making it undefined behavior (and then the hole doesn’t matter because there is no valid way to access it). That said, I’ve definitely
seen a lot of code that uses pointers to reference struct contents.
Anything can be addressed as characters. C99 6.5 (paragraph 7), see the last line:
An object shall have its stored value accessed only by an lvalue expression that has one of the following types:
— a type compatible with the effective type of the object,
— a qualified version of a type compatible with the effective type of the object,
— a type that is the signed or unsigned type corresponding to the effective type of the object,
— a type that is the signed or unsigned type corresponding to a qualified version of the effective type of the object,
— an aggregate or union type that includes one of the aforementioned types among its members (including, recursively, a member of a subaggregate or contained union), or
— a character type.
C89 and C++ have similar language.
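To make the character-type rule concrete, here is a minimal sketch of walking a struct's bytes through an unsigned char pointer. The struct mirrors the one quoted later in the thread; first_diff and demo are hypothetical helper names of mine, not anything from the test case under discussion.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumed layout for illustration, modelled on the struct quoted in the thread. */
struct x { char c; short s; long l[6]; };

/* Compare two objects byte by byte through unsigned char pointers.
 * Returns the index of the first differing byte, or n if all n bytes match.
 * Reading any object as an array of character type is permitted by 6.5. */
size_t first_diff(const void *a, const void *b, size_t n)
{
    const unsigned char *pa = a;
    const unsigned char *pb = b;
    for (size_t i = 0; i < n; i++)
        if (pa[i] != pb[i])
            return i;
    return n;
}

/* Fill one struct with a known byte pattern, copy every byte, and check
 * that the two representations match, padding included. */
void demo(void)
{
    struct x a, b;
    memset(&a, 0x5A, sizeof a);
    memcpy(&b, &a, sizeof b);
    assert(first_diff(&a, &b, sizeof a) == sizeof a);
}
```

Whether the bytes in the hole hold a meaningful value after a member store is a separate question; this only shows that reading them as characters is allowed.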
Does the test case indicate why it was added?
Actually it’s testing something else entirely and tripped over this before it got to what it’s supposed to be testing :(
Just assumed it would work, as it does with gcc’s codegen, of course.
More of an implementation observation than a language interpretation, but if you add a “long l[6];” field, llvm-gcc continues to do field-by-field copies; once the array grows to “long l[7];” it turns into machine-word copies, then at some point it turns into a rep/movsl (on Intel), and then at another threshold it turns into a call to memcpy(3).
What part of LLVM’s codegen for copying “struct x { char c; short s; long l[6] };” considers a movb + movw + 6 movl’s to be efficient in either time or space (I was using -Os)? What changes when the overall structure gets to 64 bytes such that it decides it’s more efficient to copy a word at a time?
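For reference, the two copy strategies being compared look roughly like this at the source level. This is only a sketch: copy_fieldwise, copy_bytes and hole_after_c are hypothetical names, and the size of the hole depends on the target ABI (1 byte is typical, but that is an assumption).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct x { char c; short s; long l[6]; };

/* Member-by-member copy, roughly the shape of the movb + movw + 6 movl
 * sequence. Padding bytes in *dst are not required to match *src afterwards. */
void copy_fieldwise(struct x *dst, const struct x *src)
{
    dst->c = src->c;
    dst->s = src->s;
    for (size_t i = 0; i < 6; i++)
        dst->l[i] = src->l[i];
}

/* Whole-object copy: every byte of the representation, padding included. */
void copy_bytes(struct x *dst, const struct x *src)
{
    memcpy(dst, src, sizeof *dst);
}

/* Number of padding bytes between c and s on this implementation. */
size_t hole_after_c(void)
{
    /* Sanity check: s cannot start inside c. */
    assert(offsetof(struct x, s) >= sizeof(char));
    return offsetof(struct x, s) - (offsetof(struct x, c) + sizeof(char));
}
```

After copy_fieldwise the members compare equal, but a memcmp of the two objects may still differ in the hole; after copy_bytes it cannot.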
Yeah, it’s not efficient either. I didn’t want to get into that since fixing the correctness issue, if there is one, will automatically fix this too.
I think the justification is that breaking the struct into fields early makes it easier to do other optimizations.
I think the test case is bogus in terms of language correctness,
Why?
but it might be indicative of a missed optimization for doing structure copies. Is that what GCC’s test case is actually trying to validate? If so, it probably falls under a “gcc test case” and not a “C test case”, if one can differentiate them.
There certainly are “gcc test cases”, but so far I don’t think this is one.