updated code size comparison

[cross-posting to the GCC and LLVM lists]

I've updated the code size results here:

   http://embed.cs.utah.edu/embarrassing/dec_09/

The changes for this run were:

- delete a number of testcases that contained use of uninitialized local
variables

- turn off frame pointer emission for all compilers

- ask all compilers to target x86 + SSE3

- ask all compilers to not emit stack protector code

- run unix2dos on the .c files so people on Windows don't see all the
lines running together
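
For anyone wanting to try the settings above on a single function, here is a minimal sketch. The GCC command line in the comment is an assumption (these are standard GCC options, but the exact flags and optimization level the scripts use are not stated here), and `sum_array` is a made-up function, not one of the harvested testcases.

```c
/*
 * Hypothetical GCC invocation approximating the settings listed above
 * (flag choice and -Os are assumptions, not taken from the scripts):
 *
 *   gcc -m32 -Os -fomit-frame-pointer -msse3 -fno-stack-protector -c sum.c
 *   size sum.o
 */

/* Illustrative function only; not one of the harvested testcases. */
int sum_array(const int *a, int n) {
    int total = 0;
    for (int i = 0; i < n; i++)
        total += a[i];
    return total;
}
```

The `text` column reported by `size` is the number to compare across compilers.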

Hopefully the results are more fair and useful now. Again, feedback is
appreciated.

Once people are happy with how these results are obtained, I'll plan on
just re-running the scripts every few months so we can see how the
compilers evolve. Also there are many possibilities for enhancement
including adding new architectures, harvesting more and larger
functions, and harvesting C++ code.

Thanks,

John Regehr

I would also avoid testcases using volatile. Smaller code on these testcases is often a sign of miscompilation rather than optimization. For example, http://embed.cs.utah.edu/embarrassing/src_harvested_dec_09/076389.c is miscompiled on GCC 3.4 and SunCC 5.10.
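
To illustrate the kind of problem meant here, a minimal sketch (a made-up function, not the contents of 076389.c): each access through a volatile-qualified lvalue has to appear in the generated code, so a compiler that folds the two loads into one emits less code but gets it wrong.

```c
/* Hypothetical example; not taken from the harvested sources. */
int read_twice(volatile int *p) {
    int first = *p;   /* first load: must be emitted */
    int second = *p;  /* second load: must also be emitted, not reused */
    return first + second;
}
```

On ordinary memory both versions return the same value, which is why this sort of bug is easy to miss unless you read the assembly.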

Sorry for not noticing yesterday.

Paolo

Hi Paolo,

Yeah, there are definitely several examples where small code is generated by miscompilation, especially of volatiles.

However I would prefer to leave these testcases in, unless there is a strong feeling that they are too distracting. They serve as poignant little reminders about how easy it is to get volatile wrong...

John

They skew the results in favor of the less careful compilers so they are more
than simply distracting, they are unfair.

Yes, that was my point. If you want to make a separate section for volatile,
that would indeed be helpful.

Paolo

Perhaps just leave out those volatile testcases that are miscompiled on other platforms, since not every volatile testcase fails for all compilers. :-)

-bw

I checked and there are about 37,000 harvested functions containing the volatile qualifier. Next time, there will be even more since we'll be harvesting code from the FreeBSD kernel in addition to Linux. It doesn't seem at all clear that it's productive to separate these out. If people are really hating volatile and think it leads to unfair results, I'll probably just #define away volatile next time.
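
A rough sketch of what "#define away volatile" could look like, assuming the macro is prepended to each harvested file. `poll_twice` is a made-up function. Strictly speaking, defining a keyword as a macro is undefined behavior once standard headers are included after it, so this only suits self-contained testcases.

```c
/* Hypothetical preamble prepended to every testcase. */
#define volatile /* expands to nothing */

/* After preprocessing the parameter is a plain `int *`, so the compiler
 * is free to merge the two loads into one. */
int poll_twice(volatile int *flag) {
    int first = *flag;
    int second = *flag;
    return first + second;
}
```

With the qualifier gone, every compiler is measured on the same non-volatile semantics.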

John

Would it be possible for the test cases to detect if they have been
compiled wrongly, for example, check that volatile is actually
working?
How about running the resulting compiled code in a small virtual
machine, that is configured to return a different value each time the
volatile variable is read. The test code could then read the volatile
variable twice and return a different result depending on whether the
two read values are different or not.
If the mis-compiled code only read the value once, having wrongly
optimized out the second read, the test would fail.
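
A minimal sketch of the test function this would need. The virtual machine itself is hypothetical and not shown, and `reads_differ` is a made-up name.

```c
/* Returns 1 if two consecutive reads of *dev produced different values.
 * Under the hypothetical VM, which changes the value on every read,
 * correctly compiled code returns 1. Code that wrongly reuses the first
 * read returns 0. */
int reads_differ(volatile int *dev) {
    int first = *dev;
    int second = *dev;
    return first != second;
}
```

On ordinary memory that nothing else modifies, the function returns 0 however it is compiled, so the check only works inside such a VM.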