updated code size comparison

Hi folks,

I've posted an updated code size comparison between LLVM, GCC, and others here:

   http://embed.cs.utah.edu/embarrassing/

New in this version:

- much larger collection of harvested functions: more than 360,000

- bug fixes and UI improvements

- added the x86 Open64 compiler

John

I started looking through the llvm-gcc vs. clang comparisons, and
noticed that in
http://embed.cs.utah.edu/embarrassing/jan_10/harvest/source/A9/A9AB5AE7.c
, size_t is declared incorrectly. Any idea how that might have
happened?

-Eli

Hi,

Could you also add a main() for each of these files, and do
a very simple test that the optimized functions actually work?
At least for functions that take only integers and return integers this
could be automated
if you compare -O0 output with the optimized outputs.
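The comparison suggested above could be sketched roughly as follows. This is only an illustration under stated assumptions: the names count_mismatches, sample_ref and sample_bad are hypothetical, and in a real harness the two function pointers would refer to the same harvested function compiled once at -O0 and once with optimization, rather than to hand-written stand-ins.

```c
#include <stddef.h>

/* Shape of the functions this sketch handles: two ints in, one int out. */
typedef int (*int_fn2)(int, int);

/* Calls both builds on the same inputs and counts the disagreements.
   The caller must pick inputs for which the behavior is defined. */
size_t count_mismatches(int_fn2 reference, int_fn2 optimized,
                        const int (*inputs)[2], size_t n)
{
    size_t mismatches = 0;
    for (size_t i = 0; i < n; i++) {
        int expected = reference(inputs[i][0], inputs[i][1]);
        int actual = optimized(inputs[i][0], inputs[i][1]);
        if (expected != actual)
            mismatches++;
    }
    return mismatches;
}

/* Hypothetical stand-in for the -O0 build of a harvested function. */
int sample_ref(int x, int y) { return x - y; }

/* Hypothetical stand-in for a build that was folded to "return 0". */
int sample_bad(int x, int y) { (void)x; (void)y; return 0; }
```

A nonzero count only flags a candidate for manual inspection; it does not prove a miscompilation, since the inputs may still trigger undefined behavior.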

The neon_helper.c testcase is clearly misoptimized by gcc-head here:
http://embed.cs.utah.edu/embarrassing/jan_10/harvest/compare_clang-head_gcc-head/compare_23BD1620_disasm.shtml

Try calling it like this:
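#include <stdio.h>

/* helper_neon_rshl_s8() is defined in the harvested neon_helper.c test case. */
int main(void)
{
    printf("%d\n", helper_neon_rshl_s8(0x12345, 15));
    return 0;
}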

Prints 74496 here, and not 0 (gcc-head optimized it to a function
returning 0).

Best regards,
--Edwin

> I started looking through the llvm-gcc vs. clang comparisons, and
> noticed that in
> http://embed.cs.utah.edu/embarrassing/jan_10/harvest/source/A9/A9AB5AE7.c
> , size_t is declared incorrectly. Any idea how that might have
> happened?

Hi Eli,

Thanks for pointing this out, I'll look into this tonight.

However I can give you the quick generic answer right now (of course you already know it) which is that real C code does just about anything that can be parsed :).

If LLVM warns about this incorrect definition I can eliminate this kind of test case, I'll look into this as well.

John

Hi Torok-

> Could you also add a main() for each of these files, and do
> a very simple test that the optimized functions actually work?

Unfortunately, testing isolated C functions is much harder than just passing them random data!

Consider this function:

   int foo (int x, int y) { return x+y; }

The behavior of foo() is undefined when x+y overflows. Of course it is trivial to come up with similar examples based on shifts, multiplies and divides, etc.
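To make the overflow condition concrete, here is a minimal sketch of a guard that a random-testing harness could apply before calling foo(). The helper name add_is_defined is hypothetical and not part of any tool discussed in this thread.

```c
#include <limits.h>
#include <stdbool.h>

/* The function from the message above. */
int foo(int x, int y) { return x + y; }

/* True when x + y fits in an int, so that foo(x, y) has defined behavior.
   The comparison is arranged so that the check itself never overflows. */
bool add_is_defined(int x, int y)
{
    if (y > 0)
        return x <= INT_MAX - y;
    return x >= INT_MIN - y;
}
```

A harness would discard any random (x, y) pair for which add_is_defined() returns false. Each kind of operation (shift, multiply, divide) needs its own guard, which is part of why testing isolated functions is hard.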

A potential solution is "under-constrained execution":

   http://www.stanford.edu/~engler/issta07v-engler.pdf

I will bug Dawson and Daniel and see if I can get ahold of some code for this.

John

> Hi Torok-
>
>> Could you also add a main() for each of these files, and do
>> a very simple test that the optimized functions actually work?
>
> Unfortunately, testing isolated C functions is much harder than just
> passing them random data!
>
> Consider this function:
>
>   int foo (int x, int y) { return x+y; }
>
> The behavior of foo() is undefined when x+y overflows. Of course it
> is trivial to come up with similar examples based on shifts,
> multiplies and divides, etc.

Indeed, but can't an analysis find at least one value for each variable
where the behavior is not undefined?
Such a value must exist, or the entire function is useless if it always
has undefined behavior.

Sure, testing on one such value (or a random value) won't prove that the
result is correct, but it may help find trivial
miscompilations like the neon_helper case.

Alternatively a testcase could be manually constructed for the top 10
functions in the size comparison charts,
and see whether they are miscompiled. Repeat until top 10 has no
miscompilations.

> A potential solution is "under-constrained execution":
>
>   http://www.stanford.edu/~engler/issta07v-engler.pdf
>
> I will bug Dawson and Daniel and see if I can get ahold of some code
> for this.

Although EXE isn't, KLEE is publicly available.

Best regards,
--Edwin

> Indeed, but can't an analysis find at least one value for each variable
> where the behavior is not undefined?
> Such a value must exist, or the entire function is useless if it always
> has undefined behavior.

Good point :).

> Sure, testing on one such value (or a random value) won't prove that the
> result is correct, but it may help find trivial
> miscompilations like the neon_helper case.

Are you absolutely sure it's a miscompilation? I have already shot myself in the foot a couple times on the GCC mailing list or bugzilla by pointing out a bug that turned out to be code with subtle undefined behavior...

> Alternatively a testcase could be manually constructed for the top 10
> functions in the size comparison charts,
> and see whether they are miscompiled. Repeat until top 10 has no
> miscompilations.

Tell you what: if I get enough test cases like this, I'll write the test harness supporting it. I don't have time to do this kind of code inspection myself.

There has been talk (I don't remember where) about a Clang option for detecting undefined behavior. Is there any progress on this? This could be used to enable automated random testing.

John

-fcatch-undefined-behavior:
http://clang.llvm.org/docs/UsersManual.html#codegen

Right now it only catches out of range shifts and simple array out of bound issues, not all undefined behavior.
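As an illustration of the shift case, the sketch below spells out when a left shift of a signed int is defined under C99 rules. It is not a description of how -fcatch-undefined-behavior is implemented; the helper name shl_is_defined is hypothetical, and the width computation assumes int has no padding bits.

```c
#include <limits.h>
#include <stdbool.h>

/* True when (value << amount) has defined behavior for a signed int. */
bool shl_is_defined(int value, int amount)
{
    /* Assumption: int has no padding bits, so this is its width in bits. */
    const int width = (int)(sizeof(int) * CHAR_BIT);

    /* Out-of-range shift count: negative, or at least the width. */
    if (amount < 0 || amount >= width)
        return false;

    /* Left-shifting a negative value is undefined in C99. */
    if (value < 0)
        return false;

    /* The shifted result must still be representable in int. */
    return value <= (INT_MAX >> amount);
}
```

Only the first check corresponds to the out-of-range shifts mentioned above; the other two cover the remaining ways a signed left shift can be undefined.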

-Chris

>> Indeed, but can't an analysis find at least one value for each variable
>> where the behavior is not undefined?
>> Such a value must exist, or the entire function is useless if it always
>> has undefined behavior.
>
> Good point :).
>
>> Sure, testing on one such value (or a random value) won't prove that the
>> result is correct, but it may help find trivial
>> miscompilations like the neon_helper case.
>
> Are you absolutely sure it's a miscompilation? I have already shot
> myself in the foot a couple times on the GCC mailing list or bugzilla
> by pointing out a bug that turned out to be code with subtle undefined
> behavior...

Well, if it is not, then it is a qemu bug, so it is a bug in either case;
you just have to report it to another bugzilla ;)
The code does conversions by assigning to one union member and reading
from another.
AFAIK that was a GCC language extension; maybe they don't support it in
the latest release, or accidentally broke it. I don't know.
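For readers unfamiliar with the idiom, here is a minimal sketch of conversion through a union. The union layout and the function names are hypothetical, modeled on the description above rather than copied from qemu. As far as I know, the GCC manual documents reading a member other than the one last written as supported when the access goes through the union type, even with -fstrict-aliasing.

```c
#include <stdint.h>

/* Hypothetical layout: one 32-bit word viewed as four signed 8-bit lanes. */
union lanes {
    uint32_t word;
    int8_t bytes[4];
};

/* Writes the 32-bit member, then reads one lane through the other member.
   Which lane holds which bits depends on the host byte order. */
int8_t lane_of(uint32_t word, unsigned index)
{
    union lanes u;
    u.word = word;
    return u.bytes[index & 3u];
}

/* The reverse direction: stores four lanes and reads them back as a word. */
uint32_t word_of_lanes(int8_t b0, int8_t b1, int8_t b2, int8_t b3)
{
    union lanes u;
    u.bytes[0] = b0;
    u.bytes[1] = b1;
    u.bytes[2] = b2;
    u.bytes[3] = b3;
    return u.word;
}
```

Copying the bytes with memcpy() is the usual alternative when code must not depend on this union behavior.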

Someone should reduce a testcase for gcc-head to see exactly what it is
about. My gcc (4.4) doesn't miscompile it.

Either way I'd rather see a warning from gcc when it decides to optimize
the entire function away.

>> Alternatively a testcase could be manually constructed for the top 10
>> functions in the size comparison charts,
>> and see whether they are miscompiled. Repeat until top 10 has no
>> miscompilations.
>
> Tell you what: if I get enough test cases like this, I'll write the
> test harness supporting it. I don't have time to do this kind of code
> inspection myself.

Makes sense.

> There has been talk (I don't remember where) about a Clang option for
> detecting undefined behavior. Is there any progress on this? This
> could be used to enable automated random testing.

Yes, -fcatch-undefined-behavior.
http://clang.llvm.org/docs/UsersManual.html#codegen

> Yes, -fcatch-undefined-behavior.
> http://clang.llvm.org/docs/UsersManual.html#codegen

Thanks guys.

My understanding of the situation is that for meaningful automated testing, the protection from undefined behavior has to cover all problems that actually occur in the code under test. But I'll keep checking on this...

John

>> I started looking through the llvm-gcc vs. clang comparisons, and
>> noticed that in
>> http://embed.cs.utah.edu/embarrassing/jan_10/harvest/source/A9/A9AB5AE7.c
>> , size_t is declared incorrectly. Any idea how that might have
>> happened?
>
> Hi Eli,
>
> Thanks for pointing this out, I'll look into this tonight.
>
> However I can give you the quick generic answer right now (of course you
> already know it) which is that real C code does just about anything that can
> be parsed :).

Of course, but this looks like the declaration of memset came from a
system header.

> If LLVM warns about this incorrect definition I can eliminate this kind of
> test case, I'll look into this as well.

clang warns and doesn't treat the usual declaration of memset as the C
library memset if size_t is wrong; gcc apparently doesn't care.

-Eli

> Of course, but this looks like the declaration of memset came from a
> system header.

Argh, my fault-- I let some files preprocessed on a 64-bit host sneak into the harvesting run. I'll get rid of them for the next run.

John

> clang warns and doesn't treat the usual declaration of memset as the C
> library memset if size_t is wrong; gcc apparently doesn't care.

Eli-- I looked at this code a bit more closely and it seems to me that (in this particular case, by luck) the gcc strategy of ignoring the problem is OK. Clang wants size_t to be an unsigned int, whereas in these files, size_t is an unsigned long. I can't think of any observable difference between these two types on x86-clang.

Anyway this doesn't form an argument that clang should relax its rules, but it does indicate that gcc is probably not doing anything too silly.

John

> Right now it only catches out of range shifts and simple array out of
> bound issues, not all undefined behavior.

Besides the obvious memory safety stuff, my list of top undefined behaviors to catch would be:

- multiple updates to objects between sequence points

- integer overflows

- use-after-death of stack variables

- use of uninitialized stack variables

- const/volatile violations

Some of these will be no fun to implement. But the resulting tool would be enormously valuable.

John

> , size_t is declared incorrectly. Any idea how that might have
> happened?

This is fixed now-- all tests where clang complains about an incompatible redeclaration of a library function have been thrown out.

John