local test-suite failures on Linux

Hi,

I get the following failures when I run the test-suite on Linux (Ubuntu 12.04) using LNT (lnt runtest nt ...):

(all are execution failures)
MultiSource/Applications/Burg
MultiSource/Applications/ClamAV
MultiSource/Applications/lemon
MultiSource/Applications/obsequi
MultiSource/Benchmarks/MiBench/automotive-bitcount
MultiSource/Benchmarks/MiBench/telecomm-FFT
MultiSource/Benchmarks/Olden/voronoi
MultiSource/Benchmarks/Ptrdist/anagram
SingleSource/Benchmarks/BenchmarkGame

Everything is built off trunk.

Has anyone else seen these failures and found a fix? Perhaps I'm missing a dependency? There doesn't appear to be a Linux machine on llvm.org/perf to compare against, either.

paul

Can someone confirm whether the test-suite passes 100% on Linux?

What is involved in adding a new perf machine?

paul

Hi,

I figured out how to resolve the failures. I noticed that Mountain Lion
includes Bison 2.3 while Ubuntu 12.04 includes Bison 2.5. I installed
Bison 2.3 from source on Ubuntu and the failures went away.

I'm a little concerned that the Bison version fixed all the failures I was
seeing. To my knowledge, the only failing test that depended on Bison was
Burg. It almost looks like one failure can cause unrelated tests to fail.

Any ideas?

paul

There is almost certainly a bug in LNT or the makefiles.

I changed the body of Burg's main to the following:

+ printf("Hello World\n");
+ return 0;

I re-ran the test-suite and got the following errors:

--- Tested: 986 tests --
FAIL: MultiSource/Applications/Burg/burg.execution_time (494 of 986)
FAIL: MultiSource/Applications/ClamAV/clamscan.execution_time (495 of 986)
FAIL: MultiSource/Applications/lemon/lemon.execution_time (496 of 986)
FAIL: MultiSource/Benchmarks/MiBench/automotive-bitcount/automotive-bitcount.execution_time (497 of 986)
FAIL: MultiSource/Benchmarks/MiBench/telecomm-FFT/telecomm-fft.execution_time (498 of 986)
FAIL: MultiSource/Benchmarks/Olden/voronoi/voronoi.execution_time (499 of 986)
FAIL: MultiSource/Benchmarks/Ptrdist/anagram/anagram.execution_time (500 of 986)
FAIL: SingleSource/Benchmarks/BenchmarkGame/puzzle.execution_time (501 of 986)

Notice how the test numbers are consecutive (494-501). They all pass when
Burg passes.

paul

> Notice how the test numbers are consecutive (494-501). They all pass when
> Burg passes.

Yeah, that's disturbing. Bumping this thread to keep some visibility -
I haven't looked into this yet but someone (maybe me, eventually)
should figure this out.

Hi David,

>> Notice how the test numbers are consecutive (494-501). They all pass when
>> Burg passes.
>
> Yeah, that's disturbing. Bumping this thread to keep some visibility -
> I haven't looked into this yet but someone (maybe me, eventually)
> should figure this out.

I think the Bison version was misleading. I was attempting to fix the errors on our internal CI setup, but changing Bison didn't seem to help. I'm still quite perplexed as to what is going on and why the failures originally disappeared.

I am unable to reproduce these results and still have 9 failures on trunk (Ubuntu 12.04).

Interestingly, I just ran the test-suite without LNT, and here I get an entirely different set of failures:

predmond@predmond-desktop:~/src/miro/build/release.configure/projects/test-suite$ make TEST=simple report | grep \*
MultiSource/Benchmarks/TSVC/ControlFlow-dbl/ControlFlow-dbl | pass 1.5081 1.5539 * 5.7964 5.8202
MultiSource/Benchmarks/TSVC/ControlFlow-flt/ControlFlow-flt | pass 1.3201 1.3698 * 5.1963 5.2269
SingleSource/Benchmarks/Linpack/linpack-pc | pass 0.3160 0.3312 * 85.6734 85.9865
SingleSource/UnitTests/Vector/SSE/sse.expandfft | pass 0.0800 0.1044 * 0.2120 0.2158
SingleSource/UnitTests/Vector/SSE/sse.stepfft | pass 0.1000 0.1304 * 0.4760 0.4823
SingleSource/UnitTests/ms_struct_pack_layout | pass 0.0200 0.0306 * 0.0000 0.0011

paul