Making test-suite regression detection easier

Most of the time I’m working on a non-public LLVM target, but sometimes I want to contribute fixes or enhancements back to the core of LLVM or a public target. Each time this happens, I have to run the test-suite with a vanilla copy of LLVM from SVN to discover which tests are already failing, so that I can confirm my changes haven’t caused those failures.

Would it be useful for the community to have a website where one can submit the results of a nightly-test-style run of the test-suite and have it tell you which tests regressed relative to a recent known “good” run for your target? This would cut the number of test-suite runs I have to do in these circumstances in half.

Does anyone else have a simple, clever workaround for this issue?

Hi Christopher,

> Most of the time I'm working on a non-public LLVM target, but
> sometimes I want to contribute fixes or enhancements back to the core
> of LLVM or a public target. Each time this happens, I have to run the
> test-suite with a vanilla copy of LLVM from SVN to discover which
> tests are already failing, so that I can confirm my changes haven't
> caused those failures.

> Would it be useful for the community to have a website where one can
> submit the results of a nightly-test-style run of the test-suite and
> have it tell you which tests regressed relative to a recent known
> "good" run for your target?

The results across machines are rarely comparable. There are just too
many variables (machine, operating system, compiler version, etc.). The
best thing to do is to run a nightly test on your own machine and then
compare that with the test results you get on your "non-public" version.
Doing so will give you the exact delta.

So far we've been doing this manually, but I think you're right that
it's time to automate it. To support this better, what we need is:

1. A NightlyTest.pl that can use an already checked-out tree.
2. A way to automate computing the results delta.

Reid.

> This would cut the number of test-suite runs I have to do in these
> circumstances in half.

Actually, I don't see how it could. You still have to run the "llvm
trunk" baseline to get the current status for your machine, and then
run your "non-public" version to get the delta. As noted above,
comparing against other people's machines just won't be profitable.

> Does anyone else have a simple, clever workaround for this issue?

Nope. Usually when I do it, I'm focusing on particular tests. About
the only thing I do en masse is to grep the nightly test output file
for "TEST-FAIL" and pipe it through sort. I do the same for the code
I'm testing, and if there's a difference between those two files, I
know where my code is failing and can focus on those particular tests.
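
Scripted, that boils down to something like the following sketch. (The
log file names are just placeholders for wherever each run's output
ends up.)

    # Collect the sorted failure list from each run's output.
    grep 'TEST-FAIL' trunk-run.log | sort > trunk-failures.txt
    grep 'TEST-FAIL' private-run.log | sort > private-failures.txt

    # Any difference points at tests that behave differently
    # with the local changes.
    diff trunk-failures.txt private-failures.txt

Using comm -13 on the two sorted lists instead of diff would print only
the tests that fail in the private run but not in the trunk baseline,
which is exactly the regression list you're after.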

Reid.

>> Would it be useful for the community to have a website where one can
>> submit the results of a nightly-test-style run of the test-suite and
>> have it tell you which tests regressed relative to a recent known
>> "good" run for your target?

> The results across machines are rarely comparable. There are just too
> many variables (machine, operating system, compiler version, etc.). The
> best thing to do is to run a nightly test on your own machine and then
> compare that with the test results you get on your "non-public" version.
> Doing so will give you the exact delta.

> So far we've been doing this manually, but I think you're right that
> it's time to automate it. To support this better, what we need is:
>
> 1. A NightlyTest.pl that can use an already checked-out tree.
> 2. A way to automate computing the results delta.

I believe the NightlyTester can already work on a checked-out tree,
but you may have to update it manually. That could easily be added.

On a side note, I've been working on a new implementation of our
nightly tester results web page. I've designed a new database and I'm
in the process of revamping the PHP scripts used to view the results.
Once I'm done, it should be pretty easy to see regressions between
runs. I'll also be adding some supporting pieces so that you can
easily create your own database and keep everything internal. The
nightly tester supports redirecting its output to another server, but
we've never published the database layout.

-Tanya