improving test-suite's FP subtests to be able to compare both exact-match outputs and more-optimized builds that may have different outputs due to FP optimizations

Dear all,

I would like some help, please, with implementing Hal's excellent suggestion, which I have reworded as below. Hal has confirmed a previous version of my rewording as a correct interpretation. [I made minor changes since then, e.g. for grammar.]

[Abe wrote:]

I think you [Hal] are suggesting something like this:

   1) compile the program with FP fusion off,
      run the program, capture the output and save it,
      hash it and compare it against the reference hash.

   2) if comparison against the reference hash says "not equal",
      fail the test and stop [i.e. stop testing this particular subtest]

   3) compile the program with FP fusion on/"fast", capture the output,
      compare it using "fpcmp" and some positive tolerance against the
      output of the non-fusion build of the same source code;
      fail only if outside the tolerance limit[s]

Is that right?

[Hal wrote:]

Correct.
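
The three steps above can be sketched as a small decision function. This is only an illustration under stated assumptions, not test-suite code: the names sha256_of, within_tolerance and check_subtest are hypothetical, SHA-256 stands in for whatever hash the stored reference actually uses, and within_tolerance is a simplified token-wise stand-in for fpcmp rather than its real algorithm.

```python
import hashlib
import math


def sha256_of(text: str) -> str:
    """Hash captured program output (assumption: SHA-256 as the reference hash)."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def within_tolerance(reference: str, candidate: str, rel_tol: float) -> bool:
    """Compare whitespace-separated tokens; differing tokens must both be
    numbers that agree within rel_tol, otherwise the comparison fails."""
    ref_tokens = reference.split()
    cand_tokens = candidate.split()
    if len(ref_tokens) != len(cand_tokens):
        return False
    for ref, cand in zip(ref_tokens, cand_tokens):
        if ref == cand:
            continue
        try:
            ref_val, cand_val = float(ref), float(cand)
        except ValueError:
            return False  # differing non-numeric text is always a mismatch
        if not math.isclose(ref_val, cand_val, rel_tol=rel_tol):
            return False
    return True


def check_subtest(strict_output: str, fused_output: str,
                  reference_hash: str, rel_tol: float = 1e-6) -> str:
    # Steps 1 and 2: the non-fused build must match the reference hash exactly.
    if sha256_of(strict_output) != reference_hash:
        return "FAIL: non-fused build does not match the reference hash"
    # Step 3: the fused build is compared against the non-fused output,
    # and fails only if it falls outside the tolerance.
    if not within_tolerance(strict_output, fused_output, rel_tol):
        return "FAIL: fused build is outside the tolerance"
    return "PASS"
```

Here strict_output and fused_output are the captured outputs of the two builds; how they are produced and stored is left to the build system.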

As of now, I do not understand how to make a single directory and its contained source code compile to more than one program and trigger more than one test run. I think I understand Makefiles well enough to make that happen, but I'm pretty sure I _don't_ understand either { [1] CMake in general or [2] test-suite's use of it specifically } well enough to do this without any help. Maybe the "solution" is to have new directories with symlinks to the existing source code?

The help in question could be as minimal as "look here" with a Web link, if the link in question would guide me towards enlightenment. ;)

Regards,

Abe

Currently the test-suite works by first building the executables and then running them on a set of inputs. Having a multi-step test does not fit that.

Having said that, you could possibly just build two variants first and use different comparison steps for each. That at least fits the model, but it would still set a precedent and require some deeper test-suite build-system hacking.

- Matthias

From: "Matthias Braun via cfe-dev" <cfe-dev@lists.llvm.org>
To: "Abe Skolnik" <a.skolnik@samsung.com>
Cc: "llvm-dev" <llvm-dev@lists.llvm.org>, "cfe-dev" <cfe-dev@lists.llvm.org>
Sent: Thursday, September 29, 2016 6:20:09 PM
Subject: Re: [cfe-dev] [llvm-dev] improving test-suite's FP subtests to be able to compare both exact-match outputs and more-optimized builds that may have different outputs due to FP optimizations


I think that makes sense. We can have the fp-contract-off target and the fp-contract-default target, and use set_target_properties to add -ffp-contract=off to the compile_flags for the former. Then we just need to add the output from the fp-contract-off target as a generated file, and add a dependency on it to the fp-contract-default comparison step. Does that sound right?
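
As a rough model of that plan (in Python rather than CMake, and not test-suite code), the two targets can be described as variants of one source, where the comparison for the default build depends on output produced by the -ffp-contract=off build, as in step 3 of the procedure. The Variant class, the helper names, the source file name and the -O3 default are all hypothetical; only the -ffp-contract=off flag comes from the discussion.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple


@dataclass(frozen=True)
class Variant:
    name: str                              # hypothetical target name
    extra_flags: Tuple[str, ...] = ()      # flags added on top of the common ones
    reference_from: Optional[str] = None   # variant whose output must exist first


VARIANTS = [
    Variant("fp-contract-off", ("-ffp-contract=off",)),
    Variant("fp-contract-default", (), reference_from="fp-contract-off"),
]


def compile_command(variant: Variant, sources: Sequence[str],
                    compiler: str = "clang",
                    common_flags: Sequence[str] = ("-O3",)) -> List[str]:
    """Build the command line for one variant; each variant gets its own binary."""
    return [compiler, *common_flags, *variant.extra_flags, *sources,
            "-o", variant.name]


def run_order(variants: Sequence[Variant]) -> List[str]:
    """Order variants so each one runs after the variant it is compared against."""
    ordered: List[str] = []
    done = set()
    pending = list(variants)
    while pending:
        ready = [v for v in pending
                 if v.reference_from is None or v.reference_from in done]
        if not ready:
            raise ValueError("unsatisfiable reference dependency")
        for v in ready:
            ordered.append(v.name)
            done.add(v.name)
        pending = [v for v in pending if v.name not in done]
    return ordered
```

In a real CMake setup the ordering would be expressed as a target dependency rather than computed by hand; the function only makes the required order explicit.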

Are the buildbots still using the makefile-based system, or are they all on the cmake-based system now?

-Hal


In a CMake build that would be the right thing to do. The test-suite typically uses a bunch of wrappers on top of all that, so the CMake files look more like the old makefiles did... You can probably just duplicate everything in the current beam/CMakeLists.txt, set a different name for the "PROG" variable, and adjust the CFLAGS as wanted. However, that needs some careful testing to make sure those two targets are indeed completely independent and don't overwrite each other's files.
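
One way to check the independence concern is to list the files each target is expected to write and look for paths claimed by more than one target. The sketch below assumes such a list is available; find_collisions and the file names in the usage note are hypothetical and are not derived from the actual beam/CMakeLists.txt.

```python
import posixpath
from collections import defaultdict
from typing import Dict, List


def find_collisions(files_by_target: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Return every output path written by more than one target.

    Paths are normalized first so that "./Output/x" and "Output/x"
    are recognized as the same file.
    """
    owners = defaultdict(list)
    for target, paths in files_by_target.items():
        # Deduplicate per target so a path listed twice is not a false positive.
        for path in sorted({posixpath.normpath(p) for p in paths}):
            owners[path].append(target)
    return {path: targets for path, targets in owners.items() if len(targets) > 1}
```

For example, if two hypothetical targets "beam-off" and "beam-default" both wrote "Output/beam.out", the function would report that path with both target names; giving each target its own output name makes the result empty.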

> Are the buildbots still using the makefile-based system, or are they all on the cmake-based system now?

The scripts in the zorg repository appear to be using "lnt runtest nt" (= makefile system) rather than "lnt runtest test-suite" (= cmake/lit test-suite). So the majority of bots are still on the makefiles, I assume :(

- Matthias

From: "Matthias Braun" <mbraun@apple.com>
To: "Hal Finkel" <hfinkel@anl.gov>
Cc: "llvm-dev" <llvm-dev@lists.llvm.org>, "cfe-dev"
<cfe-dev@lists.llvm.org>, "Abe Skolnik" <a.skolnik@samsung.com>
Sent: Thursday, September 29, 2016 6:42:55 PM
Subject: Re: [cfe-dev] [llvm-dev] improving test-suite's FP subtests to be able to compare both exact-match outputs and more-optimized builds that may have different outputs due to FP optimizations


Is there any reason we should not start transitioning the bots to the cmake-based system now?

-Hal

My current reason to keep the LNT bots in "nt" mode instead of
"test-suite" mode is that I don't have a parser for the new log format
on Zorg.

The "nt" test.log file can tell me what kind of errors happened
without having to see the *whole* output, which is gigantic and takes
ages to load from the build master pages.

I had a task to implement that, but it got put on the back burner and
has been completely ignored for a number of months.

I'm not sure when I'll be able to resume it, so if anyone is feeling
charitable... :)

cheers,
--renato