Benchmarks for LLVM-generated Binaries

Hi,

I've lately been wondering where benchmarks for LLVM-generated binaries are hosted, and whether they're tracked over time. I'm asking because I'm thinking of where to put some benchmarks I've written using the open source Google benchmarking library [0] to test certain costs of XRay-instrumented binaries, the XRay runtime, and other related measurements (effect of patching/unpatching of various-sized functions, etc.)
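For a sense of what these look like, here's a minimal sketch of one such microbenchmark (purely illustrative, not one of the actual XRay benchmarks; the function and benchmark names are made up, and the file would be built with clang's -fxray-instrument so the attribute takes effect):

    #include <benchmark/benchmark.h>

    // Illustrative only: measure the call overhead of a function that
    // XRay instruments (sleds patched in/out by the XRay runtime).
    [[clang::xray_always_instrument, gnu::noinline]]
    int InstrumentedFn(int x) { return x + 1; }

    static void BM_InstrumentedCall(benchmark::State& state) {
      int x = 0;
      while (state.KeepRunning())
        benchmark::DoNotOptimize(x = InstrumentedFn(x));
    }
    BENCHMARK(BM_InstrumentedCall);

    BENCHMARK_MAIN();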

While I can certainly publish the numbers I get from the benchmarks, that's not as good as having the benchmarks available somewhere others can run and verify them for themselves (and scrutinise to improve accuracy).

I asked on IRC (#llvm) and Chandler suggested that I ask on the list too.

Questions:

- Is the test-suite repository the right place to put these generated-code benchmarks?
- Are there any objections to using a later version of the Google benchmarking library [0] in the test-suite?
- Are the docs in the Testing Infrastructure Guide [1] still relevant and up-to-date, and is that a good starting point for exploration here?

Cheers

[0] https://github.com/google/benchmark
[1] http://llvm.org/docs/TestingGuide.html

-- Dean

I've lately been wondering where benchmarks for LLVM-generated binaries are hosted, and whether they're tracked over time.

Hi Dean,

Do you mean Perf?

http://llvm.org/perf/

Example, ARM and AArch64 tracking performance at:

http://llvm.org/perf/db_default/v4/nts/machine/41
http://llvm.org/perf/db_default/v4/nts/machine/46

- Is the test-suite repository the right place to put these generated-code benchmarks?

I believe that would be the best place, yes.

- Are there any objections to using a later version of the Google benchmarking library [0] in the test-suite?

While this looks like a very nice tool set, I wonder how we're going
to integrate it.

Checking it out in the test-suite wouldn't be the best option (version
rot), but neither would be requiring people to install it before
running the test-suite, especially if the installation process isn't
as easy as "apt-get install", like all the other dependencies.

- Are the docs on the Testing Infrastructure Guide still relevant and up-to-date, and is that a good starting point for exploration here?

Unfortunately, that's mostly for the "make check" tests, not for the
test-suite. The test-suite execution is covered by LNT's doc
(http://llvm.org/docs/lnt/), but it's mostly about LNT internals and
not the test-suite itself.

However, it's not that hard to understand the test-suite structure. To
add new tests, you just need to find a suitable place
({SingleSource,MultiSource}/Benchmarks/YourBench), copy the
CMakeLists.txt, Makefile and lit.local.cfg from an existing benchmark,
adapt them to your needs, and it should be done.
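
For reference, the CMakeLists.txt in such a directory is usually tiny,
roughly like the sketch below (the exact macros come from the
test-suite's cmake modules, so copy from a neighbouring benchmark
rather than from this sketch):

    # MultiSource/Benchmarks/YourBench/CMakeLists.txt (sketch)
    set(PROG yourbench)          # name of the benchmark executable
    list(APPEND LDFLAGS -lm)     # extra link flags, if any
    llvm_multisource()           # macro provided by the test-suite cmake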

cheers,
--renato

I've lately been wondering where benchmarks for LLVM-generated binaries are hosted, and whether they're tracked over time.

Hi Dean,

Do you mean Perf?

http://llvm.org/perf/

Example, ARM and AArch64 tracking performance at:

http://llvm.org/perf/db_default/v4/nts/machine/41
http://llvm.org/perf/db_default/v4/nts/machine/46

Awesome stuff, thanks Renato!

- Is the test-suite repository the right place to put these generated-code benchmarks?

I believe that would be the best place, yes.

- Are there any objections to using a later version of the Google benchmarking library [0] in the test-suite?

While this looks like a very nice tool set, I wonder how we're going
to integrate it.

Checking it out in the test-suite wouldn't be the best option (version
rot), but neither would be requiring people to install it before
running the test-suite, especially if the installation process isn't
as easy as "apt-get install", like all the other dependencies.

I think it should be possible to have a snapshot of it included. I don't know what the licensing implications are (I'm not a lawyer, but I know someone who is -- paging Danny Berlin).

I'm not as concerned about falling behind on versions there, though, mostly because it should be trivial to update it if we need to. Though like you, I agree this isn't the best way of doing it. :)

- Are the docs on the Testing Infrastructure Guide still relevant and up-to-date, and is that a good starting point for exploration here?

Unfortunately, that's mostly for the "make check" tests, not for the
test-suite. The test-suite execution is covered by LNT's doc
(http://llvm.org/docs/lnt/), but it's mostly about LNT internals and
not the test-suite itself.

However, it's not that hard to understand the test-suite structure. To
add new tests, you just need to find a suitable place
({SingleSource,MultiSource}/Benchmarks/YourBench), copy the
CMakeLists.txt, Makefile and lit.local.cfg from an existing benchmark,
adapt them to your needs, and it should be done.

Thanks -- this doesn't tell me how to run the test though... I could certainly do it by hand (i.e. build the executables and run it) and I suspect I'm not alone in wanting to be able to do this easily through the CMake+Ninja (or other generator) workflow.

Do you know if someone is working on that aspect?

Cheers

-- Dean

I think it should be possible to have a snapshot of it included. I don't know what the licensing implications are (I'm not a lawyer, but I know someone who is -- paging Danny Berlin).

The test-suite has a very large number of licenses (compared to LLVM),
so licensing should be less of a problem there. Though Dan can help
more than I can. :)

I'm not as concerned about falling behind on versions there, though, mostly because it should be trivial to update it if we need to. Though like you, I agree this isn't the best way of doing it. :)

If we start using it more (maybe we should, at least for the
benchmarks, I've been long wanting to do something decent there), then
we'd need to add a proper update procedure.

I'm fine with some checkout if it's a stable release, not trunk, as it
would make things a lot easier to update later (patch releases, new
releases, etc).

Thanks -- this doesn't tell me how to run the test though... I could certainly do it by hand (i.e. build the executables and run it) and I suspect I'm not alone in wanting to be able to do this easily through the CMake+Ninja (or other generator) workflow.

Ah, no, that was about adding your test. :)

Do you know if someone is working on that aspect?

http://llvm.org/docs/lnt/quickstart.html

This is *exactly* what Perf (the monitoring website) does, so you're
sure to get the same result on both sides if you run it locally like
that. I do.

You can choose to run down to a specific test/benchmark, so it's quick
and easy to use while developing, too.
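
Something along these lines (paths are placeholders):

    lnt runtest nt \
        --sandbox /tmp/sandbox \
        --cc /path/to/clang \
        --test-suite /path/to/test-suite \
        --only-test MultiSource/Benchmarks/YourBench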

cheers,
--renato

I think it should be possible to have a snapshot of it included. I don't know what the licensing implications are (I'm not a lawyer, but I know someone who is -- paging Danny Berlin).

The test-suite has a very large number of licenses (compared to LLVM),
so licensing should be less of a problem there. Though Dan can help
more than I can. :)

Cool, let's wait and see what Danny thinks of the patch I'll be preparing. :)

I'm not as concerned about falling behind on versions there, though, mostly because it should be trivial to update it if we need to. Though like you, I agree this isn't the best way of doing it. :)

If we start using it more (maybe we should, at least for the
benchmarks, I've been long wanting to do something decent there), then
we'd need to add a proper update procedure.

I'm fine with some checkout if it's a stable release, not trunk, as it
would make things a lot easier to update later (patch releases, new
releases, etc).

SGTM.

Thanks -- this doesn't tell me how to run the test though... I could certainly do it by hand (i.e. build the executables and run it) and I suspect I'm not alone in wanting to be able to do this easily through the CMake+Ninja (or other generator) workflow.

Ah, no, that was about adding your test. :)

Do you know if someone is working on that aspect?

http://llvm.org/docs/lnt/quickstart.html

This is *exactly* what Perf (the monitoring website) does, so you're
sure to get the same result on both sides if you run it locally like
that. I do.

Ah, cool. That works for me. :)

You can choose to run down to a specific test/benchmark, so it's quick
and easy to use while developing, too.

Awesome stuff, thanks Renato!

-- Dean

I've lately been wondering where benchmarks for LLVM-generated binaries are hosted, and whether they're tracked over time.

Hi Dean,

Do you mean Perf?

http://llvm.org/perf/

Example, ARM and AArch64 tracking performance at:

http://llvm.org/perf/db_default/v4/nts/machine/41
http://llvm.org/perf/db_default/v4/nts/machine/46

- Is the test-suite repository the right place to put these generated-code benchmarks?

I believe that would be the best place, yes.

- Are there any objections to using a later version of the Google benchmarking library [0] in the test-suite?

While this looks like a very nice tool set, I wonder how we're going
to integrate it.

Checking it out in the test-suite wouldn't be the best option (version
rot), but neither would be requiring people to install it before
running the test-suite, especially if the installation process isn't
as easy as "apt-get install", like all the other dependencies.

- Are the docs on the Testing Infrastructure Guide still relevant and up-to-date, and is that a good starting point for exploration here?

Unfortunately, that's mostly for the "make check" tests, not for the
test-suite. The test-suite execution is covered by LNT's doc
(http://llvm.org/docs/lnt/), but it's mostly about LNT internals and
not the test-suite itself.

However, it's not that hard to understand the test-suite structure. To
add new tests, you just need to find a suitable place
({SingleSource,MultiSource}/Benchmarks/YourBench), copy the
CMakeLists.txt, Makefile and lit.local.cfg from an existing benchmark,
adapt them to your needs, and it should be done.

Hi Renato and others,

Is it possible, and how hard would it be, to make the test-suite extendable? That is, could I copy in a folder with some tests that comply with some rules (e.g. have a CMakeLists.txt, lit.local.cfg, etc.) and run them via the standard infrastructure without changing anything in the test-suite files?

The motivation for this question is the following: we (and probably many other companies too) have internal tests that we can't share, but still want to track. Currently, the process of adding them to the existing test-suite is not clear to me (or at least not very well documented), and while I can figure it out, it would be cool if we could streamline this process.

Ideally, I'd like the process to look like this:
1) Add the following files to your benchmark suite:
  1.a) CMakeLists.txt having this and that target doing this and that.
  1.b) lit.local.cfg script having this and that.
  …
2) Make sure the tests report results in the following format, or provide a wrapper script to convert results to the specified form. /* TODO: Results format is specified here */
3) Run your tests using the standard LNT command, like "lnt runtest … --only-test=External/MyTestSuite/TestA"

If that’s already implemented, then I’ll be glad to help with documentation, and if not, I can try implementing it. What do you think?

Thanks,
Michael

Hi Michael,

You'll want to look into the Externals part of the test-suite. :) That's how things like SPEC etc. are run.

-eric

Hi Eric,

Yeah, I know about Externals and SPEC specifically. But as far as I understand, you have to have some kind of description of the tests in the test-suite even if you don't provide the source code - that's what I would like to avoid. I.e. you have to have CMakeLists.txt and other files in place all the time, open to everyone.

Now, imagine I have a small test-suite which is probably not very interesting to anyone else, but extremely interesting to me. AFAIU, to make it a part of LNT now, I have to modify the 'test-suite' repo so that it's aware of this new test-suite. I will probably not be able to upstream these changes (as they are only interesting to me), and that's the source of some inconvenience. I'd be happy to be mistaken here, but that's how I understand the current infrastructure, and that's where my question came from.

Thanks,
Michael

That's correct. It seems like a fairly well-contained local patch for your local test-suite; otherwise, you could probably change the test-suite to try to understand arbitrary directories.

I added support for extensions to the test-suite cmake/lit a while ago: in a default cmake run it will look for */CMakeLists.txt and compile/run the benchmarks found there. Usually that will be External/SingleSource/MultiSource, but if you drop additional directories into your test-suite checkout it should pick those up. You can also manually specify the TEST_SUITE_SUBDIRS variable instead if you do not want to run everything, or want it to pick up directories outside the llvm test-suite.
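
That is, something like this (paths are illustrative; TEST_SUITE_SUBDIRS takes a cmake list, so check the test-suite cmake for the exact semantics):

    cmake -G Ninja \
          -DCMAKE_C_COMPILER=/path/to/clang \
          -DTEST_SUITE_SUBDIRS="MultiSource/Benchmarks;/path/to/private-benchmarks" \
          /path/to/test-suite
    ninja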

That should be enough to move specialized tests into independent repositories while re-using the running/cmake infrastructure. It would probably be a good idea to do this with the External and Bitcode directories today; we just have to follow through and figure out which buildbots need to run what and which additional repositories they need to check out.

- Matthias

I'm working on this now, and I had a few more questions below for Renato and the list in general. Please see inline below.

I think it should be possible to have a snapshot of it included. I don't know what the licensing implications are (I'm not a lawyer, but I know someone who is -- paging Danny Berlin).

The test-suite has a very large number of licenses (compared to LLVM),
so licensing should be less of a problem there. Though Dan can help
more than I can. :)

Cool, let's wait and see what Danny thinks of the patch I'll be preparing. :)

I'm not as concerned about falling behind on versions there, though, mostly because it should be trivial to update it if we need to. Though like you, I agree this isn't the best way of doing it. :)

If we start using it more (maybe we should, at least for the
benchmarks, I've been long wanting to do something decent there), then
we'd need to add a proper update procedure.

I'm fine with some checkout if it's a stable release, not trunk, as it
would make things a lot easier to update later (patch releases, new
releases, etc).

SGTM.

Is there a preference on where to place the library? I had a look at {SingleSource/MultiSource}/Benchmarks/ and I didn't find a common location for libraries used. I'm tempted to create a top-level "libs" directory that will host common libraries but I'm also fine with just having the benchmark library living alongside the XRay benchmarks.

So two options here:

1) libs/googlebenchmark/
2) MultiSource/Benchmarks/XRay/googlebench/

Thoughts?

Cheers

-- Dean

Since this is something that may be (or is intended to be) used by others in the future, the first option makes that easier (or more encouraging, at least).

I agree.

cheers,
--renato

+1 to this.

Looks like there is reasonably active development going on right now (primarily by EricWF, who is also contributing to llvm), so we'll probably want to coordinate how and how often we sync with top of tree. (Probably more often than google unittests :)

-eric

Have you seen the prototype for googlebenchmark integration I did in the past:

https://reviews.llvm.org/D18428 (though probably out of date for today's test-suite)

+1 for copying googlebenchmark into the test-suite.

However, I do not think this should simply go into MultiSource: we currently have a number of additional plugins in the lit test runner, such as measuring the runtime of the benchmark executable and determining code size; we still plan to add a mode to run benchmarks multiple times; and we run the benchmark under perf (or iOS-specific tools) to collect performance counters… Many of those are questionable measurements for a googlebenchmark executable, which has varying runtime because it runs the test more or fewer times.
We really should introduce a new benchmarking mode for this.

- Matthias

Thanks everyone, I'll go with "libs/" as a top-level directory in the test-suite.

Have you seen the prototype for googlebenchmark integration I did in the past:

https://reviews.llvm.org/D18428 (though probably out of date for today's test-suite)

Not yet, but thanks for the pointer Matthias!

+1 for copying googlebenchmark into the test-suite.

However, I do not think this should simply go into MultiSource: we currently have a number of additional plugins in the lit test runner, such as measuring the runtime of the benchmark executable and determining code size; we still plan to add a mode to run benchmarks multiple times; and we run the benchmark under perf (or iOS-specific tools) to collect performance counters… Many of those are questionable measurements for a googlebenchmark executable, which has varying runtime because it runs the test more or fewer times.
We really should introduce a new benchmarking mode for this.

Sounds good to me, but probably something for later down the road.

- Matthias

I'm working on this now, and I had a few more questions below for Renato and the list in general. Please see inline below.

I think it should be possible to have a snapshot of it included. I don't know what the licensing implications are (I'm not a lawyer, but I know someone who is -- paging Danny Berlin).

The test-suite has a very large number of licenses (compared to LLVM),
so licensing should be less of a problem there. Though Dan can help
more than I can. :)

Cool, let's wait and see what Danny thinks of the patch I'll be preparing. :)

I'm not as concerned about falling behind on versions there, though, mostly because it should be trivial to update it if we need to. Though like you, I agree this isn't the best way of doing it. :)

If we start using it more (maybe we should, at least for the
benchmarks, I've been long wanting to do something decent there), then
we'd need to add a proper update procedure.

I'm fine with some checkout if it's a stable release, not trunk, as it
would make things a lot easier to update later (patch releases, new
releases, etc).

SGTM.

Is there a preference on where to place the library? I had a look at {SingleSource/MultiSource}/Benchmarks/ and I didn't find a common location for libraries used. I'm tempted to create a top-level "libs" directory that will host common libraries but I'm also fine with just having the benchmark library living alongside the XRay benchmarks.

So two options here:

1) libs/googlebenchmark/
2) MultiSource/Benchmarks/XRay/googlebench/

Since this is something that may be (or is intended to be) used by others in the future, the first option makes that easier (or more encouraging, at least).

+1 to this.

Looks like there is reasonably active development going on right now (primarily by EricWF, who is also contributing to llvm), so we'll probably want to coordinate how and how often we sync with top of tree. (Probably more often than google unittests :)

This sounds good to me too. Happy to get involved in ongoing efforts there.

Cheers

-- Dean

Thanks everyone, I'll go with "libs/" as a top-level directory in the test-suite.

Have you seen the prototype for googlebenchmark integration I did in the past:

https://reviews.llvm.org/D18428 (though probably out of date for today's test-suite)

Not yet, but thanks for the pointer Matthias!

+1 for copying googlebenchmark into the test-suite.

However, I do not think this should simply go into MultiSource: we currently have a number of additional plugins in the lit test runner, such as measuring the runtime of the benchmark executable and determining code size; we still plan to add a mode to run benchmarks multiple times; and we run the benchmark under perf (or iOS-specific tools) to collect performance counters… Many of those are questionable measurements for a googlebenchmark executable, which has varying runtime because it runs the test more or fewer times.
We really should introduce a new benchmarking mode for this.

Sounds good to me, but probably something for later down the road.

Well, if you just put googlebenchmark executables into the MultiSource directory, then the lit runner will just measure the runtime of the executable, which is worse than "for (int i = 0; i < LARGE_NUMBER; ++i) myfunc();" because googlebenchmark will use a varying number of runs depending on noise levels/confidence.
When running googlebenchmarks we should disable the external time measurements and have a lit plugin in place which parses the googlebenchmark output (that old patch has that). I believe this can only really work when we create a new top-level directory to which we apply different benchmarking rules.
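
(For the record, the benchmark binaries can already emit machine-readable results themselves, e.g. something like

    ./your-xray-benchmark --benchmark_format=json > results.json

where the binary name is a placeholder and the flag comes from the google/benchmark library. A lit plugin could then report the per-benchmark entries from that JSON (name, iterations, real_time, cpu_time) as the metrics, with the external timing disabled; that is the kind of plumbing the old patch sketched.)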

- Matthias

I think these benchmarks should not run by default as long as there is no proper integration to report the “correct” timing in lit. Otherwise it’ll pollute the reports / database.

Some of the metrics reported by "-mllvm -stats" may be good indicators
of runtime performance.

Aha! Right, I see -- this makes a lot of sense.

I'll experiment a little bit more and get inspiration from what you've already done before.

Thanks very much for the insight and the pointer!

Cheers

-- Dean