Benchmark LNT weird thread behaviour

Hi James/Chris,

You guys have done this before, so I'm guessing you can help me
understand what's going on.

If my buildbot config is:

jobs=2,
nt_flags=['--cflag', '-mcpu=cortex-a15', '--use-perf', '--threads=1',
'--build-threads=4']

It uses -j4 for build, -j2 for running the tests:

http://buildmaster.tcwglab.linaro.org/builders/clang-native-arm-lnt-perf/builds/35/steps/test-suite/logs/stdio

If my buildbot config is:

jobs=4,
nt_flags=['--cflag', '-mcpu=cortex-a15', '--use-perf', '--threads=1',
'--build-threads=4']

It uses -j4 for build, -j4 for running the tests:

http://buildmaster.tcwglab.linaro.org/builders/clang-native-arm-lnt-perf/builds/33/steps/test-suite/logs/stdio

If my buildbot config is:

jobs=2,
nt_flags=['--cflag', '-mcpu=cortex-a15', '--use-perf', '--threads=1',
'--build-threads=2']

It uses -j2 for build, -j2 for running the tests:

http://buildmaster.tcwglab.linaro.org/builders/clang-native-arm-lnt-perf/builds/34/steps/test-suite/logs/stdio

But, on the production buildbot, my config is:

jobs=2,
nt_flags=['--cflag', '-mcpu=cortex-a15', '--use-perf', '--threads=1',
'--build-threads=2']

and it uses -j2 to build and -j1 to run the tests:

http://lab.llvm.org:8014/builders/clang-native-arm-lnt-perf/builds/98/steps/test-suite/logs/stdio

My master has up-to-date Zorg as of yesterday. The only change I made
was to comment out the unused slaves.

What am I doing wrong?

cheers,
--renato

There seem to be three interesting flags in your LNT invocations:

 -j2 --threads=1 --build-threads=4 

Since -j and --threads are the same flag, I think that is the problem. The first -j flag is at the start of the LNT command; perhaps it is coming from a different part of zorg?

-j2 is from "jobs=2" in the builder's argument list, to compile Clang.
This also gets passed to LNT, since nt_flags is not required, and you
do want to use all cores if nothing is specified.

Of course, when --threads is specified, you have the conflict, but
since this argument is not required, I wouldn't want to omit the -j2
from LNT at all.

I could "fix" this in our builder to not pass -jN into LNT and always
use nt_flags, but I don't think that is a good fix, since other users
will have odd behaviour on their side. Unless that's the kind of
behaviour we *want* to encode, of course.

What do you think?

cheers,
--renato

I think every job should define those or use the LNT default of 1,1. The validity of compile time and exec time metrics is in question if the job is loaded incorrectly, so it makes sense to me to not allow that -j to get passed through.

The job's default *is* -j1. But we pass -jN to the other steps
(compiling Clang, for instance).

We also pass -jN to LNT, because that means both build and execute in one go.

I'll change the buildbots to pass both explicitly in nt_flags, and
will also change the builder to not pass -j in any case.

But if users should not be passing -jN, but instead --threads and
--build-threads directly, then I think we should make it into an error
in LNT, no?

cheers,
--renato

-j and --threads are the same flag. The problem is passing it twice with two different values. I am sort of surprised OptionParser let that pass.
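
For reference, here is a minimal sketch of why OptionParser accepts this silently. It assumes -j and --threads are registered as two spellings of a single option, as described above. The flag names come from this thread; the parser setup and the parse() helper are hypothetical and are not LNT's actual code. With the default "store" action, optparse simply overwrites the destination each time the option appears, so no error is raised and the last occurrence on the command line wins:

```python
from optparse import OptionParser


def build_parser():
    parser = OptionParser()
    # Assumption: one option with two spellings sharing a single destination.
    parser.add_option("-j", "--threads", dest="threads", type="int", default=1)
    parser.add_option("--build-threads", dest="build_threads", type="int",
                      default=0)
    return parser


def parse(argv):
    """Return (threads, build_threads) for a hypothetical command line."""
    opts, _ = build_parser().parse_args(argv)
    return opts.threads, opts.build_threads


if __name__ == "__main__":
    # Same flags, different order: the later spelling silently wins.
    print(parse(["-j2", "--threads=1", "--build-threads=4"]))
    print(parse(["--threads=1", "-j2", "--build-threads=4"]))
```

In this sketch the result depends only on the order of the arguments, so whichever part of zorg appends its flag last decides the value.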

Right, so the bots now pass threads directly, so the worst part is over.

Now, since they are the same thing, I wonder if we should just deprecate -j and emit an error, emit a warning saying which one got chosen, or silently prefer --threads over -j if both are passed. I don't think the current behaviour of -j silently overriding --threads is a good model, though.

Cheers,
Renato

I think it best to just drop -j totally. I don’t know if we can easily convince OptionParser to make passing both an error condition.
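
If -j has to stay around for a while, one possible way to get OptionParser to reject the conflict is a callback action that remembers which spelling it has already seen. This is only a sketch under assumptions: store_once, parse_threads and the _threads_spelling attribute are hypothetical names, and LNT's real option setup may look different.

```python
from optparse import OptionParser, OptionValueError


def store_once(option, opt_str, value, parser):
    """Store the value, but refuse a second -j/--threads on the same line."""
    earlier = getattr(parser.values, "_threads_spelling", None)
    if earlier is not None:
        raise OptionValueError(
            "%s given after %s; pass the thread count only once"
            % (opt_str, earlier))
    # Remember which spelling was used so the error message can name it.
    parser.values._threads_spelling = opt_str
    setattr(parser.values, option.dest, value)


def parse_threads(argv):
    parser = OptionParser()
    parser.add_option("-j", "--threads", dest="threads", type="int",
                      default=1, action="callback", callback=store_once)
    opts, _ = parser.parse_args(argv)
    return opts.threads


if __name__ == "__main__":
    print(parse_threads(["--threads=1"]))
    # optparse reports the OptionValueError as a usage error and exits.
    parse_threads(["-j2", "--threads=1"])
```

Note that this also rejects the same spelling given twice (for example -j2 -j2), which may or may not be what we want. Dropping -j entirely is still the simpler option.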