[PATCH] make dosep.ty test driver multi-threadable

Hi all,

This patch provides support for running the dosep.ty test driver with multiple threads. It speeds up running the full test suite on my HP z620 Ubuntu machine with 32 hyperthreaded CPUs from 11 minutes to about 1m13s (about 9x).

The default behavior is to run single-threaded as before. If the environment variable LLDB_TEST_THREADS is set, a Python work queue is set up with that many worker threads.

To avoid collisions within a test directory where multiple tests make use of the same prebuilt executable, the unit of work for the worker threads is a single directory (that is, all tests within a directory are processed in the normal serial way by a single thread).

I’ve run this way a number of times; the only issue I found was that the TestProcessAttach.py test failed once, when attempting to attach to the process “a.out” by name. I assume this is because some other thread was running an executable of that name at the same time, and we were attempting to attach to the wrong one, so I changed that test to use a different executable name (that change is also included in the attached patch).

There is actually an opportunity to speed the overall test time further (maybe as much as another 1.5x or so) by dividing up a few long-running tests into multiple directories, but I’m not planning to do that now.

This doesn’t yet work with ninja makes, but tfiala or I will look at that as well.

  • Steve

patch-multithreadedTest.txt (5.73 KB)

Steve’s change went in earlier today.

This doesn’t yet work with ninja makes, but tfiala or I will look at that as well.

I did look into this and it actually works fine without modification if you do the ‘ninja check-lldb’ with the environment variable set.

This was the run I just did using 32 procs:

Running multithreaded with 32 threads.

Ran 278 tests.

real 1m22.855s

user 5m38.520s

sys 0m51.520s

That used to take 10+ minutes.

For reference, these are the test times I see on FreeBSD (after
including a PoC patch for llvm.org/pr18894)

1 thread: 443.49 real 214.02 user 63.72 sys
4 threads: 121.17 real 234.17 user 66.27 sys
8 threads: 88.70 real 305.37 user 92.82 sys

I see occasional test failures with higher thread settings (I believe
this was mentioned on the list before). These are the ones that seem
susceptible:

FAIL: LLDB (suite) :: TestCallThatRestarts.py
FAIL: LLDB (suite) :: TestWatchpointConditionCmd.py
FAIL: LLDB (suite) :: TestWatchpointConditionAPI.py

FAIL: LLDB (suite) :: TestCallThatRestarts.py

Glad you’re getting a speedup, Ed. Thanks for posting the times. The call I listed above is the one that I see fail occasionally on my side when running with 32 threads. It would be good if we can figure out what might be exposing that - it might represent a real issue for us if it shows up in higher utilization scenarios (some kind of lock pressure, timeouts, etc.).

Maybe we can mark certain tests with a decorator so they get serially run after all other parallel tests have completed?

Maybe "@serializeTest"?

That’s worth a try.

I don’t yet have any insight into exactly what’s wrong with the one I’m seeing (or the additional failures that Ed is seeing on FreeBSD), so I’m not yet sure if this will address it. But it does seem worth a try.

In the event that your idea works out to eliminate those failures, there is another element we can consider. We can have Python give us the # cores available (if there’s something portable to do that), and when not otherwise specified, pick a reasonable default # threads to get decent test run performance without needing the environment variable specified. This would give anybody running the tests a decent test run speedup without having to read docs on configuring the environment variable. (I think this was Steve Pucci’s idea but definitely something we discussed.)

Yes, we can either do that via python directly, or through SBHostOS, as internally we have:

    static uint32_t Host::GetNumberCPUS ();

Maybe "@serializeTest"?

That's worth a try.

I have a slight worry that this approach could mask real concurrency
issues in LLDB itself; if we know that the issues are only in the test
code it seems more reasonable. Even in that case though tests could
run concurrently on the same machine, e.g. a buildbot host that builds
and tests multiple configurations.

In the event that your idea works out to eliminate those failures, there is another element we can consider. We can have Python give us the # cores available (if there's something portable to do that), and when not otherwise specified, pick a reasonable default # threads to get decent test run performance without needing the environment variable specified. This would give anybody running the tests a decent test run speedup without having to read docs on configuring the environment variable. (I think this was Steve Pucci's idea but definitely something we discussed.)

I was going to make that suggestion too. Python does have a portable
way to get the core count:

import multiprocessing
multiprocessing.cpu_count()

8

Anyhow, explicitly serializing the tests that intermittently fail
would be no worse than today, and worth it to improve the cycle time.

Nice.

The patch below adds a new "--threads" option whose default value is "multiprocessing.cpu_count()". And only if "multiprocessing.cpu_count()" returns 0 will it fall back to using the LLDB_TEST_THREADS environment variable. I am not sure why the environment variable was ever used instead of using an option.

If everyone can try out this patch and make sure it does things as expected:

dosep.patch (1.5 KB)

It gets the correct number of threads by default for me. Looks like
you'll want to clean up the messages about the env variable though:

default:

Running tests with 8 threads
Running multithreaded with 8 threads (from LLDB_TEST_THREADS)

or w/ --threads 1:

Running tests with 1 threads
NOT running multithreaded. Consider setting LLDB_TEST_THREADS
environment variable.

I’ll have a run on my end in a bit here.

I am not sure why the environment variable was ever used instead of using an option.

I think the reason we went this way was that these are the common ways we run tests:

(cmake/ninja):
ninja check-lldb

(configure/make):
make -C tools/lldb/test

I don’t know that it’s entirely clear what the best way is to pass thread arguments to dosep.ty via either of those. Do we want to add --threads to the default instantiations of both of those so it picks up a good default?

I don't know that it's entirely clear what the best way is to pass thread arguments to dosep.ty via either of those. Do we want to add --threads to the default instantiations of both of those so it picks up a good default?

Greg's change defaults to # of CPUs if not explicitly set, so no need
to pass a new argument.

ah ok.

This patch worked for me.

(and ran the right number of threads, which for me is 32).

I’d add this diff to it to clean up output that is now sometimes incorrect:

@@ -56,7 +57,7 @@ def walk_and_invoke(test_root, dotest_options, num_threads):
failed = []
passed = []
if (num_threads > 1):

  • print "Running multithreaded with " + str(num_threads) + " threads (from " + multithreaded_envar + “)”
  • print “Running multithreaded with " + str(num_threads) + " threads.”
    global in_q
    global out_q
    in_q = Queue.Queue()