Test status (Linux) and run expectation question

Two questions:

  • What is the expectation as to which tests will be run before submission of patches?
  • What is the current status for the Linux tests? Is anyone already looking at bringing the test coverage to parity with Darwin?

I’m seeing a lot of “UNSUPPORTED” tests when I run on Linux and that makes me wonder how much test coverage I’m actually getting if I run those tests before submitting patches (though of the 616 UNSUPPORTED tests on Linux, all but 186 seem to involve dsym, which is probably ok to be unsupported on Linux). So I’m wondering about what the expectations are for test runs before patch submission, and how easily I can meet those expectations if my primary work environment is Linux.

The context is that I’m trying to learn enough about lldb to make real contributions, and I’m doing that by first trying to use it and make it work well as my primary debugger in my usual work environment (Linux targeting chromium).

Thanks in advance for any enlightenment …

– Randy Smith

Hey Randy,

Great questions. See a few threads we had on this a while back when I started looking into this:

http://lists.cs.uiuc.edu/pipermail/lldb-dev/2014-March/003470.html

http://lists.cs.uiuc.edu/pipermail/lldb-dev/2014-March/003474.html

I’m seeing a lot of “UNSUPPORTED” tests when I run on Linux and that makes me wonder how much test coverage I’m actually getting if I run those tests before submitting patches

Roughly half the tests won’t run under Linux in the local test suite. MacOSX has two different mechanisms for packaging debug info, so the suite tests a large swath of lldb two ways, one of which isn’t appropriate for Linux. Those will all show as skipped.
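As a rough illustration of what those categories look like in aggregate, here is a small sketch that tallies result lines from a test run. The log format below is hypothetical and simplified (the real runner’s output differs); only the idea of separating dsym-variant UNSUPPORTED results from the rest is the point:

```python
from collections import Counter

# Hypothetical result lines in the general style of a test runner's
# summary output; your local run's exact format will differ.
sample_output = """\
UNSUPPORTED: LLDB :: TestFoo.py (dsym variant)
PASS: LLDB :: TestBar.py (dwarf variant)
UNSUPPORTED: LLDB :: TestBaz.py (dsym variant)
XFAIL: LLDB :: TestQux.py (dwarf variant)
"""

def summarize(output):
    """Tally result categories, and separately count how many
    UNSUPPORTED results come from the dsym variant, which is the
    part that is expected to be inapplicable on Linux."""
    counts = Counter()
    dsym_unsupported = 0
    for line in output.splitlines():
        category, _, rest = line.partition(": ")
        counts[category] += 1
        if category == "UNSUPPORTED" and "dsym" in rest:
            dsym_unsupported += 1
    return counts, dsym_unsupported

counts, dsym_unsupported = summarize(sample_output)
print(counts["UNSUPPORTED"], dsym_unsupported)
```

Subtracting the dsym-only UNSUPPORTED results from the total gives a better sense of how much coverage is genuinely missing on Linux.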

Back in March I did a few things on Linux:

  • Went through all the tests that were entirely disabled. I turned them on and, for those that made sense to run on Linux at all, checked whether they consistently passed or consistently failed. Those that consistently passed I enabled in the normal state (where they are expected to pass). Those that always failed I marked as XFAIL (expected failure). Those that passed only intermittently for me I marked as skipped to avoid noise, and ensured there was at least one bug tracking each of them in bugzilla.
  • Enabled code coverage for the local Linux test run. As expected, much of the remote functionality wasn’t covered, but a non-trivial amount of the code was hit by the tests running on Linux. When I eventually get a build bot up, I’m hoping to include a code coverage run for local tests, so we can get an idea of which code is exercised by the local test suite. (Soon I will have lldb-platform running for x86-based Linux, at which point we can run the remote suite as well.)
    My goal at the time was to find out just how bad the state of the Linux code was relative to the MacOSX side. It ended up being not quite as bad as I expected, per some of the info I published in those threads.
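lldb’s test suite has its own decorators for this triage (and the exact names have varied), but the mechanism can be sketched with the stdlib `unittest` decorators; the bug id in the skip reason is invented:

```python
import unittest

class TriageExamples(unittest.TestCase):
    # Analogue of marking a test XFAIL: it still runs, but a failure
    # is expected and does not count against the suite.
    @unittest.expectedFailure
    def test_always_fails(self):
        self.assertEqual(1 + 1, 3)  # deliberately wrong

    # Analogue of skipping an intermittent test to cut noise, keeping
    # the tracking-bug reference in the reason string.
    @unittest.skip("intermittent; tracked in bugzilla (hypothetical bug 12345)")
    def test_intermittent(self):
        self.assertTrue(True)

suite = unittest.defaultTestLoader.loadTestsFromTestCase(TriageExamples)
result = unittest.TestResult()
suite.run(result)
print(len(result.expectedFailures), len(result.skipped), result.wasSuccessful())
```

The nice property of XFAIL over skip is that an expected failure still runs, so you find out (as an "unexpected success") the moment someone fixes the underlying bug.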

I have several people on my team now who are working their way through the bug database and will be making a point of correcting the more egregious issues on Linux. We’re also working towards getting Android supported via the lldb-gdbserver (llgs) remote support.

Hope that helps!

Sincerely,
Todd Fiala

I should also add that, as of this moment, we have several tests (5 or 6 on Linux) that fail intermittently on Linux. Some of them are due to the multithreaded test runner and attempts by multiple tests to grab resources (like network ports) without any kind of mediator. We’ll be addressing these in the near future. I haven’t flipped them to skipped since they tend to run fine in solo mode (running one test at a time), but realistically we need to either fix them for multithreaded runs or skip them if they remain intermittent under load and resource contention.
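This isn’t lldb’s actual fix, but one common mediator-free way to avoid port collisions between concurrently running tests is to let the kernel hand out the port instead of hard-coding one:

```python
import socket

def reserve_free_port():
    """Bind to port 0, which asks the kernel for any unused TCP port.
    Each concurrent test doing this gets a distinct port, avoiding the
    collisions seen when parallel tests all grab the same fixed port."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.bind(("127.0.0.1", 0))  # port 0 = "pick any free port"
    port = sock.getsockname()[1]
    return sock, port

# Two concurrent reservations are guaranteed distinct while both
# sockets remain open.
a, port_a = reserve_free_port()
b, port_b = reserve_free_port()
print(port_a, port_b)
a.close()
b.close()
```

The caveat is the window between closing the reserving socket and the test’s server actually binding the port; tests that can pass the open socket (or its fd) straight to the server avoid even that race.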

MacOSX has more tests failing right now (last I checked, more like 10+). Also, MacOSX tests are not running in parallel for some reason we haven’t yet fully diagnosed (but there is a bugzilla bug on it). I went so far as to verify that the threads queuing up test runs were all waiting on their python subprocess calls appropriately, but only one at a time was actually getting through. We’ll want to get those tests running in parallel so that they’re more agreeable to run - on Linux I can run the test suite in well under 2 minutes, whereas on MacOSX the combination of fewer cores and no multithreading means it takes 20 - 45 minutes. I’m sure that’s a significant disincentive for running the tests there.
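The intended behavior of the parallel runner - worker threads each blocking in a python subprocess call while the others keep making progress - looks roughly like this toy sketch (not the actual test driver):

```python
import subprocess
import sys
import time
from concurrent.futures import ThreadPoolExecutor

def run_one(n):
    # Stand-in for invoking one test file via the subprocess module, as
    # the multithreaded runner does. The thread blocks here while the
    # child runs, but other worker threads keep dispatching their own
    # children concurrently.
    proc = subprocess.run(
        [sys.executable, "-c", "import time; time.sleep(0.3)"],
        check=True,
    )
    return proc.returncode

start = time.time()
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(run_one, range(4)))
elapsed = time.time() - start
print(results, round(elapsed, 1))
```

Run serially, four 0.3-second children would take at least 1.2 seconds; with the pool actually overlapping them, wall time stays close to a single child’s runtime - which is exactly the overlap the MacOSX runs were failing to achieve.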

And now to answer both of your actual questions :P

  • What is the expectation as to which tests will be run before submission of patches?

My take is you should run tests for your platform at a bare minimum. So, if you’re developing on Linux, run ‘ninja check-lldb’ or the equivalent on your platform, and there should be no new tests showing up in the list of failed tests. (In an ideal world, which we’ll get to soonish, there should be no tests failing here, period, but as it stands, per earlier in this thread, there are some failing right now.)

Right now this will just run local tests on your platform. I’m okay with having the build bots handle the remote test runs.

If tests break on other platforms after a change goes in, the expectation I have is that the author works to get those resolved quickly, or we revert the change until we can work out the right fix for the broken platforms (which might include reworking the change that triggered the break).

  • What is the current status for the Linux tests? Is anyone already looking at bringing the test coverage to parity with Darwin?

That’s the bit I was answering with my first two posts.

Awesome; thank you for the full summary, Todd! My take-homes from this are a) always run the full test suite on Linux, and b) don’t worry about the test coverage & already broken tests, as it’s being worked on.

I had thought about taking a detour to work on some of them, but it’s not where I most want to be working in the code base, so I’m glad to have an excuse not to :-}.

– Randy

I had thought about taking a detour to work on some of them, but it’s not where I most want to be working in the code base, so I’m glad to have an excuse not to :-}.

Heh okay.

On a related but broader subject, one thing we haven’t talked about much on the list is the idea of writing tests in general. One thing that I’ve found helpful on many projects is (1) making sure I add tests that cover new code I write, and (2) writing tests that expose issues discovered, ensuring those tests fail before the fix and pass after the fix. Those of you who practice TDD-friendly methodologies will no doubt be familiar with this already, but many of us may be unfamiliar with that approach.
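A tiny, hypothetical illustration of point (2) - the function names and the bug are invented, only the workflow matters: write the test against the buggy code first, watch it fail, then confirm it passes once the fix is in.

```python
def parse_version(text):
    """Buggy first attempt: chokes on a leading 'v' prefix."""
    return tuple(int(part) for part in text.split("."))

def parse_version_fixed(text):
    """The fix: strip an optional 'v' prefix before splitting."""
    return tuple(int(part) for part in text.lstrip("v").split("."))

def regression_test(fn):
    """The test written to expose the reported bug. It should fail
    against the original code and pass once the fix is applied."""
    try:
        return fn("v1.2.3") == (1, 2, 3)
    except ValueError:
        return False

print(regression_test(parse_version))        # fails: the bug is exposed
print(regression_test(parse_version_fixed))  # passes: the fix holds
```

Once the fix lands, the test stays in the suite and guards against the same regression sneaking back in later.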

I’d like to suggest that we add tests that exercise our new features so we can make sure we’re aware if we end up breaking them in the future.

If you’re looking to write a test and you’re not sure how to go about it, please feel free to chime in here. I’m sure many of us can give suggestions or help on where to start. It took me a while to figure out how to write the tests, but reading a few of the existing ones before trying to write my own helped.

Just some thoughts to chew on.