[lit] RFC: Per test timeout

Hi,

A feature I've wanted in lit for a while is a having a timeout per
test. Attached
are patches that implement this idea.

I'm e-mailing llvm-dev rather than llvm-commits
because I want to gather more feedback on my initial implementation and
hopefully some answers to some unresolved issues with my implementation.

Currently in lit you can set a global timeout for
all of the tests but not for each individual test.

The attached patches add

* Support for a new ResultCode called TIMEOUT
* A new command line option --max-individual-test-time
* Support for running external and internal ShTests with a per test timeout
* Support for running GTests with a per test timeout

I wanted to get some initial feedback on the implementation.

* If a timeout is requested the Python psutil module is required.
  This module does not ship with Python but is availble via pip
  (or on Linux your distribution's package manager). How do people feel
  about this? I don't like adding extra dependencies but this module
  makes it really easy to kill a process and all its children recursively
  in a platform neutral way. Note that if a per test timeout is not requested
  then the psutil module is not imported and lit acts just like it did
before my patches.

* If the platform running lit doesn't have psutil installed and a
timeout is requested
  an exception will be thrown. Should we provide a more friendly error message?
  If so where should this go in lit's code?

* I've not tested these patches on OSX or Windows. Is anyone on those
platforms willing
  to give them a try?

* The behaviour of --max-individual-test-time is a little at odds with
the behaviour of --max-time. --max-time will mark unexecuted tests as
UNRESOLVED whereas --max-individual-test-time will mark a test that
ran out of time as TIMEOUT. Is this okay?

* @Chris Matthews. Does the code that emits xunit xml files need to
change in anyway?

* @Daniel Dunbar. If these changes (in some form) end up being
committed would you be happy
  to push a new release of lit to PyPy?

0001-lit-Partially-implement-support-of-per-test-timeout-.patch (11 KB)

0002-lit-Implement-per-test-timeout-when-using-a-ShTest-a.patch (7.42 KB)

Hi,

A feature I've wanted in lit for a while is a having a timeout per
test. Attached
are patches that implement this idea.

Cool, I hope this succeeds. I tried implementing per-test timeouts before, and couldn't get it to work in all cases. The review eventually fizzled out, and I abandoned it.

Here's that old review: ⚙ D6584 Add per-test timeouts to lit Perhaps you can cannibalize testcases from it.

I'm e-mailing llvm-dev rather than llvm-commits
because I want to gather more feedback on my initial implementation and
hopefully some answers to some unresolved issues with my implementation.

Currently in lit you can set a global timeout for
all of the tests but not for each individual test.

The attached patches add

* Support for a new ResultCode called TIMEOUT

When I implemented mine, one of the first revisions had this, but then @ddunbar said he'd prefer I didn't add a new ResultCode.

* A new command line option --max-individual-test-time

I think you should call it `--timeout=`, and then say in the description that it's a per-test timeout.

* Support for running external and internal ShTests with a per test timeout
* Support for running GTests with a per test timeout

I wanted to get some initial feedback on the implementation.

* If a timeout is requested the Python psutil module is required.
   This module does not ship with Python but is availble via pip
   (or on Linux your distribution's package manager). How do people feel
   about this? I don't like adding extra dependencies but this module
   makes it really easy to kill a process and all its children recursively
   in a platform neutral way. Note that if a per test timeout is not requested
   then the psutil module is not imported and lit acts just like it did
before my patches.

This must be the missing piece... I couldn't get my implementation to work without resorting to Python 3.x features (which is incompatible with a 2.x minimum version).

* If the platform running lit doesn't have psutil installed and a
timeout is requested
   an exception will be thrown. Should we provide a more friendly error message?

Yes.

   If so where should this go in lit's code?

Not sure off the top of my head, but probably somewhere near the rest of the argument parsing stuff. I'll have a look later.

* I've not tested these patches on OSX or Windows. Is anyone on those
platforms willing
   to give them a try?

Yes, I'll give it a whirl on Darwin early next week.

Mind squashing your two patches, throwing them on Phabricator, and cc-ing llvm-commits?

Thanks!

Jon

Hi,

A feature I've wanted in lit for a while is a having a timeout per
test. Attached
are patches that implement this idea.

Cool, I hope this succeeds. I tried implementing per-test timeouts
before, and couldn't get it to work in all cases. The review eventually
fizzled out, and I abandoned it.

Here's that old review: ⚙ D6584 Add per-test timeouts to lit Perhaps you can
cannibalize testcases from it.

I'm e-mailing llvm-dev rather than llvm-commits
because I want to gather more feedback on my initial implementation and
hopefully some answers to some unresolved issues with my implementation.

Currently in lit you can set a global timeout for
all of the tests but not for each individual test.

The attached patches add

* Support for a new ResultCode called TIMEOUT

When I implemented mine, one of the first revisions had this, but then
@ddunbar said he'd prefer I didn't add a new ResultCode.

* A new command line option --max-individual-test-time

I think you should call it `--timeout=`, and then say in the description
that it's a per-test timeout.

* Support for running external and internal ShTests with a per test
timeout
* Support for running GTests with a per test timeout

I wanted to get some initial feedback on the implementation.

* If a timeout is requested the Python psutil module is required.
   This module does not ship with Python but is availble via pip
   (or on Linux your distribution's package manager). How do people feel
   about this? I don't like adding extra dependencies but this module
   makes it really easy to kill a process and all its children
recursively
   in a platform neutral way. Note that if a per test timeout is not
requested
   then the psutil module is not imported and lit acts just like it did
before my patches.

This must be the missing piece... I couldn't get my implementation to
work without resorting to Python 3.x features (which is incompatible
with a 2.x minimum version).

* If the platform running lit doesn't have psutil installed and a
timeout is requested
   an exception will be thrown. Should we provide a more friendly
error message?

Yes.

   If so where should this go in lit's code?

Not sure off the top of my head, but probably somewhere near the rest of
the argument parsing stuff. I'll have a look later.

* I've not tested these patches on OSX or Windows. Is anyone on those
platforms willing
   to give them a try?

Yes, I'll give it a whirl on Darwin early next week.

I implemented the fixes I suggested, and my testcases from before. Works for me on Darwin. See attached.

Cheers,

Jon

add_per_test_lit_timeouts.diff (17.5 KB)

Hi,

Cool, I hope this succeeds. I tried implementing per-test timeouts before, and couldn’t get it to work in all cases. The review eventually fizzled out, and I abandoned it.

Here’s that old review: http://reviews.llvm.org/D6584 Perhaps you can cannibalize testcases from it.

Thanks for that. I’ll take a look.

I’m e-mailing llvm-dev rather than llvm-commits
because I want to gather more feedback on my initial implementation and
hopefully some answers to some unresolved issues with my implementation.

Currently in lit you can set a global timeout for
all of the tests but not for each individual test.

The attached patches add

  • Support for a new ResultCode called TIMEOUT

When I implemented mine, one of the first revisions had this, but then @ddunbar said he’d prefer I didn’t add a new ResultCode.

Okay. I’ll bear that in mind. I currently want to distinguish between a TIMEOUT and UNRESOLVED (I’m not sure what other conditions can lead to this result code) so I will keep it for now.

  • A new command line option --max-individual-test-time

I think you should call it --timeout=, and then say in the description that it’s a per-test timeout.

I agree a shorter name would be nicer. I’m worried about it being confused with --max-time though.

  • Support for running external and internal ShTests with a per test timeout
  • Support for running GTests with a per test timeout

This must be the missing piece… I couldn’t get my implementation to work without resorting to Python 3.x features (which is incompatible with a 2.x minimum version).

I did do brief testing with Python 2.7. It seemed to work okay.

  • If the platform running lit doesn’t have psutil installed and a
    timeout is requested
    an exception will be thrown. Should we provide a more friendly error message?

Yes.

Okay. I’ll add this before submitting to phabricator.

Mind squashing your two patches, throwing them on Phabricator, and cc-ing llvm-commits?

I’m travelling so I can’t do it right now but will do tomorrow.

Thanks,
Dan.

Hi,

> Cool, I hope this succeeds. I tried implementing per-test timeouts
before, and couldn't get it to work in all cases. The review eventually
fizzled out, and I abandoned it.
>
> Here's that old review: ⚙ D6584 Add per-test timeouts to lit Perhaps you can
cannibalize testcases from it.

Thanks for that. I'll take a look.
>
>>
>> I'm e-mailing llvm-dev rather than llvm-commits
>> because I want to gather more feedback on my initial implementation and
>> hopefully some answers to some unresolved issues with my implementation.
>>
>> Currently in lit you can set a global timeout for
>> all of the tests but not for each individual test.
>>
>> The attached patches add
>>
>> * Support for a new ResultCode called TIMEOUT
>
> When I implemented mine, one of the first revisions had this, but
then @ddunbar said he'd prefer I didn't add a new ResultCode.

Okay. I'll bear that in mind. I currently want to distinguish between a
TIMEOUT and UNRESOLVED (I'm not sure what other conditions can lead to
this result code) so I will keep it for now.

The reason not to is that there are lots of out-of-tree scripts that parse the output of LIT (which was designed to match DejaGnu's output). It's best not to force everyone update their scripts. Therefore tests that run past their allotted time should be marked as FAIL.

>
>> * A new command line option --max-individual-test-time
>
> I think you should call it `--timeout=`, and then say in the
description that it's a per-test timeout.

I agree a shorter name would be nicer. I'm worried about it being
confused with --max-time though.

The --help description should be good enough to distinguish them. It'll be important however to keep their naming consistent across the variables that are used to implement this.

Hi,

I've tinkering my patch today and I've placed the new version on
phabricator for review

http://reviews.llvm.org/D14706

Okay. I'll bear that in mind. I currently want to distinguish between a
TIMEOUT and UNRESOLVED (I'm not sure what other conditions can lead to
this result code) so I will keep it for now.

The reason not to is that there are lots of out-of-tree scripts that parse
the output of LIT (which was designed to match DejaGnu's output). It's best
not to force everyone update their scripts. Therefore tests that run past
their allotted time should be marked as FAIL.

Okay. I've already put my patch on phabricator and I have not changed
anything with respect to this.

For my particular use case it is not desirable to have FAIL mean
either the execution failed in some way or the test ran out of time.
If we can't reach consensus here it might be necessary to add some
sort of flag (--deja-gnu-result-codes) that is on by default that
causes TIMEOUT to appear as UNRESOLVED but can be disabled.

>
>> * A new command line option --max-individual-test-time
>
>
> I think you should call it `--timeout=`, and then say in the
description that it's a per-test timeout.

I agree a shorter name would be nicer. I'm worried about it being
confused with --max-time though.

The --help description should be good enough to distinguish them. It'll be
important however to keep their naming consistent across the variables that
are used to implement this.

Again I haven't changed the naming I used in my initial patch.
Although ``--timeout=`` is convenient to use from the command line
(and I'd be willing to change that)
using that name inside lit's code is very ambiguous so I would not
want to use that name inside the LitConfig object.

Thanks,
Dan.