Hi,
thanks a lot for fixing the timeout issue on such short notice.
Sure thing. I didn't mind at all once I understood the context in which
they were being used.
I didn't think I'd find myself defending them, as I remember being quite
upset when they went in, but they have proven useful in stabilising the
build bots,
Yes, I understand. I am definitely not a fan of tests that sometimes give
useful signals (tests that don't lock up), but sometimes lock up (and
therefore don't give useful signals). But since the lock-up is catchable
via the timeout, and since we can figure that out definitively, this isn't
quite so bad as it seems at first blush.
and I think it's likely you'll need them as well.
I am hopeful the test rerun mechanism generally handles it, but that'll
only be the case if the timeout is triggered by heavy load. Otherwise, the
rerun phase will be no different. Time will tell.
I'll now try to add a nicer way to expect timeouts so that we don't
have the hack in the new runner as well. I'll add a new message, like
you did for the flakey decorator.
Feel free to mimic what I did with the flakey decorator, sending a new test
event that can additively apply the "expected timeout" marking to the test
method run. In fact, if you get the decorator set up and put in a
# TODO @tfiala add test event code here
I can get the rest of it in.
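Something along these lines is roughly what I'm picturing; it's only a
sketch, and the decorator name and the attributes it sets are placeholders
rather than what we'd actually land:

def expectedTimeout(bugnumber=None):
    """Placeholder sketch: mark a test method as allowed to time out."""
    def decorator(func):
        # The runner / result formatter would check these attributes when
        # a timeout fires for this test method, and report it as an
        # expected timeout instead of an error.
        func.__expected_timeout__ = True
        func.__expected_timeout_bug__ = bugnumber
        # TODO @tfiala add test event code here
        return func
    return decorator

Then the test method just gets the decorator on it, and the event side can
fill in the rest.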
I'm a bit uneasy about adding another kind of decorator though. What
would you (and anyone else reading this) think about adding this
behavior to the existing XFAIL decorators?
This way, "timeout" would become just another way in which a test can
"fail", and any test marked with an XFAIL decorator would be eligible
for this treatment.
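To make it concrete, on the results side I'm imagining something roughly
like this (just a sketch; these names aren't our actual runner API, and the
__xfail__ attribute is a stand-in for whatever the XFAIL decorator already
sets):

def classify_result(test_method, outcome):
    """Map a raw outcome ('pass', 'fail', 'timeout') to a reported status."""
    xfail = getattr(test_method, "__xfail__", False)  # set by the XFAIL decorator
    if outcome == "timeout":
        # Timeout becomes just another way the test can "fail", so an
        # XFAIL-marked test that times out is an expected failure.
        return "expected failure (timeout)" if xfail else "timeout"
    if outcome == "fail":
        return "expected failure" if xfail else "failure"
    if outcome == "pass":
        return "unexpected success" if xfail else "success"
    return outcome

The output could still list and count the timeouts separately; we'd just
stop distinguishing them at the decorator level.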
I think I'd be okay with that. I like reducing the number of ways we
support failure decoration, so this seems reasonable.
We would lose the ability to individually expect "failures" and
"timeouts", but I don't think that is really necessary, and I think it
will be worth the extra maintainability we get from having fewer test
decorators.
OTOH, the piece we then lose is the ability to have an XFAIL mean "Hey this
test really should fail, we haven't implemented feature XYZ (correctly or
otherwise), so this better fail." In that semantic meaning, an unexpected
success would truly be an actionable signal: either the test is now
passing because the feature now works (actionable signal option A: the
XFAIL should come off after verifying), or the test is passing because
it is not testing what it thought it was, and the test needs to be modified
to more tightly bound the expected fail condition (actionable signal option
B).
So it eliminates the definiteness of an XFAIL ideally meaning "this really
should fail," turning it into "it is permissible for this to fail."
All that said, our Python test suite is so far away from that ideal right
now. The highest level output of our test suite that I care about is "if
tests run green, this is a good build" and "if tests run red, this is a
bad build." I don't see the timeout being rolled into XFAIL as hurting
that. It seems reasonable to roll them together at this time. And the
test output will list and count the timeouts.
So I'd be okay with that at this time, for the sake of simplifying markup
for tests.
What do you think?