lldb tests and tear down hooks

There’s a subtle bug that is pervasive throughout the test suite. Consider the following seemingly innocent test class.

class MyTest(TestBase);
def setUp():
TestBase.setUp() #1

Do some stuff #2

self.addTearDownHook(lambda: self.foo()) #3

def test_interesting_stuff():
pass

Here’s the problem. As a general principle, cleanup needs to happen in reverse order from initialization. That’s why, if we had a tearDown() method, it would probably look something like this:

def tearDown():

Clean up some stuff #2

TestBase.tearDown() #1

This follows the pattern in other languages like C++, for example, where construction goes from base → derived, but destruction goes from derived → base.

But if you add these tear down hooks into the mix, it violates that. tear down hooks get invoked as part of TestBase.tearDown(), so in the above example the initialization order is 1 → 2 → 3 but the teardown order is 2 → 1 → 3 (or 2 → 3 → 1, or none of the above depending on where inside of TestBase.tearDown() hook the hooks get invoked).

To make matters worse, tear down hooks can be added from arbitrary points in a test’s run, not just during setup.

The only way I can see to fix this is to delete this tearDownHook mechanism entirely. Anyone who wants it can easily reimplement this in the individual test by just keeping their own list of lambdas in the derived class, overriding tearDown(), and running through their own list in reverse order before calling TestBase.tearDown().

I don’t intend to do this work right now, but I would like to do it in the future, so I want to throw this out there and see if anyone has thoughts on it.

I think we can remove these, provided there is a way to mimic the
functionality they are used for now, which I think shouldn't be hard.
Anything which was set up in the setUp() method should be undone in
tearDown(). Anything which was set up in the test method, can be
undone using a try-finally block. Is there a use case not covered by
this?

pl

Yea, that’s what I think too. I think this mechanism was probably invented to just to save some code and promote reusability, but in practice leads to these kinds of problems.

I think it was mostly done to provide an exception safe way to cleanup stuff without having to override TestBase.tearDown(). I am guessing this cleanup happens on TestCase.tearDown() and not after the current test case right?

I could see it being used to cleanup after a single test case in case you have:

class MyTest(TestBase):
    def test_1(self):
        self.addTearDownHook(lambda: self.foo())
        raise ValueError
    def test_2(self):
        self.addTearDownHook(lambda: self.bar())
        raise ValueError

Are these tearDowns happening per test function, or during class setup/teardown?

Well you can see them getting added via self.addTearDownHook(), so that means they’re called through an instance. Specifically, it happens in Base.tearDown(self), so it’s basically identical (in concept) to if the relevant handlers were called in the implementation of MyTest.tearDown(), but different in order.

I agree that it’s useful in principle to be able to do what you suggest in your example, but there’s just no way to guarantee the right ordering if you let the base class run the handlers. If there actually were a tearDown() function in your example, to be correct it would need to look like this:

class MyTest(TestBase):
def tearDown(self):

run the teardown hooks

Do the inverse of setUp()

super.tearDown()

def test_1(self):
self.addTearDownHook(lambda: self.foo())
raise ValueError
def test_2(self):
self.addTearDownHook(lambda: self.bar())
raise ValueError

One possible solution is to shift burden of maintaining the hooks list to the individual test case. E.g.

class MyTest(TestBase):
self.hooks = []
def tearDown(self):
self.runTearDownHooks(self.hooks) # This could be implemented in TestBase, since now we can call it with our list at the right time.

Do the inverse of setUp()

super.tearDown()

def test_1(self):
self.hooks.append(lambda: self.foo())
raise ValueError
def test_2(self):
self.hooks.append(lambda: self.bar())
raise ValueError

Almost no extra code to write, and should be bulletproof.

(And side note: if you’re pushing a “lambda: self.foo()” with no arguments, the lambda is unneeded and you can just push “self.foo” — that cleanup hook pushed on most tests at the end of the file is a perfect example of an unneeded level of lambda indirection).

So this is biting us again, for unrelated reasons. I thought of a potentially interesting solution. Right now each test class has a setUp and tearDown method. This method is called before and after EVERY test method in the class. So you can’t put anything in these methods unless it’s common to all test methods. If one particular test method creates a file on the filesystem and wants to clean up that file, you can’t use setUp / tearDown because then every test method will be creating / cleaning up files that they don’t need.

So what if tests could be either a method or a nested class. If it’s a nested class, it could provide setUp, tearDown, and run methods. These setup and teardown methods can do whatever they want specific to the individual test, and it also provides the exception safe way to clean up. For example, instead of this:

def test_method(self):
self.addTearDownHook(/* do cleanup here */)

run test

You could have this:

def TestClass(TestMethodBase):
def tearDown(self):

/* do cleanup here */
TestMethodBase.tearDown(self)
def run(self):

run test

It’s a little bit more code, but a lot more flexible. And we still support the old method of using test methods too, so you only use this when necessary.

Thoughts?

I don't think this is supported by unittest, which is what determines
what constitutes a "test". Nothing that couldn't be hacked around, but
I don't see the added value over just using a finally block.

Was there any reason we couldn't use that? I don't think it prevents
code reuse, and it's a standard way of doing things that everyone
should be familiar with.

pl

There’s no reason a finally block inside each test method wouldn’t work, and that’s probably the simplest solution.