Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

It's very annoying that fixtures can't yield inside a nursery/cancel scope #55

Closed
njsmith opened this issue Jul 25, 2018 · 3 comments
Closed

Comments

@njsmith
Copy link
Member

njsmith commented Jul 25, 2018

Right now, writing code like this is broken:

@pytest.fixture
async def background_task_fixture():
    async with trio.open_nursery() as nursery:
        await nursery.start(...)
        yield
        nursery.cancel_scope.cancel()

The problem is that the cancel scope semantics assume that they're strictly nested and can "see" all exceptions that happen inside them, but the convention for pytest fixtures is that we never throw in any exceptions. Also, right now we setup and teardown fixtures in a strict sequential order, but it might be nice to someday do that concurrently, and if we do then this example would really explode (you'll get assertion errors inside trio/_core/_run.py). (Previous discussion: #25 (comment))

For this particular case, that's not a big deal – you can write it as

@pytest.fixture
async def background_task_fixture(test_nursery):
    await test_nursery.start(...)

and that's actually better than the original.

But, as @touilleMan points out, there are other cases where this is really quite a problem. Specifically, cases where the use of a nursery is abstracted away inside some other context manager. For example, consider these two fixtures:

@pytest.fixture
async def tcp_client():
    async with await open_tcp_stream(...) as stream:
        yield stream

@pytest.fixture
async def websocket_client():
    async with open_websocket(...) as websocket:
        yield websocket

The first one is fine; the second one uses a nursery internally, so it's broken. But to realize that, we have to know what's going on internally in some random third-party library, and that's pretty terrible. So... what can we do to make this better?

@njsmith
Copy link
Member Author

njsmith commented Jul 25, 2018

So first, there is a way to rewrite our websocket fixture above so that it works. This is ugly, but at least it's a workaround while we're figuring out better options:

async def hold_websocket_in_task(*, task_status=trio.TASK_STATUS_IGNORED):
    async with open_websocket(...) as websocket:
        task_status.started(websocket)
        # Do need to be prepared for an exception here... in this example it's fine, but in others it 
        # might take some care
        await trio.sleep(float("inf"))

@pytest.fixture
async def websocket_client(test_nursery):
    return await test_nursery.start(hold_websocket_in_task)

But surely we can do better than that though.

One temptation is to say fine, this wouldn't be a problem if pytest fixtures actually worked like context managers, so maybe trio fixtures should act like context managers. (Well... this wouldn't let us parallelize fixture setup/teardown, but it would at least solve the sequential case.) However, this would break a lot of existing fixture patterns, e.g. this example from the docs:

   @pytest.fixture
   async def rolback_db_connection():
       # Setup code
       connection = await some_async_db_library.connect(...)
       await connection.execute("START TRANSACTION")

       # The value of this fixture
       yield connection

       # Teardown code, executed after the test is done
       await connection.execute("ROLLBACK")

Also, the way our fixture handling works, you actually need to know a pretty complex set of rules before you can tell which fixtures are Trio fixtures, and which are pytest fixtures. Especially the "if it depends on a trio fixture, it's a trio fixture" rule would be pretty confusing if it also totally changed the semantics of yield in this subtle-yet-fundamental way. If we were to go this way, I think we'd really have to force people to use @trio_fixture instead of automagically supporting @pytest.fixture, because the magic is too, well, magic. (I guess this wouldn't be the end of the world anyway?)

But really the problem isn't exceptions in general... and we don't particularly want to allow fixtures to catch and transform errors from tests; that could get super confusing. The problem is specifically Cancelled exceptions. Maybe we can somehow make it so those exceptions get thrown into fixtures, but other ones don't? And in fact it's not even all Cancelled exceptions we want to throw in, just ones that actually belong to cancel scopes opened inside that fixture – in particular, if a fixture didn't actually open any cancel scopes, then throwing in Cancelled is not really cool. But it's hard to figure out which exceptions those are from the outside.

Also, we don't want cancel scopes inside fixtures to be wrapped around other fixtures or the test code, probably.. i.e. if they become cancelled it shouldn't cause other tasks to see Cancelled errors. Well, except if there's an unexpected error, I guess... if a fixture background task crashes, we should cancel the test, and maybe all the fixtures that depend on it, since the thing they were depending on is not working anymore. It seems rude to go cancelling fixtures that don't depend on it though, as opposed to letting them clean up normally. I'm not sure all fixtures are prepared for cancellation.

(Actually, there's an argument that the current test_fixture is too coarse-grained too, and that what we instead want is a per-fixture nursery that automatically gets cancelled and waited on at the moment when that specific fixture would be cleaned up.)


So.... putting all that together..... maybe there is a solution. Looking at the "workaround" above, the trick is that we push the actual fixture setup/teardown off into a separate task, so that it can't enclose anything, and it mostly can't get any exceptions that aren't its fault – the exception being the final Cancelled from the test_nursery being torn down. But that's OK, because even if it doesn't enclose the test, the nursery block still has the right lifetime, and that's what's really important – it means the test can use the websocket just fine.

What if we make this our general strategy for running fixtures? Every fixture gets its own task, that runs both the setup and teardown code, and we carefully arrange the lifetimes of these tasks so that we trigger the teardown code at the right moment. And... somehow manage the flow of exceptions carefully so that only the right ones go in.

I think I need a whiteboard or something to work out the details, and this is already long, so I'm going to hit post. But that's the best idea I've had so far.

@njsmith
Copy link
Member Author

njsmith commented Aug 2, 2018

Still working on this. Here are some possibly-incomprehensible notes I scribbled down, so they don't get lost:

Setup:

  1. Get a list of all trio fixture objects
  2. Mark the topmost ones as depended on by the test
  3. Then loop through and mark depended on ones in general. Now every "runnable thing" (fixtures or the test) knows which other runnable things it depends on, and which ones depend on it.
  4. Then loop through and start every runnable thing in its own task

For each fixture:

  1. Wait for dependencies to be setup

  2. If any crashed, abort (and tell reverse dependents to abort)

  3. Execute the fixture's setup code. If this raises an exception:

    1. save the error in a list somewhere
    2. mark ourselves as crashed, so our reverse dependents know to abort
    3. mark ourselves as finished tearing down
  4. Announce that we're finished setting up (so our reverse-dependents know)

  5. Wait for reverse dependents to finish their tear down. If this raises an exception (which can only happen if the fixture opened a nursery/cancel scope – usually this will be because some background task the fixture is running has crashed and triggered the nursery cleanup logic):

    1. Discard Cancelled errors; anything else (I guess just KeyboardInterrupt?) put into the list
    2. Cancel the test (the actual top-level test)
    3. Resume waiting for reverse dependents, this time with a cancel-scope shield
  6. Execute the fixture's teardown code. If this crashes:

    1. Save the error in the list
  7. Announce that we're finished tearing down (so our dependents know)

For the test itself:

  1. Wait for dependents to be setup

  2. If any crashed, abort

  3. Set up a cancel scope (so step 5.ii above can cancel it if needed)

  4. Run test. If it crashes:

    1. Save the error in the list
  5. Tell everyone that we're torn down

After everything has finished: take our list of errors, and raise as MultiError

Replace the current nursery fixture with a magic fixture_nursery that in each fixture gives you a nursery whose lifetime matches that fixture. Implementation: we make a @pytest_fixture called fixture_nursery that just evaluates to None or a sentinel value. Then when evaluating a fixture, if it takes this as an argument, set up a nursery and use it for the fixture_nursery argument (instead of the sentinel value). After finished executing the fixture teardown code, but before signalling that teardown is complete, cancel and then exit the nursery. (Why cancel the nursery after teardown, instead of before? The nursery surrounds the actual fixture code, so if we cancelled it before then it would also cancel the teardown code.)

@njsmith
Copy link
Member Author

njsmith commented Aug 15, 2018

I just pushed a commit to #50 that implements the strategy described above. It's a huge rework, but I'm pretty happy with how it came out.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant