Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Returning a value from inside a nursery block behaves counterintuitively #1493

Open
oremanj opened this issue May 6, 2020 · 4 comments
Open

Comments

@oremanj
Copy link
Member

oremanj commented May 6, 2020

Quick: what do you expect the result of this code to be?

async def calculate_1():
    async with trio.open_nursery() as nursery:
        nursery.start_soon(trio.sleep_forever)
        await trio.testing.wait_all_tasks_blocked()
        nursery.cancel_scope.cancel()
        return 42

async def calculate_2():
    with trio.CancelScope() as scope:
        async with trio.open_nursery() as nursery:
            nursery.start_soon(trio.sleep_forever)
            await trio.testing.wait_all_tasks_blocked()
            scope.cancel()
            return 42

@trio.run
async def test():
    print(await calculate_1())
    print(await calculate_2())

In both cases, the nursery nested child task exits normally (returning a value) while some other task in the nursery exits with an exception (Cancelled, in this case). One might expect both cases to behave equivalently. But I get 42 from calculate_1() and None from calculate_2(). When the innermost context manager decides not to suppress the exception, the return value is lost so the exception can continue to propagate.

One can construct other confusing cases along the same lines, such as:

  • If a background task raises an exception, the return value is also lost.
  • If one returns without remembering to cancel_scope.cancel(), the background tasks keep running, maybe forever. Empirically users seem to be surprised by this -- they know that falling off the bottom of a nursery block waits to join the tasks, but the expectation is that a return would behave differently. Ditto for break and continue if the loop is outside the nursery.

From __aexit__ we can't directly tell the difference between a control flow transfer and a simple fall-off-the-end. (It might be possible with bytecode introspection, but IMO that's much too high a cost to incur on every nursery exit, and the patterns you'd need to recognize changed substantially in 3.8.) So our options here are somewhat limited.

One option would be to not change anything in code, but just document this pitfall. Maybe even in a section of "common pitfalls" with some others we've seen.

The only other option I can think of right now would be to add a nursery method that can be called to indicate "I'm done with this nursery, I've completed the high-level task I was using it for, please don't interfere with me as I return that result". This might look something like: cancel all tasks in the nursery and filter out any Cancelled exceptions. One could imagine a stronger version that filters out all exceptions, though that's getting close to dangerous territory IMO.

Perhaps this should be a feature of "service nurseries" (as discussed in #147) rather than all nurseries, since it reinforces the idea that there's a difference between the nested child task and all the other tasks.

@Zac-HD
Copy link
Member

Zac-HD commented Jul 30, 2024

Is there a clear use-case for return (or break/continue for an external loop) inside a nursery?

All I can think of is "the code is a bit shorter", and given the potential confusion here it really doesn't seem worth saving the line or two to dedent the relevant statements. If that's it, it seems like a useful lint rule!

@TeamSpen210
Copy link
Contributor

I can only really think of two situations where it'd make sense, both where you probably want to do something different anyway. First is a task receiving some sort of "quit" message and wanting to exit, but that should probably also cancel the scope first. So the return could probably just be replaced by a checkpoint(). Second is where you have some tasks doing some sort of search, then you return whenever you found the thing. But that would probably need to handle multiple detecting the value, so you'd want to accumulate into a list or something. And also cancel the other tasks, so a checkpoint would work better anyway.

Maybe what would be something more elegant is something like an await scope.cancel_now(), which would cancel the cancel scope, then checkpoint to raise Cancelled. Or errors if the call is not actually done inside the scope. Though aside from the sanity-check it's just two straightforward calls.

@oremanj
Copy link
Member Author

oremanj commented Aug 9, 2024

@TeamSpen210 There's a bunch of historical discussion in #658 that you might find relevant to this question. I would support an async cancel_now() function -- I might name it cancel_me to be clearer that it can only be called on an enclosing scope. While #658 concluded that we didn't really need it (since it's just cancel() + checkpoint()), I think it's still valuable, especially because it could be marked noreturn which would improve comprehensibility to static typing.

@Zac-HD I think a lint rule against return in a nursery block is a very good idea. It should probably also forbid break and continue if they would affect a loop that is outside the nursery.

@Zac-HD
Copy link
Member

Zac-HD commented Sep 7, 2024

We've updated flake8-async with a new lint rule, ASYNC121 / control-flow-in-taskgroup, which warns on these patterns.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants