idea: unbound cancel scopes #607
Comments
Shielding also introduces quite a bit of complexity internally – especially since for any cancel scope, at any time, you can mutate its
For now we can do all this without changing the internals (e.g. That still leaves the question about

@contextmanager
def move_on_at(deadline, *, save_exceptions=False):
    cancel_scope = CancelScope(deadline=deadline)
    with cancel_scope.enter(save_exceptions=save_exceptions) as cancel_status:
        yield cancel_status

And then anyone who wants to set a deadline, AND ALSO be able to adjust the deadline or explicitly cancel the scope, would need to drop down to the lower-level API instead of using

# Currently you can do this:
with move_on_after(10) as cancel_scope:
    ...
    # Whoops, need some more time:
    cancel_scope.deadline += 10
    ...
if cancel_scope.cancelled_caught:
    ...

# But if we went this way, you'd have to go fully explicit:
cancel_scope = CancelScope(deadline=trio.current_time() + 10)
with cancel_scope as cancel_status:
    ...
    # Whoops, need some more time
    cancel_scope.deadline += 10
    ...
if cancel_status.cancelled_caught:
    ...

That's not so terrible, but it's a bit cumbersome. This case is probably relatively rare; in trio's own source code, the only thing we do with the cancel scope from We could make it so the I'm not real happy with any of these options. So let's back up: if all the solutions are bad, maybe we can change the problem. The issue here is that if multiple tasks enter the same cancel scope, then the exception-handling related attributes become ambiguous. What if we ... don't let multiple tasks enter the same cancel scope? Up above I said that this would be fine, and except for this wrinkle it would be; I also suspect it would be useful in some cases. (E.g., you could have a cancel scope that you use to shut down all tasks of type X, where each task of type X enters it when it starts up.) Can we do an end run around that problem... what if we say each cancel scope can only be entered once, but you can somehow clone a cancel scope and then each clone can be entered once? Maybe: If that's the plan, we could potentially go back to having |
Another reason why it'd be nice not to break up the cancel scope object is that then we'd have to figure out what to do with |
@njsmith Really interesting ideas! I especially like the "full decomposed design" route as a way to think about it all and explore possibilities. I also agree in general that it's somehow necessary to "decompose" and "expand" the possibilities before trying to "simplify" the "user experience" or trying to simplify the API too soon (I mean, too soon in the design process. I see it as a kind of "too-early optimization" in the design process, with all that means!). As further help to reason about what "elements" and "responsibilities" are needed and why, I think it would be useful to have a set of emblematic use cases or examples making different uses of the "available" (fully decomposed) APIs. You have already given some examples above, but it would be useful to explore some other use cases trying to make use of the new "available" elements. Some ideas in the form of questions could be:
|
This PR is another use case where unbound scopes would help, I think: urwid/urwid#298 |
@xgid Interesting questions!
Generally, cancellation is triggered by either a timeout or someone calling
What sort of data?
You could in principle have a shield that protects a block of code from specific surrounding scopes/bindings but not others. I'm a bit nervous about this because it feels like it would encourage lots of confusing/fragile hacks, and is complicated to implement. Generally the idea of functions is to be an abstraction over callers – and trio's cancellation system nudges you in that direction by not allowing fine-grained access to information about caller scopes. This does remind me a bit of #147, which discusses the idea of making "soft cancellation" a new state that's in between not-cancelled and cancelled – by default it wouldn't do anything, but a particular piece of code could request that soft-cancellation be upgraded to full cancellation. This could be useful for implementing graceful shutdown. So that's adding additional granularity in a slightly different way: the canceller can send slightly different information (though you still can't select on source of the cancellation). I suppose a super-fancy version of this would be to allow user-defined cancellation states, like allow I'm having trouble coming up with anything that would justify so much complexity, though :-) |
Two issues I frequently struggle with:
Unfortunately, |
Interesting problem! It has some challenges because of the cost of capturing the stack... and I think it's otherwise orthogonal to the issues here, so can you open a separate bug so we can discuss it properly?
Yeah, this is one of the downsides of "stateful cancellation". I think stateful cancellation is still (probably?) the right way to do things, but it does force us to come up with solutions to issues like this. One thing you're probably aware of is |
I see. I totally misunderstood the idea of the "cancel binding"! I thought it was a sort of bidirectional link: for the task to be able to send a "cancel" to a cancel scope and for the cancel scope to notify the cancellation (in-process) to the task. According to your comments, it is only the means for the "cancel scope to task" notification.
My idea was for data that the cancel scope could use to provide information about the task that requested the cancellation or to even block the cancellation process according to some properties of this data... but that only made sense when I thought that the cancel binding was bidirectional. Now that I know your idea of a cancel binding I don't see any useful data to "store" there.
I agree with not giving access to information about "caller scopes"... but I don't see a problem in letting the caller send the cancel scope some information about itself or about the cancellation "reason" when it invokes the cancel operation on the cancel scope. This treats the cancel operation as a method of the cancel scope that can be called explicitly.
That's more or less the idea. Although I don't understand your final: "though you still can't select on source of the cancellation". What do you mean?
You can see it as "cancellation states" or just as "relevant information that the canceller wants to broadcast to the other tasks affected by the cancellation scope so they can better decide what to do". Does that make sense to anyone when seen this way?
First, it doesn't have to be complex. It is not complex if we see it just as a "way to send extra/contextual information to all tasks affected by the cancellation process" by doing it through the cancel scope itself. And second, I insist on not trying to desperately simplify the overall design too soon. It would only deter creativity. We will have time for that. Let's give ourselves some more time to find examples and situations where it may be useful. I don't have them yet either, but let the river flow... and we'll see where it takes us. |
What's the use case? Why should a cancelled task need to know who triggered the cancellation or why? It might need to distinguish hard vs. soft cancel (terminate the connection vs. abort the current operation), but (a) that's a tightly-constrained value instead of an arbitrary reason, (b) often we just don't know (if a nursery cancels you because of an exception in another task, how would it know which type of cancellation to trigger?), (c) you can run the connection in a different cancel scope than the operation, thus the problem may not be that prevalent in the first place. You want to distinguish different types of cancellation? You create different scopes and then decide which scope(s) to run the task under – @njsmith's ideas already let you do that. What's more, I don't want a task to be able to dynamically decide to ignore a cancellation, or what to do depending on an unconstrained tag value. It's a cancellation. You get cancelled. End of story. If you want to do something else, call it something else – i.e. add code to throw an exception into a scope. That would allow the code within the scope to recover, which a cancellation intentionally does not.
I disagree. The current design is "simple". We already are complicating it by splitting off a Trio works because it intentionally limits things you're able to do, which enables new concepts to emerge (cf. "Go[to] Considered Harmful"). We shouldn't add interesting things to it before we have identified a use case that requires them. I don't see a use case for tags. Even if one should emerge, we can add it later. That's better than allowing people to (ab)use a tagged cancel for things cancellation was never intended to do. |
Oh, I see! Yeah, by "cancel binding" I just mean, like "a specific
When initially designing cancel scopes, I considered trying to track the "reason" for a cancellation (e.g., timeout versus someone calling That said, it would be possible to expand from a single boolean to like... multiple booleans. (The "tagged boolean" idea.)
I mean you could say "I only care about cancel scopes tagged with BLAH", but not, "I only care about cancels from cancel scope object X, not cancel scope object Y".
Sure, that kind of low-stakes exploring ideas is exactly what this kind of issue is for. But I'm just noting that this kind of complexity will need to find some pretty compelling use cases before it can graduate from "let the river flow" to something we actually do. Cancellation is inherently a viciously complicated thing to understand and use. Trio works really hard to make it as simple as possible, and it's still far from trivial. So when brainstorming my main goal is to find ideas that will make it easier to understand or solve common practical problems. |
In the discussion in #658, we realized that these two features seem to be essentially incompatible:
See #658 (comment) |
Hello @NJSmit, @smurfix! I'm really sorry for not being able to follow up with this conversation before, but I had some serious health issues that prevented me from doing it. Hopefully I'm better now, so I'll try to comment briefly on some of the great answers that you both made to my questions. It's been a long time since my last contribution and some things have already changed a lot, but I prefer to answer here to the points that I left unanswered. I hope that you understand the reasons.
Well, the reasons are very similar to the ones that drove you to open #626, although I would like to have more than just the traceback because I don't want it just for debugging purposes. I want the tasks to receive as much "context" about the origin/reason/cause of the cancellation as possible, so I can decide in every task what to do (though at the end I will finally follow the cancel anyway, of course.)
I'm not suggesting to ignore/stop the cancellation process, just to have more information on the overall situation. If the timeout expired, we are in a different situation than if it didn't. Maybe I can do more things before exiting.
It's another possibility I had not considered yet. I'll think about it.
I totally agree! I was just suggesting exploratory ideas in the inception phase of the design process. I never intended for them all to end up in the code without a compelling reason!
I have to clarify that the tagged cancels idea was not mine and I think I misunderstood the workings. Now I see that my proposal had nothing to do with the tagged cancels. I was just talking about broadcasting more information to the cancelled tasks attached to the
I see. Thanks for the detailed explanation. Then, maybe we should make the
OK, I totally agree with that! I mean I agree that you can't select on the source of the cancellation, not with the "tagged" part (which I don't care about).
Totally agree! I have yet to come up with the practical problems!... so let's just move on and see what happens! 😛 My thanks to you both for your time and dedication! |
I've recently run into a few places where I want a cancel scope for some code that may or may not already be running. For example, in pytest-trio, if a fixture crashes you want to cancel the main test... but these run in different tasks, so it's tricky to find the main task's cancel scope and put it somewhere that a fixture can get at it, without race conditions.
It's possible to create an object that sort of acts like a cancel scope, but where you can call `cancel` before or after the code inside the scope starts running. But it's fairly tricky to get all the cases right, e.g.:
Maybe we should make this just... how cancel scopes work, always? Right now `open_cancel_scope` is a context manager, so it forces you to immediately enter the scope it creates. But we could reinterpret it as returning an unbound cancel scope object, and then the `with cancel_scope: ...` as entering that scope – it'd even be backwards compatible!
Implementation-wise, I think it'd be almost trivial. The one thing to watch out for is that it'd become possible to attempt to re-enter a scope that you're already inside, which would be complicated (e.g. instead of keeping a set of which tasks are inside the scope, we'd have to keep a dict of task → refcount). For now we should just error out if someone tries to do this. (OTOH, I think having multiple independent tasks entering the same scope is fine and would Just Work.)
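As a rough illustration of the behaviour described above (cancel before or after entry, error on re-entry, several tasks sharing one scope), here is a toy sketch built on asyncio tasks. It is not Trio's implementation: the class name `UnboundScope` and its private helpers are made up for this example, and it ignores deadlines and shielding entirely.

```python
import asyncio


class UnboundScope:
    """Hypothetical toy model of an unbound cancel scope (not Trio's API)."""

    def __init__(self):
        self.cancel_called = False
        self.cancelled_caught = False  # ambiguous once several tasks share the scope
        self._tasks = set()            # tasks currently inside the scope
        self._delivered = set()        # tasks this scope has actually cancelled

    def cancel(self):
        # May be called before anything has entered the scope, or afterwards.
        self.cancel_called = True
        for task in self._tasks:
            self._schedule(task)

    def _schedule(self, task):
        # Deliver from an event-loop callback, so the cancellation only lands
        # while the task is suspended at an await inside the scope.
        task.get_loop().call_soon(self._deliver, task)

    def _deliver(self, task):
        if task in self._tasks and task not in self._delivered:
            self._delivered.add(task)
            task.cancel()

    def __enter__(self):
        task = asyncio.current_task()
        if task in self._tasks:
            # Re-entry would need a task -> refcount dict; just refuse for now.
            raise RuntimeError("this task is already inside the scope")
        self._tasks.add(task)
        if self.cancel_called:
            self._schedule(task)  # cancelled before entry
        return self

    def __exit__(self, exc_type, exc, tb):
        task = asyncio.current_task()
        self._tasks.discard(task)
        delivered = task in self._delivered
        self._delivered.discard(task)
        if delivered and isinstance(exc, asyncio.CancelledError):
            self.cancelled_caught = True
            if hasattr(task, "uncancel"):  # Python 3.11+
                task.uncancel()
            return True  # swallow the cancellation this scope caused
        return False


async def main():
    # Case 1: another task cancels the scope after the worker has entered it.
    scope = UnboundScope()

    async def worker():
        with scope:
            await asyncio.sleep(60)
        return "worker finished early"

    task = asyncio.create_task(worker())
    await asyncio.sleep(0)  # let the worker enter the scope
    scope.cancel()
    result = await task

    # Case 2: the scope is cancelled before any code has entered it.
    early = UnboundScope()
    early.cancel()
    with early:
        await asyncio.sleep(60)

    return result, scope.cancelled_caught, early.cancelled_caught


outcome = asyncio.run(main())
```

Keeping a plain set of tasks is what makes re-entry an error here; supporting it would mean replacing the set with per-task counts, as the paragraph above notes.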
Maybe we should also make `CancelScope` public? Right now it's hidden in order to keep the constructor private, but that would be unnecessary in this approach – in fact `open_cancel_scope(...)` would just be `return CancelScope(...)`, so maybe we'd even want to deprecate it or something.
One limitation of this approach is that `cancelled_caught` would become ambiguous if multiple tasks can enter the same scope. It might not matter.
Alternatives
There's a larger design space here of course. Cancel scopes are inspired in part by C#'s cancellation system, which has "cancel sources" – which let you call `.cancel()`, and set deadlines – and "cancel tokens" – which are read-only objects that let you check whether the corresponding source has been cancelled and what its deadline is. You can also combine multiple tokens together to create a new cancel source, that automatically becomes cancelled when any of the original tokens are cancelled. (I'm not sure why this creates a new source, rather than creating a new token. I think it doesn't matter which way you define the API though, each version can basically be implemented in terms of the other. Also, for some reason C# doesn't actually provide any API for querying for the current deadline given a source or a token, but this is silly so I'm going to ignore it.)
In Trio's current system, cancel scopes = cancel sources, and there is no reified object corresponding to cancel tokens – they're implicit on the cancel stack associated with a task, and you can query this implicit state using `current_effective_deadline()`. So in addition to introducing the idea of an "ambient" cancel token, we're also quite aggressive about collapsing together the different ideas here.
If we wanted to fully decompose the space, you can imagine operations:
`with` block to bind a given cancel token to the current task, which produces a "cancel binding"
`cancelled_caught` attribute)
This is almost certainly too fine-grained a decomposition, but I find it useful to see it all laid out like that... and it does allow for things we can't do right now, like check whether another task's ambient context has been cancelled (by extracting its cancel token and then querying it later). Or a minor feature that curio has, and I envy: if you enter a thread with `run_sync_in_worker_thread`, and then the thread comes back into trio with `BlockingTrioPortal.run`, and the original `run_sync_in_worker_thread` is cancelled... it would be neat if this caused the code inside the `BlockingTrioPortal.run` call to raise a `Cancelled` error that propagated all the way back out of trio, through the thread, and back into trio.
Though actually... the "fully-decomposed" design is still not powerful enough to allow that! I was thinking you could do it by having `run_sync_in_worker_thread` capture the ambient token and then inside `BlockingTrioPortal.run` we could do `with the_ambient_token`... but this doesn't quite work, because it would create a new binding. If `run_sync_in_worker_thread` was cancelled, then the code inside the `BTP.run` call would raise `Cancelled` but that exception would be caught at the `with the_ambient_token`, instead of propagating into the thread and then back into trio. `Cancelled` exceptions are associated with cancel bindings, not cancel tokens or cancel scopes. Hmm! Well, at least the decomposed design gives us useful vocabulary :-).
It's not clear whether propagating cancellation across threads is really that important. But if we do want to do it... [longish text split off into #606, since it doesn't seem to be too related to this issue after all].
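The source / token / binding vocabulary, and the reason re-binding a captured token doesn't propagate the exception back to the original caller, can be sketched in plain Python. This is only a model under stated assumptions: `CancelSource`, `CancelToken`, `Binding`, and `checkpoint` are hypothetical names, each "task" is represented by nothing more than a list acting as its cancel stack, and there are no real threads involved.

```python
class Cancelled(Exception):
    """Remembers which binding it was raised for (an assumption of this sketch)."""

    def __init__(self, binding):
        super().__init__("cancelled")
        self.binding = binding


class CancelToken:
    """Read-only view of a source's state."""

    def __init__(self, source):
        self._source = source

    @property
    def cancelled(self):
        return self._source.cancel_called


class CancelSource:
    """The object you call .cancel() on."""

    def __init__(self):
        self.cancel_called = False
        self.token = CancelToken(self)

    def cancel(self):
        self.cancel_called = True


class Binding:
    """One `with` block tying a token to one task's cancel stack."""

    def __init__(self, token, stack):
        self.token = token
        self.stack = stack
        self.cancelled_caught = False

    def __enter__(self):
        self.stack.append(self)
        return self

    def __exit__(self, exc_type, exc, tb):
        self.stack.pop()
        # Only the binding the exception was raised for may catch it.
        if isinstance(exc, Cancelled) and exc.binding is self:
            self.cancelled_caught = True
            return True
        return False


def checkpoint(stack):
    """Raise Cancelled for the outermost binding whose token is cancelled."""
    for binding in stack:
        if binding.token.cancelled:
            raise Cancelled(binding)


source = CancelSource()
calling_task_stack = []  # the task that originally entered the thread
portal_task_stack = []   # the task running the call that came back from the thread


def call_via_portal(token):
    # Re-binding the captured token creates a *new* binding on another stack...
    with Binding(token, portal_task_stack) as inner:
        source.cancel()
        checkpoint(portal_task_stack)  # raises Cancelled(inner)
    return inner


with Binding(source.token, calling_task_stack) as outer:
    inner = call_via_portal(source.token)
    # ...so the exception stopped at `inner` and never travelled back out here.
    reached_outer_code = True

results = (inner.cancelled_caught, outer.cancelled_caught, reached_outer_code)
```

In this model `cancelled_caught` lives on the binding rather than on the source, which is one way of reading the "fully decomposed" picture; the thread above leaves open whether that split is worth the extra objects.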
Other things to consider: as noted in #285, we might want to capture actual exceptions for each binding, which has the same issues as `cancelled_caught`, but even more so.
I'm not sure how shielding fits into the above picture. In the fully-decomposed picture, I think a shield would be a separate kind of thing, where you just do `with shielded(): ...`, since it's about managing the binding stack. Having a `.shield` attribute on cancel sources or cancel tokens doesn't make much sense conceptually.
Given the above, I'm having trouble thinking of cases where capturing a task's ambient context state in the form of a token is actually useful.
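One way to picture the `with shielded(): ...` idea from the shielding paragraph above is as a marker pushed onto the task's stack that hides everything beneath it. The following is a minimal single-task sketch, assuming that reading; `Scope`, `bound`, `shielded`, and `checkpoint` are hypothetical names, not Trio functions.

```python
from contextlib import contextmanager


class Cancelled(Exception):
    pass


class Scope:
    """Bare-bones cancel scope: just a flag."""

    def __init__(self):
        self.cancel_called = False

    def cancel(self):
        self.cancel_called = True


SHIELD = object()  # marker entry on the stack
stack = []         # the cancel stack of the single task in this toy


@contextmanager
def bound(scope):
    """Bind a scope to the task for the duration of the block."""
    stack.append(scope)
    try:
        yield scope
    finally:
        stack.pop()


@contextmanager
def shielded():
    """Hide every scope bound further out, without touching those scopes."""
    stack.append(SHIELD)
    try:
        yield
    finally:
        stack.pop()


def checkpoint():
    # Only scopes bound after the innermost shield marker are visible.
    visible = []
    for entry in stack:
        if entry is SHIELD:
            visible = []
        else:
            visible.append(entry)
    if any(scope.cancel_called for scope in visible):
        raise Cancelled


outer = Scope()
log = []
with bound(outer):
    outer.cancel()
    with shielded():
        checkpoint()  # does not raise: `outer` is hidden by the shield
        log.append("cleanup ran")
    try:
        checkpoint()  # shield is gone, so the cancellation is visible again
    except Cancelled:
        log.append("cancelled after shield")
```

Because the shield is a property of the stack and not of `outer`, nobody holding a reference to `outer` can toggle it from elsewhere.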
I'm not sure how useful the source/token distinction is for trio, given that the actual message delivery is via the ambient state (unlike C# where the token object is important because you have to manually examine it all the time to check if you're cancelled). And `current_effective_deadline` is sufficient for examining the current ambient state. Also, since a token's functionality is a subset of a source's functionality, we can always add tokens later without breaking anything. (So e.g. we'd still have to support `with source: ...`, but that's fine, it'd just be a shorthand for `with source.token: ...`.)
So I think the 'unbound scopes' idea captures most of the valuable parts of the "fully decomposed" design, except that I'm a little nervous about bindings – it's a little weird to have `cancelled_caught` / #285 state and shielding associated with the scope rather than with a `with` block.
If `CancelScope` became a public class with a public constructor, and we want to transition from `with open_cancel_scope() as scope: ...` to `scope = CancelScope(); with scope: ...` as being the primitive operations... then we have the opportunity to make `scope.__enter__` return whatever we want, it doesn't have to return `self`. It could return something like a binding object. Or return `None` for now, and we reserve the right to add something like a binding object later.
This would cause some disruption for `move_on_after` etc., though, since they return the actual cancel scope object, and it is fairly ordinary to call `.cancel()` on this object, as well as to check `.cancelled_caught`. I suppose if we had to we could in the future declare that there's both `CancelScope.cancelled_caught` and `CancelBinding.cancelled_caught`, and the former says something like "did any binding catch something" and the latter is more specific.
For shielding... it's a bit weird to have a shielded cancel scope you enter later, or in multiple tasks, or where your scope's shield attribute can get toggled by someone somewhere else who you wanted to let cancel you... but maybe there's no harm in allowing these things? I guess it's worth at least taking a peek at how hard it would be to split shielding off into its own thing. FWIW, currently every non-test use of shielding in trio is exactly `with trio.open_cancel_scope(shield=True): ...`. (This would also let us move shielding into hazmat!)
Possibly the shielding discussion should be split into a separate issue, too, since it's kind of orthogonal to the unbound cancel scopes idea. The `cancelled_caught` part is more closely related.
CC: @1st1, on the theory that you're probably thinking about similar issues