What's the deal with partial cancellation #889
The use case is pretty common, I think:
For many subprocesses, their output is a good indicator of what they were doing when they were killed. I agree there's nothing subprocess-specific about the "right" solution to this problem; I wanted to solve it in a subprocess-specific way mostly because I wasn't sure how to solve it generally. Here's a sketch of a potential more-general solution: we could have things like
It could also be a different object; I just figure the cancel scope is handy and available. It could also be done implicitly (without having to be requested), and ambiguities resolved using an appropriate dictionary key. But maybe this is unwarranted complexity for an unproven problem.
HMMM. I think this may be on to something! Using a cancel scope seems a bit ad hoc, like you say. What's the range of options? Let's take
At a first glance, to me the empty

I like the

On further thought, it should be more like

Do we have any examples of places where we want to use this besides

(For
@jab (Important context: Trio generally follows the rule that if an operation completed successfully, then it doesn't raise `Cancelled`.)
Thanks @njsmith! So only in the "0 bytes downloaded" case would
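Trio's rule here (an operation that completed successfully doesn't get a cancellation delivered) has a rough stdlib analogue that's easy to play with: `asyncio.wait_for` raises `TimeoutError` only when the awaited operation didn't finish in time, and any partial progress inside it is simply discarded. This is an asyncio illustration of the principle, not Trio code:

```python
import asyncio


async def download(n_chunks: int) -> bytes:
    # Stand-in for a real download; each chunk takes ~10 ms.
    chunks = []
    for _ in range(n_chunks):
        await asyncio.sleep(0.01)
        chunks.append(b"x")
    return b"".join(chunks)


async def main() -> None:
    # Completed in time: returns normally, no timeout/cancellation error.
    data = await asyncio.wait_for(download(2), timeout=5)
    assert data == b"xx"

    # Not completed in time: TimeoutError, and the chunks that *were*
    # downloaded are lost -- the "partial cancellation" problem.
    try:
        await asyncio.wait_for(download(1000), timeout=0.05)
    except asyncio.TimeoutError:
        print("timed out; partial chunks discarded")


asyncio.run(main())
```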
That's right. Cancellation on a pipe, for example:

```python
# pipes.py
import os

import trio


async def main():
    r, w = os.pipe()
    b = b"a" * (65536 * 2)
    async with trio.lowlevel.FdStream(r) as rstream:
        async with trio.lowlevel.FdStream(w) as wstream:
            with trio.move_on_after(1):
                print(await wstream.send_all(b))
            async for data in rstream:
                print(len(data))


if __name__ == "__main__":
    trio.run(main)
```

```
$ python pipes.py
65536
$
```

And another example with a subprocess:

```python
# proc.py
import subprocess
from functools import partial

import trio

# trio at 356db30e901fcde82b8fd0acdd3c109ca61e2156 (2021.7.14) or later


async def main():
    async with trio.open_nursery() as nursery:
        proc = await nursery.start(partial(
            trio.run_process,
            ["/bin/cat"],
            stdin=subprocess.PIPE,
            stdout=subprocess.PIPE,
        ))
        b = b"a" * (65536 * 16)
        async with proc.stdin:
            with trio.move_on_after(1):
                await proc.stdin.send_all(b)
        async with proc.stdout:
            async for data in proc.stdout:
                print(len(data))


if __name__ == "__main__":
    trio.run(main)
```

```
$ python proc.py
65536
65536
65536
$
```

The same thing happens on sockets:

```python
# sockets.py
from functools import partial

import trio
from trio.testing import open_stream_to_socket_listener


async def handler(event, cancel_scope, stream):
    await event.wait()
    async for data in stream:
        print(len(data))
    cancel_scope.cancel()


async def main():
    b = b"a" * (65536 * 16)
    event = trio.Event()
    async with trio.open_nursery() as nursery:
        listeners = await nursery.start(
            trio.serve_tcp,
            partial(handler, event, nursery.cancel_scope),
            0,
        )
        async with await open_stream_to_socket_listener(listeners[0]) as stream:
            with trio.move_on_after(1):
                await stream.send_all(b)
        event.set()


if __name__ == "__main__":
    trio.run(main)
```

```
$ python sockets.py
65536
32687
$
```

So, for this case:

```python
async for data in recv_stream:
    with trio.move_on_after(0.1):
        await send_stream.send_all(data)
    # update the progress bar every 0.1s
    progress_bar.update()
    # We don't know how many bytes are sent, so we can't even retry it.
```

In my opinion, a new class is needed:

```python
from abc import abstractmethod

from trio.abc import SendStream


class PartialSendStream(SendStream):
    @abstractmethod
    async def send_some(self, data: ReadableBuffer) -> int: ...

    @abstractmethod
    def send_some_nowait(self, data: ReadableBuffer) -> int: ...
```

For example:

```python
async for data in recv_stream:
    data = memoryview(data)
    remaining = len(data)
    while remaining:
        with trio.move_on_after(0.1) as cancel_scope:
            sent = await partial_send_stream.send_some(data[-remaining:])
            # consume its return value before the next checkpoint
            remaining -= sent
            await do_some_other_checkpoints()
        progress_bar.update()
```

Moreover, I think a `wait_send_some_wont_block` method could also be added:

```python
class PartialSendStream(SendStream):
    @abstractmethod
    async def wait_send_some_wont_block(self, at_least: int) -> None: ...
```

It's like

If

However, there is a problem of how to name the "partial-sendable" version of

By the way, a
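For what it's worth, the proposed `send_some` semantics already exist one layer down the stack: a non-blocking `socket.send` returns how many bytes the kernel actually accepted, and `select` gives you a `wait_send_some_wont_block`-style readiness check (though without the `at_least` refinement). Here's a synchronous stdlib sketch of the `remaining`/`memoryview` bookkeeping from the loop above; buffer sizes are platform-dependent, so this only illustrates the accounting, not Trio's API:

```python
import select
import socket


def send_some(sock: socket.socket, data: memoryview) -> int:
    """Send whatever fits right now; return the number of bytes accepted."""
    try:
        return sock.send(data)
    except BlockingIOError:
        return 0


def send_all_with_accounting(sock: socket.socket, drain: socket.socket,
                             data: bytes) -> int:
    """Send `data`, tracking exactly how much has gone out at every step."""
    view = memoryview(data)
    remaining = len(view)
    while remaining:
        # wait_send_some_wont_block analogue: block until writable.
        select.select([], [sock], [])
        sent = send_some(sock, view[-remaining:])
        # Consume the return value immediately, so a later interruption
        # can't lose track of what was already sent.
        remaining -= sent
        # Drain the other end so the kernel buffer frees up again
        # (in the real pattern a peer would be doing this concurrently).
        while True:
            try:
                if not drain.recv(65536):
                    break
            except BlockingIOError:
                break
    return len(data) - remaining


a, b = socket.socketpair()
a.setblocking(False)
b.setblocking(False)
payload = b"a" * (65536 * 4)
n = send_all_with_accounting(a, b, payload)
print(n)
```

Because each `send` reports its own byte count, cancellation between iterations can only ever lose the *decision to continue*, never the record of what was sent.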
Recently we were discussing what happens if a subprocess call gets cancelled, but you want to at least find out what the subprocess said before it got killed. And @oremanj wrote:
It's a fair point! The "cancelled operation didn't happen" thing was only ever supposed to apply to low-level, primitive operations. In that context, it's a pretty important rule, because without it you can't ever hope to build anything sensible on top. But it's never made any sense for higher-level operations (i.e., the ones that working programmers are actually interacting with 99.99% of the time). Of course, at the time the initial docs were being written, I was struggling to figure out how to get the primitive operations to work at all and there were no higher-level operations. So that rule probably gets more prominence than it should :-). But things have changed and we should have a better story here.
Recently in a discussion of how to talk about cancellation in the docs, @smurfix wrote:
So that's one idea for how Trio could provide concrete advice to users about how to work with partial cancellation.
I don't have any organized thoughts here, so I'm just going to dump a bunch of unorganized ones.
There were two concrete proposals that @oremanj made in the subprocess discussion (unless there were more and I'm forgetting some :-)):
- Add `timeout` and `deadline` arguments to `trio.run_process`. These would have a similar effect to wrapping a cancel scope around `run_process`, except that if the timeout expires, then `run_process` wouldn't raise `Cancelled`, it would raise `CalledProcessError`, which would be a special exception with attributes recording whatever partial output, return code, etc., we got from the process.

  The downside of this is that it's extremely specific to subprocesses, which feels weird. The problem is really "what do you do if an operation times out and you want partial results?" – I actually have no idea what makes subprocesses special here, as compared to, I don't know, calling some docker API or something. So a solution that's specific to subprocesses doesn't feel natural. OTOH it would work, and maybe there's some reason that people need partial results from subprocesses a lot, and don't in other cases, so something simple and specific is fine.
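For comparison, the synchronous stdlib already works roughly this way: `subprocess.run(..., timeout=...)` raises `subprocess.TimeoutExpired`, and, depending on Python version and platform, that exception carries whatever partial output was captured before the child was killed. A quick demonstration (the child prints something, then outlives its timeout):

```python
import subprocess
import sys

# A child process that produces some output, then sleeps past the timeout.
child = [sys.executable, "-u", "-c",
         "print('partial result'); import time; time.sleep(30)"]

try:
    subprocess.run(child, capture_output=True, timeout=1)
except subprocess.TimeoutExpired as exc:
    # The exception records the command and timeout; on recent CPythons
    # exc.stdout also holds the partial output captured before the kill.
    print("timed out:", exc.cmd)
    print("partial stdout:", exc.stdout)
```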
- Give `run_process` a special (optional) semantics, where if, while running, it saw a `Cancelled` exception materialize, it would automatically replace it with `CalledProcessError`.

  This is a really intriguing idea, but makes me uncomfortable because we have no idea where that `Cancelled` is coming from – in particular, we don't know whether the code that was going to process the partial results is also cancelled, or not.

I don't actually know why @oremanj is so eager to get at partial results in this case; I gather he has some use case where he needs this feature, but I don't know what it is.
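The shape of that second proposal can be sketched outside Trio, too. Below is a hypothetical asyncio version (the `PartialOutput` name is made up for illustration): the worker intercepts its own cancellation and substitutes an exception carrying the partial results. It also shows exactly the discomfort described above: once the substitution happens, the caller has no way to tell whether it, too, was supposed to be cancelled.

```python
import asyncio


class PartialOutput(Exception):
    """Hypothetical replacement for the cancellation, carrying partial results."""

    def __init__(self, partial):
        super().__init__(f"cancelled after {len(partial)} chunks")
        self.partial = partial


async def worker():
    chunks = []
    try:
        for i in range(1000):
            await asyncio.sleep(0.01)
            chunks.append(i)
    except asyncio.CancelledError:
        # Swallow the cancellation and surface the partial results instead.
        raise PartialOutput(chunks) from None


async def main():
    task = asyncio.create_task(worker())
    await asyncio.sleep(0.05)
    task.cancel()
    try:
        await task
    except PartialOutput as exc:
        # We got the partial results -- but if *this* scope were also being
        # cancelled, this processing code might never get to run.
        return exc.partial


partial = asyncio.run(main())
print(len(partial), "chunks completed before cancellation")
```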
Another notorious example where cancellation loses information in an important way is `Stream.send_all`. Right now, if `send_all` gets cancelled, you effectively have to throw away that stream and give up, because you have no idea what data you have or haven't sent.

It wasn't always like this: originally, if `send_all` was cancelled, there was a hack where we'd attach an attribute to the `Cancelled` exception recording how many bytes we'd sent, and a sufficiently clever caller could potentially use that to reconstruct the state of the stream.

Then I added `SSLStream` and it quickly became clear that this design was no good. There are two major issues:

- Exceptions may start out in some nice well-defined operation like `SocketStream.send_all`, but they propagate. That's what exceptions do! Right across abstraction boundaries. So, for example, if you called `SSLStream.send_all`, and it called `SocketStream.send_all`, then if you weren't careful you could get an exception out of `SSLStream.send_all` that has metadata attached saying how many bytes `SocketStream.send_all` sent, which is catastrophically misleading.

- `SSLStream` actually has some pretty complicated internal state, because, well, you know. Cryptography. In particular, cancellation is very different: with something like `SocketStream`, if `send_all` is cancelled in the middle, that's pretty simple: you sent the first N bytes, but not the rest. With `SSLStream`, though, `send_all` immediately commits to sending all the bytes, before it sends any of them. So if it gets cancelled, then we're in this weird state where it's sent some of the bytes, it's committed to sending the rest of the bytes, but it hasn't yet. Oh, and we don't even know how many user-level bytes have actually been transmitted in a way that the other side can read them. (Like, we might know we sent 500 bytes on the underlying socket, but maybe 100 of those are protocol framing, and then the last 50 are actual application data, but it's application data that the other side can't decrypt until we send another 50 bytes to complete that frame... it's really messy.) There just is no useful way to communicate the state of an `SSLStream` after `send_all` is cancelled, no matter what metadata we attach to what exceptions.

So, instead, we've been going ahead with the rule that once a `send_all` is cancelled, your stream is doomed. We haven't done anything to detect this and e.g. raise an error if you try calling `send_all` again after a cancelled `send_all`, like in @smurfix's suggestion... maybe we should?

And then as a consequence, for downstream users, like trio-websocket, what we've been converging on is basically the rule that only one task should "own" a `Stream` for sending at a time – if you want a stream to survive sending from multiple tasks, then you create a background task that handles the `send_all` calls, and the other tasks send stuff to that task over some kind of channel. As @mehaase recently pointed out in #328 (comment), we might want to start documenting this more thoroughly? (#328 is generally relevant to these issues – it's ostensibly about `send_all` and locking, but really it's about sharing a stream between multiple tasks, and cancellation turns out to be a major consideration there.)

This does seem to be working out pretty well. So I guess the moral is that at least in this area, "partial results" just aren't an important case to think about. All the cases we care about are either "leaves the state inconsistent" or "atomic", and you can build the latter on top of the former (!) by using a background task + a channel, b/c the channel's `send` operation is atomic.

Some of this comment also feels relevant, especially the bit about "what does cancellation mean" near the end: #147 (comment)
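The "one owning task plus a channel" pattern translates directly to other frameworks as well. Here's a minimal asyncio sketch, using `asyncio.Queue` as the channel and a plain list as a stand-in for the stream: producers never touch the stream, so interrupting a producer can only lose a whole message, never half of one.

```python
import asyncio


async def writer(queue: asyncio.Queue, wire: list) -> None:
    # The single "owner" of the stream: only this task ever sends.
    while True:
        msg = await queue.get()
        if msg is None:  # sentinel: shut down
            return
        wire.append(msg)  # stand-in for `await stream.send_all(msg)`


async def producer(queue: asyncio.Queue, name: str, count: int) -> None:
    for i in range(count):
        # Enqueueing is atomic: a message is handed over whole or not at
        # all, so cancelling a producer can't leave half a message on the
        # wire the way a cancelled send_all can.
        await queue.put(f"{name}:{i}")


async def main() -> list:
    wire: list = []
    queue: asyncio.Queue = asyncio.Queue()
    writer_task = asyncio.create_task(writer(queue, wire))
    await asyncio.gather(
        producer(queue, "a", 3),
        producer(queue, "b", 3),
    )
    await queue.put(None)  # tell the writer to finish
    await writer_task
    return wire


wire = asyncio.run(main())
print(sorted(wire))
```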