[discussion] Mechanism for "sharing" tasks between different branches of the task tree #266
Comments
Something this API will want to think about: how to handle TaskLocal inheritance. Probably the rule is just that these tasks don't inherit locals. See also #289.
See python-triogh-266. This is surprisingly interesting and tricky.
Would use of this feature (at least for use case 1) imply that the return value is cached for some (configurable) period of time? It would seem a bit arbitrary to me if a call takes exactly 1 second, and if a second call comes 0.99 seconds later, it would return the value for the first call, whereas if it came 1.01 seconds later, it would trigger a new call. If 1.01 seconds is long enough to want a new call, it seems like you'd want the one 0.99 seconds later to trigger a new call as well. Basically, I'm suggesting that the "staleness" threshold be provided by the user rather than using however long the first call takes, which could vary.
@cjerdonek Whoops, sorry I missed this comment! My assumption was that if you also want some traditional caching, that's something that's pretty easy to implement given a basic coalesced-call primitive? And there are tons of possible caching strategies you might want (fixed size with some replacement policy like LRU or ARC, fixed time like in your comment, ...), so better if our primitive doesn't bake it in? Really I'm not sure though -- I don't know how people will actually use this if we make it :-).
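To make that concrete, here's a minimal sketch of the kind of fixed-time caching layer @cjerdonek describes, kept separate from the coalescing itself (all names are made up, a single cache key is assumed, and a plain `trio.Lock` stands in for whatever coalescing primitive we end up with):

```python
import time

import trio


class TimedCache:
    """Fixed-time cache on top of call coalescing: concurrent callers share
    one in-flight call, and its result is reused for `ttl` seconds."""

    def __init__(self, async_fn, ttl):
        self._async_fn = async_fn
        self._ttl = ttl
        self._lock = trio.Lock()  # serializes callers, so they coalesce
        self._value = None
        self._expires_at = -float("inf")

    async def get(self, *args):
        async with self._lock:
            # Tasks that were blocked on the lock while a call was running
            # see the freshly stored value here and return it directly.
            if time.monotonic() >= self._expires_at:
                self._value = await self._async_fn(*args)
                self._expires_at = time.monotonic() + self._ttl
            return self._value
```

(Note this naive version abandons the in-flight call if the task holding the lock is cancelled, and the next waiter simply starts over -- which is the cancellation hand-off problem the rest of this thread is about.)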
In the simplest case, nurseries are a way to make tasks lexically scoped. If you pass nursery objects around (the "escape hatch"), then this gives you arena-scoped tasks (where the arena itself has to be lexically scoped). What we're talking about in this issue is reference-counted tasks (where the references have to be lexically scoped). So basically, all the standard ways to deterministically manage object lifetime also apply to tasks, with the extra proviso that they need to be stack-rooted. Are there any other deterministic heap management strategies I'm forgetting about? Is there any value in having "strong" vs "weak" task sharing?
Taking a completely different perspective on this issue, there is a library doing something similar for sync IO, dogpile.cache:
The concept of this Lock (even though I think it's kind of a misnomer) might be interesting here, as it provides some abstraction of the sharing of tasks without being biased towards any caching it would be used for. The Lock takes a creator function returning one specific value (with possible arguments already applied). A CacheRegion then combines methods for wrapping function calls, generating keys from arguments, managing the per-key Locks and controlling the execution of tasks, together with interchangeable cache backends.

It might be helpful to take a more detailed look at its architecture and design, as it provides a very good abstraction of the problem in the presence of a multitude of different caching paradigms. The only thing I'm missing in dogpile.cache (in addition to async support) is the possibility to customize the behaviour when a second execution comes in while a first one is already running: to either return the old value, wait for the current execution, raise a marker exception, etc.
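For reference, typical (synchronous) dogpile.cache usage looks roughly like this -- `load_user` and `query_database` are made-up placeholders:

```python
from dogpile.cache import make_region

# In-memory backend; real deployments would point this at memcached/redis etc.
region = make_region().configure(
    "dogpile.cache.memory",
    expiration_time=60,  # seconds before a cached value is considered stale
)


def query_database(user_id):
    # placeholder for the actual expensive lookup
    return {"id": user_id}


@region.cache_on_arguments()
def load_user(user_id):
    # Expensive, idempotent operation. When the value for a given user_id is
    # missing or expired, dogpile's per-key lock lets only one caller run
    # this; other callers wait for it (or are served the stale value).
    return query_database(user_id)
```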
@N-Coder oh awesome, thanks for the link – looking at prior art is super helpful for this kind of discussion
Another real use case that came up in chat today: an async version of
I made my own attempt at implementing this functionality as generic as possible and ended up with the following snippet:

```python
from typing import Callable

import attr
import trio
from trio.hazmat import Result  # in current trio this lives in the separate `outcome` package


@attr.s()
class CacheValue(object):
    value = attr.ib()  # type: Result
    creator = attr.ib()  # type: Callable
    create_lock = attr.ib(factory=trio.Lock)  # type: trio.Lock

    def is_set(self):
        return self.value is not None

    def is_creating(self):
        return self.create_lock.locked()

    async def get_or_create(self):
        if not self.is_set():
            async with self.create_lock:
                # re-check: another task may have created the value meanwhile
                if not self.is_set():
                    # acapture runs creator() and captures its outcome
                    self.value = await Result.acapture(self.creator)
        return self.value.unwrap()

    async def reset(self):
        async with self.create_lock:
            self.value = None
```

Basically, you need some variant of this kind of functionality.
With all the different use-cases and requirements that may arise, I'm unsure whether a really generic implementation in the trio library would be that easy to make. Maybe the answer is some examples for basic functionality (like my snippet above, with some more thoughts and explanations put into cancellation) plus libraries for more targeted use-cases (like aiocache, which actually provides a synchronized cache decorator). For my use-case, I'll probably go with some combination of pyfailsafe and aiocache.
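For illustration, here's a hypothetical way the `CacheValue` snippet above could be driven from several trio tasks (`fetch_config` is a made-up stand-in for an expensive call):

```python
import trio


async def fetch_config():
    await trio.sleep(1)  # stand-in for an expensive network call
    return {"answer": 42}


async def main():
    cache = CacheValue(value=None, creator=fetch_config)
    async with trio.open_nursery() as nursery:
        # All three tasks coalesce onto a single fetch_config() call; the
        # later ones block on create_lock and then reuse the stored Result.
        for _ in range(3):
            nursery.start_soon(cache.get_or_create)


trio.run(main)
```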
A note on exception handling: #303 (comment)

Further note: injected cancellation exceptions will need some careful handling – we don't want to convert them into ordinary errors that get delivered to the other callers.

The core thing that makes this API tricky is that it's fundamentally a reference-counted API, so we need some way to be clear about where we acquire and release references. And when you release a reference, that might need to block waiting for the task to be cancelled.

So any implementation needs some way to atomically: find any other instance of the task that we want to join up with, and if one exists then join it, or otherwise create one and register it for others to find. And we need to atomically unregister it when the last task cancels out. So by far the most natural way to do this is to have a single object that holds a dict internally, and combines the (dict lookup, dict mutation, actually running the task) steps into single operations. Which is what #303 does.

But maybe it also needs a way to query whether a task is currently running under a given key and then... what? For dogpile it's either launch a task or not.
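A minimal sketch of that dict-based shape -- one object owning the registry, with lookup, registration, and running the call fused so there's no checkpoint in between (names are made up, and it deliberately ignores the cancellation hand-off problem, which is the genuinely hard part):

```python
from dataclasses import dataclass, field
from typing import Any

import trio


@dataclass
class _InFlight:
    done: trio.Event = field(default_factory=trio.Event)
    result: Any = None
    error: BaseException = None


class CallCoalescer:
    def __init__(self):
        self._in_flight = {}  # key -> _InFlight

    async def run(self, key, async_fn, *args):
        entry = self._in_flight.get(key)
        if entry is not None:
            # A call under this key is already running: just wait for it.
            await entry.done.wait()
            if entry.error is not None:
                raise entry.error
            return entry.result
        # No checkpoint between the lookup above and this registration, so
        # lookup + registration are effectively atomic under trio's scheduler.
        entry = self._in_flight[key] = _InFlight()
        try:
            entry.result = await async_fn(*args)
            return entry.result
        except BaseException as exc:
            # NOTE: this also captures Cancelled, i.e. the "injected
            # cancellation" case above; a real implementation would hand the
            # call off to another waiter instead of failing them all.
            entry.error = exc
            raise
        finally:
            del self._in_flight[key]  # unregister so a later call starts fresh
            entry.done.set()
```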
Use case 1: there's an expensive idempotent operation that multiple tasks might want to call. Examples within trio itself would include `getaddrinfo` for a particular host, or `WaitForSingleObject` on a single object (as needed for e.g. the windows version of `Popen.wait`). If two tasks make the same call at the same time, then we'd like to coalesce those into a single underlying call and then report the result back to both callers.

Use case 2: there are multiple user-level objects that underneath are multiplexed onto a single object. (For example: multiple HTTP/2 channels on top of a single TCP connection.) Now you need the invariant: whenever at least one task is blocked in `await channel.receive_some()`, there needs to be a task reading from the underlying TCP connection -- but there should never be more than one task receiving from the underlying TCP connection. In some cases the right solution to this is probably to create one background task to handle receiving from the TCP connection. For example, in HTTP/2 it's expected that you can respond to pings at any time, even if all the logical channels are paused, so HTTP/2 isn't really a good example here -- the TCP receiver needs to be running constantly for as long as the connection is open. But in other protocols you might want to only receive on the TCP channel when there are tasks waiting for data. (IIRC something like this shows up in ZeroMQ's API, or the SSH protocol has logical channels but I don't think it has any pings.) In this case it's possible in principle to have the first task that arrives in `channel.receive_some` initiate a `tcp_transport.receive_some`, and later tasks block waiting for it, and then if the first task gets cancelled then it hands off the job of calling `tcp_transport.receive_some` to someone else... but this is super complicated. It would be nice if there were some standard way for all the tasks in `channel.receive_some` to simply "share" a single call to `tcp_transport.receive_some` (or more realistically: `protocol_object.pump_underlying_transport`).

The interesting thing about these is that they don't fit well into trio's nursery system, BUT they're still possible to do without violating the invariants that the nursery system was created to enforce -- in particular, if the shared call crashes, then we have somewhere to report that. (Though there might be a minor issue with throwing the same exception into multiple receivers and still getting a sensible traceback -- maybe `Error.unwrap` should `copy.copy` the exception object before raising it?)

I was originally thinking about something like allowing tasks to be in a special "coparented" state, where multiple nurseries supervise them at the same time. But this gets complicated quickly: you need special cases in the nursery code, and there are issues like... if this is a background task providing a service to the code in this nursery (and also other nurseries), then do we need to make sure that it stays alive as long as the non-shared tasks? So should we shield it from cancellation until after all the other tasks exit? That seems tough.
I think a better idea is: don't model it as a task, just model it as a call (and it might be combined with other identical calls). This is exactly what you want for "use case 1" anyway, not a background task. For "use case 2", there's some subtlety because when the protocol pump receives some data on channel 7, it probably shouldn't exit immediately and wake everyone up; instead it just wants to wake up the task that was waiting for data on channel 7. But this can be done by having that task spawn one task to run the pump, and a second task to wait on a Queue or whatever, and then when the second task gets woken by the pump it cancels the first task, which either leaves the shared task to keep pumping on behalf of other tasks, or else if this was the only task waiting on a channel then it cancels the pump task. (So obviously the pump task needs to be cancel-safe, but that's probably needed anyway.)
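A rough sketch of that two-task pattern for use case 2 (every name here -- `channel`, `incoming_queue`, `pump_underlying_transport` -- is hypothetical, and for brevity the "second task" is just the nursery body itself):

```python
import trio


async def receive_on_channel(channel):
    async with trio.open_nursery() as nursery:
        # Child task: run the shared protocol pump that reads the transport
        # and dispatches incoming data to per-channel queues.
        nursery.start_soon(channel.protocol.pump_underlying_transport)
        # Meanwhile, wait for data destined for *this* channel.
        data = await channel.incoming_queue.get()
        # We got what we wanted; cancel the pump. With a proper "shared call"
        # primitive, this would instead just drop our reference and leave the
        # pump running for any other tasks still waiting on their channels.
        nursery.cancel_scope.cancel()
    return data
```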
Maybe: something like a `CoalescableCall` object wrapping an async function, where any number of tasks can `await call.run()` and all share a single underlying call (rough sketch below).

Attributes: `call.started`, `call.finished`, `call.cancelled` (I guess we want to not hold onto the result, and make `run` an error if `finished` or `cancelled`?). And we punt to the user to figure out how they want to pass these things between tasks, what key to use to look it up, etc.

`CoalescableCall` is a terrible name. `CombinedRun`?

Bonus: doing it this way should actually be pretty straightforward using `spawn_system_task`. (It's sort of like `run_in_trio_thread`.)

Prior art / inspiration: Flatt & Findler (2004), Kill-safe synchronization abstractions. They're worried about cases like a queue that has a worker task attached to it, and then is accessed from multiple tasks; if a task that's doing `queue.put` gets cancelled, you don't want this to kill the worker thread or allow the queue object to be gc'ed, because it might leave the object in an inconsistent state. OTOH if both the `queue.put` and the `queue.get` tasks get cancelled, then it's ok (in fact mandatory) to kill the worker thread and gc the queue object. So they define a kind of co-parenting primitive.
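To make the proposal above more tangible, here's a rough sketch of what a `CoalescableCall` (or `CombinedRun`) surface might look like -- this is a guess at the shape, not a settled design, and it punts entirely on the reference-counting / cancellation semantics that make the real thing hard:

```python
import trio


class CoalescableCall:
    """Many tasks can `await call.run()`; only one underlying invocation of
    async_fn happens, and every concurrent caller gets its outcome."""

    def __init__(self, async_fn, *args):
        self._async_fn = async_fn
        self._args = args
        self._done = trio.Event()
        self._result = self._error = None
        self.started = False
        self.finished = False
        self.cancelled = False  # a real version would set this when the shared call is cancelled

    async def run(self):
        if self.finished or self.cancelled:
            # Per the note above: late arrivals get an error, not a cached result.
            raise RuntimeError("this call has already completed")
        if not self.started:
            self.started = True
            try:
                self._result = await self._async_fn(*self._args)
                return self._result
            except BaseException as exc:
                self._error = exc
                raise
            finally:
                self.finished = True
                self._done.set()
        # Someone else is already running the call: wait and share its outcome.
        await self._done.wait()
        if self._error is not None:
            raise self._error
        return self._result
```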