Hooks for user input? #219

Open
dexhorthy opened this issue Jul 9, 2024 · 2 comments
Labels
enhancement Enhance an existing feature

Comments

@dexhorthy
Collaborator

dexhorthy commented Jul 9, 2024

Enhancement Description

I'm looking to understand the best way to bring the user into a conversation where the interaction doesn't happen through a CLI or CLI tool. I can think of a few workaround-ish ideas that might work:

  1. use prefect workflow pause/resume to go get the user input and resume with their input
  2. add a tool call like tell_user or ask_user_for_clarification that handles the IO via a websocket or something
  3. set user_input=True on a task, capture/forward stdin/stdout 😬
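As a rough illustration of idea 2, here is a minimal sketch of a transport-agnostic "ask the user" tool built on asyncio. Every name in it (`UserInputBroker`, `deliver_reply`, the `send` callable) is hypothetical and not part of any library; the assumption is that a websocket handler (or similar) calls `deliver_reply` when the user answers.

```python
import asyncio
import uuid


class UserInputBroker:
    """Pairs each outbound question with the reply that eventually arrives."""

    def __init__(self, send):
        # `send` is any async callable that delivers (question_id, text) to the
        # user, e.g. a websocket write. Injected so the tool stays transport-agnostic.
        self._send = send
        self._pending: dict[str, asyncio.Future] = {}

    async def ask_user_for_clarification(self, question: str, timeout: float = 60.0) -> str:
        """The coroutine you would register as an agent tool (hypothetical name)."""
        question_id = uuid.uuid4().hex
        future = asyncio.get_running_loop().create_future()
        self._pending[question_id] = future
        try:
            await self._send(question_id, question)
            # Suspend this tool call until a reply arrives or the timeout hits.
            return await asyncio.wait_for(future, timeout)
        finally:
            self._pending.pop(question_id, None)

    def deliver_reply(self, question_id: str, text: str) -> bool:
        """Called by the inbound handler; returns False for unknown or expired ids."""
        future = self._pending.get(question_id)
        if future is None or future.done():
            return False
        future.set_result(text)
        return True


async def demo() -> str:
    outbox = []

    async def fake_send(question_id, text):
        # Stand-in for a websocket write.
        outbox.append((question_id, text))

    broker = UserInputBroker(fake_send)
    ask = asyncio.create_task(broker.ask_user_for_clarification("Drop table staging.events?"))
    await asyncio.sleep(0)  # let the tool send its question
    question_id, _ = outbox[0]
    broker.deliver_reply(question_id, "yes, go ahead")
    return await ask


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

The question id is what lets several conversations share one broker: each reply resolves only the future that asked for it.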

Use Case

Building autonomous agents for data engineering and data product management -

The interaction paradigms I'd like to be able to support include web-app chat via websockets or other async layer, in addition to more "outer loop" type channels like email, slack, sms, etc. For example, an agent might discover something and want to alert a user, or might complete a long-running task for a user and want to get the user's input.

These sorts of workflows fit nicely into a D(A)G-y sort of state machine that prefect enables, but I'm trying to wrap my head around the best way to fit together these sorts of async and multi-player workflows, or even just some workarounds / patterns that have worked well for applications that have access to LLMs

Proposed Implementation

# open to brainstorming but nothing off the top of my head
@dexhorthy dexhorthy added the enhancement Enhance an existing feature label Jul 9, 2024
@aaazzam
Collaborator

aaazzam commented Jul 10, 2024

Hey @dexhorthy.

A resounding :hell_yeah: from us over here. You've nailed the two blessed ways at the moment: pause/resume + make a tool. The former is nice because it lets you sleep agents until they're ready, but the latter is more ergonomic IMO and lets you lean into writing cleaner prompts.

Would love to find a way to make these two easier, where you can sleep an agent (sleeper agent!??!?) workflow until its tool result comes back in.

@dexhorthy
Collaborator Author

dexhorthy commented Jul 24, 2024

one additional bit of detail here as we're thinking through this - pause/suspend might work okay if you get a response in under an hour (after which the timeout hits), but the natural way I had implemented this was

flow -> AI task -> tool "get_confirmation_in_slack" -> slack -> set Variable mapping slack_msg_id to flow_run_id, then pause flow with wait_for_input

slack webhook -> fastapi -> lookup flow_run_id with inbound slack_msg_id of the slack thread parent, send_input to the paused flow
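The correlation step in those two lines can be sketched without any framework. This is only an illustration under assumptions: the mapping lives in an in-memory dict rather than a Prefect Variable, the "slack_msg_id" is taken to be the parent message's `ts` (which Slack echoes back as `thread_ts` on replies), and `resume` is a placeholder for whatever call delivers the input to the paused flow. `PendingConfirmations` and `handle_slack_event` are hypothetical names.

```python
from typing import Callable, Optional


class PendingConfirmations:
    """Maps the Slack message that asked a question to the flow run waiting on it."""

    def __init__(self) -> None:
        # In-memory for illustration only; a real deployment needs shared storage.
        self._flow_run_by_msg: dict[str, str] = {}

    def register(self, slack_msg_id: str, flow_run_id: str) -> None:
        self._flow_run_by_msg[slack_msg_id] = flow_run_id

    def claim(self, slack_msg_id: str) -> Optional[str]:
        # pop() so a second reply in the same thread cannot resume the run twice.
        return self._flow_run_by_msg.pop(slack_msg_id, None)


def handle_slack_event(
    event: dict,
    pending: PendingConfirmations,
    resume: Callable[[str, str], None],
) -> bool:
    """Webhook-side logic: route a thread reply to the flow run that asked.

    Returns True if a waiting flow run was found and resumed.
    """
    parent_id = event.get("thread_ts")
    # A message with no thread_ts, or one equal to its own ts, is not a reply.
    if parent_id is None or parent_id == event.get("ts"):
        return False
    flow_run_id = pending.claim(parent_id)
    if flow_run_id is None:
        return False
    resume(flow_run_id, event.get("text", ""))
    return True
```

Inside a FastAPI route, the handler body would reduce to parsing the payload and calling `handle_slack_event` with the real resume call passed in.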

the issue there is that you end up with an error because you can't pause a flow if there's a TaskRunContext

2024-07-24 13:08:12,370 - prefect.task_runs - ERROR - Finished in state Failed('Task run encountered an exception RuntimeError: Cannot pause task runs.')

I am currently trying to rework this a bit

flow -> AI task -> slack, set msg_id
flow -> pause

slack webhook -> fastapi -> lookup flow_run_id as before, resume the paused flow

but the magic of "use the llm to do the reasoning" is a little lost in this case. can't really explain why I want this, maybe the "tool method" is just as you said, "more ergonomic". It eliminates a lot of cognitive overhead in the implementation to tell an LLM "here's a tool that you can use to ask for confirmation/input from a backend user" where the tool itself returns the response from the user, and the LLM can even make decisions about when to ask for approval vs. skip that step based on completeness of context, etc.

Among other things, doing the pause/resume at the flow level, outside the LLM, bubbles implementation details up from inside the tool all the way to the flow. Before, one or more "confirmation" implementations could be neatly bundled up in a modular tool and glued arbitrarily to the webhook receiver endpoint (e.g. before exploring flow pause/resume, we did this with an in-memory Queue that the flow, tasks, and LLM knew nothing about)
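For readers who haven't seen the in-memory Queue pattern, a minimal sketch of it might look like the following. This is not the code from the project described above; `ConfirmationChannel`, `get_confirmation`, and `on_webhook` are hypothetical names, and the single shared queue assumes only one question is outstanding at a time.

```python
import queue
import threading


class ConfirmationChannel:
    """Bundles 'ask' and 'receive' so the flow and the LLM never see the plumbing."""

    def __init__(self, notify):
        # `notify` is any callable that posts the question to the user
        # (Slack, email, ...). Hypothetical; supplied by the application.
        self._notify = notify
        self._replies: "queue.Queue[str]" = queue.Queue()

    def get_confirmation(self, question: str, timeout: float = 300.0) -> str:
        """Blocking tool body: returns the user's reply as the tool result."""
        self._notify(question)
        try:
            return self._replies.get(timeout=timeout)
        except queue.Empty:
            # Return text rather than raising, so the LLM can decide what to do next.
            return "No response received before the timeout."

    def on_webhook(self, text: str) -> None:
        """Called by the webhook receiver endpoint when the user replies."""
        self._replies.put(text)


if __name__ == "__main__":
    channel = ConfirmationChannel(
        # Simulate a user replying shortly after the question is posted.
        notify=lambda q: threading.Timer(0.05, channel.on_webhook, args=("approved",)).start()
    )
    print(channel.get_confirmation("Ship the backfill?", timeout=2.0))
```

The trade-off is the one discussed in this thread: the tool stays modular, but the process has to stay alive and blocked while it waits, which is what pause/resume avoids.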
