Add a way for agents to monitor and optionally interrupt long-running commands #348

bjsi · 2024-12-18T07:47:31Z

Some GitHub issues involve debugging commands that hang. When writing code to reproduce and fix these kinds of issues, an autonomous agent needs a way to monitor the command and interrupt it if necessary.
After the agent runs a shell command, gptme should add a message to the conversation log every 10 seconds or so, giving some context about what the command is doing and whether it is still running, and asking the agent whether it wants to kill it.

bjsi · 2024-12-18T14:22:44Z

From Jake:

Current Understanding

Issue 348 requests a way for agents to monitor and optionally interrupt long-running commands.
The execute_shell function in gptme/tools/shell.py is the main entry point for executing shell commands.
- It uses get_shell_command to construct the command string.
- The actual execution is done by execute_shell_impl.
The execute_shell_impl function:
- Uses a ShellSession object to run the command.
- Captures stdout and stderr.
- Formats the output into a message.
- Yields a system message with the command output.
The ShellSession class manages the shell environment:
- It uses subprocess.Popen to create a shell process.
- The run method executes commands in this shell.
The LogManager class in gptme/logmanager.py is responsible for managing the conversation log.
- The append method is used to add new messages to the log.
There's currently no implementation for monitoring long-running commands or interrupting them.
The issue suggests adding messages to the conversation log every 10 seconds for long-running commands.

Questions to Investigate

How can we modify execute_shell_impl to monitor the duration of running commands?
What's the best way to implement a timer for periodic updates (every 10 seconds) during command execution?
How can we add these periodic status updates to the conversation log using LogManager?
What information should be included in the periodic status updates (e.g., execution time, current status)?
How can we implement an interruption mechanism for long-running commands?
Where in the code should we add the logic to prompt the user about potentially killing a long-running command?
How can we modify the ShellSession.run method to support monitoring and potential interruption?
What's the best way to handle the interruption if the user decides to kill the command?
How can we ensure that the monitoring and interruption features don't significantly impact performance for short-running commands?
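
To make a few of these questions concrete, here is a minimal sketch of the polling idea. It uses only the standard library and does not touch gptme's actual ShellSession or LogManager. The name run_with_monitoring and the should_kill callback are hypothetical; in a real integration the yielded strings would become system messages in the log, and the agent's reply would decide whether to kill.

```python
import subprocess
import sys
import time
from typing import Callable, Iterator


def run_with_monitoring(
    cmd: list[str],
    interval: float = 10.0,
    should_kill: Callable[[float], bool] = lambda elapsed: False,
) -> Iterator[str]:
    """Run cmd, yielding a status message every `interval` seconds.

    Hypothetical helper: `should_kill` stands in for asking the agent.
    """
    start = time.monotonic()
    proc = subprocess.Popen(
        cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True
    )
    while True:
        try:
            # communicate() returns as soon as the process exits, so
            # short commands never wait for a full interval.
            output, _ = proc.communicate(timeout=interval)
            break
        except subprocess.TimeoutExpired:
            elapsed = time.monotonic() - start
            yield f"Still running after {elapsed:.0f}s (pid {proc.pid}). Kill it?"
            if should_kill(elapsed):
                proc.terminate()  # polite request first
                try:
                    output, _ = proc.communicate(timeout=5)
                except subprocess.TimeoutExpired:
                    proc.kill()  # force it if terminate was ignored
                    output, _ = proc.communicate()
                elapsed = time.monotonic() - start
                yield f"Killed after {elapsed:.0f}s. Output so far:\n{output}"
                return
    yield f"Exited with code {proc.returncode}:\n{output}"


if __name__ == "__main__":
    # Demo: a command that would hang for 30s is killed at the first check.
    hanging = [sys.executable, "-c", "import time; time.sleep(30)"]
    for message in run_with_monitoring(
        hanging, interval=0.5, should_kill=lambda elapsed: True
    ):
        print(message)
```

Because the loop blocks in communicate() with a timeout rather than sleeping, a command that finishes quickly produces a single message with no added delay, which speaks to the performance question above.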

ErikBjare · 2024-12-18T15:29:38Z

You can sort of do this already by instructing (or forcing) the agent to use the tmux tool instead, and then "undo" when the command isn't ready yet (as in the https://github.com/ErikBjare/gptme/blob/master/scripts/treeofthoughts.py script), since otherwise it'll eat tokens like crazy.

ErikBjare · 2024-12-18T15:29:58Z

@gptme read some relevant source files and tell us what you think

github-actions · 2024-12-18T15:31:07Z

Based on the analysis of the gptme/tools/shell.py and gptme/tools/tmux.py files, I recommend using the tmux tool as the primary approach for implementing monitoring and optional interruption of long-running commands. Here's a summary of the proposed solution:

Leverage the existing tmux tool functionality:
- new_session: Start long-running commands in isolated tmux sessions.
- inspect_pane: Periodically check the status and output of commands.
- send_keys: Send interruption signals if needed.
- kill_session: Forcefully terminate sessions if necessary.
Implement a new monitor_session function in the tmux tool:
- Periodically inspect the pane content of a given session.
- Yield status updates to the conversation log every 10 seconds.
- Check for command completion or the need for interruption.
- Allow user-triggered interruption for long-running commands.
Modify the execute_tmux function to handle the new monitor_session command.
Update the tmux tool's instructions and examples to include the new monitoring functionality.
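
A rough sketch of what the proposed monitor_session loop could look like. The real signatures of inspect_pane, send_keys and kill_session in gptme/tools/tmux.py are not shown in this thread, so the sketch takes them as plain callables. Every parameter name here (inspect, is_done, should_interrupt, interrupt, max_checks, sleep) is an assumption for illustration, not the tool's API.

```python
import time
from typing import Callable, Iterator


def monitor_session(
    inspect: Callable[[], str],
    is_done: Callable[[str], bool],
    should_interrupt: Callable[[str, float], bool],
    interrupt: Callable[[], None],
    interval: float = 10.0,
    max_checks: int = 360,
    sleep: Callable[[float], None] = time.sleep,
) -> Iterator[str]:
    """Poll a session, yielding one status message per check.

    inspect:          returns the current pane content (hypothetical wrapper).
    is_done:          decides from the content whether the command finished.
    should_interrupt: decides whether to stop the command (e.g. ask the agent).
    interrupt:        stops the command (e.g. send Ctrl-C or kill the session).
    """
    start = time.monotonic()
    for _ in range(max_checks):
        content = inspect()
        elapsed = time.monotonic() - start
        if is_done(content):
            yield f"Session finished after {elapsed:.0f}s:\n{content}"
            return
        if should_interrupt(content, elapsed):
            interrupt()
            yield f"Session interrupted after {elapsed:.0f}s:\n{content}"
            return
        yield f"Session still running after {elapsed:.0f}s:\n{content}"
        sleep(interval)
    # Bound the loop so monitoring itself can never hang forever.
    yield f"Stopped monitoring after {max_checks} checks; session may still be running."
```

Each status message costs tokens once it lands in the conversation log, so a real version would probably want to yield only the changed part of the pane, or drop stale updates as suggested earlier in the thread.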

Advantages of this approach:

Leverages existing tmux functionality for process isolation and management.
Allows for easy monitoring of multiple concurrent long-running commands.
Provides a flexible solution that can be extended for various use cases.
Integrates well with the existing conversation log system.

Implementation steps:

Add the monitor_session function to gptme/tools/tmux.py.
Modify the execute_tmux function to handle the new command.
Update the tool's instructions and examples.
Add appropriate error handling and edge case management.

This solution addresses the requirements outlined in Issue #348 while maintaining the flexibility and power of the existing tmux tool. It provides a robust way for agents to monitor and optionally interrupt long-running commands, enhancing the overall capabilities of the gptme system.

ErikBjare added the "enhancement" (New feature or request) label on Dec 18, 2024

bjsi pushed a commit to bjsi/gptme that referenced this issue Dec 28, 2024

Add issue ErikBjare#348 details to Current Understanding

64fecfa

bjsi pushed a commit to bjsi/gptme that referenced this issue Dec 28, 2024

Add issue ErikBjare#348 details to Current Understanding

2061c24
