Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Graceful stop of the container is not handled correctly #311

Open
3 tasks done
de666 opened this issue Jan 3, 2024 · 13 comments
Open
3 tasks done

Graceful stop of the container is not handled correctly #311

de666 opened this issue Jan 3, 2024 · 13 comments

Comments

@de666
Copy link

de666 commented Jan 3, 2024

Support guidelines

I've found a bug and checked that ...

  • ... the documentation does not mention anything about my problem
  • ... there are no open or closed issues that are related to my problem

Description

When stopping the container with docker compose stop, the container shuold stop gracefully but this doesn't happen.

Expected behaviour

In the doc it is reported:

After issuing a shutdown command, Docker waits 10 seconds for the container to exit before it is killed

Actual behaviour

Docker it waits always 10 seconds (or the number of seconds specified in stop_grace_period), not less, regardless of how many torrents are active/seeding etc.
Even with no torrents at all, docker waits always 10 seconds (or the number of seconds specified in stop_grace_period) before considering the container as stopped.

Steps to reproduce

  1. Follow the docker compose guide to start the contianer
  2. Stop the contianer after about 60s (with empty torrent list)

Docker info

Client: Docker Engine - Community
 Version:    24.0.7
 Context:    default
 Debug Mode: false
 Plugins:
  buildx: Docker Buildx (Docker Inc.)
    Version:  v0.11.2
    Path:     /usr/libexec/docker/cli-plugins/docker-buildx
  compose: Docker Compose (Docker Inc.)
    Version:  v2.21.0
    Path:     /usr/libexec/docker/cli-plugins/docker-compose

Server:
 Containers: 8
  Running: 6
  Paused: 0
  Stopped: 2
 Images: 18
 Server Version: 24.0.7
 Storage Driver: overlay2
  Backing Filesystem: extfs
  Supports d_type: true
  Using metacopy: false
  Native Overlay Diff: true
  userxattr: false
 Logging Driver: json-file
 Cgroup Driver: systemd
 Cgroup Version: 2
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: inactive
 Runtimes: io.containerd.runc.v2 runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: 3dd1e886e55dd695541fdcd67420c2888645a495
 runc version: v1.1.10-0-g18a0cb0
 init version: de40ad0
 Security Options:
  apparmor
  seccomp
   Profile: builtin
  cgroupns
 Kernel Version: 5.15.0-91-generic
 Operating System: Ubuntu 22.04.3 LTS
 OSType: linux
 Architecture: x86_64
 CPUs: 2
 Total Memory: 5.63GiB
 Name: abcd
 ID: 0b09bdd9-fcef-4637-97cc-f1139efa48f4
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: false

Docker Compose config

No response

Logs

rtorrent-rutorrent-logs  | tail: can't open '/log/*.log': No such file or directory
rtorrent-rutorrent-logs  | tail: no files
rtorrent-rutorrent-logs  | tail: can't open '/log/*.log': No such file or directory
rtorrent-rutorrent-logs  | tail: no files
rtorrent-rutorrent-logs  | 1704304362 I listen port 51413 opened with backlog set to 128
rtorrent-rutorrent-logs  | 1704304362 D input history file read (path:/data/rtorrent/.session/rtorrent.input_history)
rtorrent-rutorrent-logs  | 1704304362 N rtorrent disk: Starting thread.
rtorrent-rutorrent-logs  | 1704304362 N rtorrent main: Starting thread.
rtorrent-rutorrent-logs  | 1704304362 N rtorrent scgi: Starting thread.
rtorrent-rutorrent-logs  | 1704304362 I handshake_manager->45.128.37.101: Adding incoming connection: fd:19.
rtorrent-rutorrent-logs  | 1704304362 I handshake_manager->45.128.37.101: Received error: message:6 unknown download.
rtorrent-rutorrent-logs  | 1704304367 I handshake_manager->45.128.37.101: Adding incoming connection: fd:19.
rtorrent-rutorrent-logs  | 1704304367 I handshake_manager->45.128.37.101: Received error: message:6 unknown download.
rtorrent-rutorrent-logs  | 1704304370 I handshake_manager->45.128.37.101: Adding incoming connection: fd:19.
rtorrent-rutorrent-logs  | 1704304370 I handshake_manager->45.128.37.101: Received error: message:6 unknown download.
rtorrent-rutorrent-logs  | 1704304388 I handshake_manager->45.128.37.101: Adding incoming connection: fd:19.
rtorrent-rutorrent-logs  | 1704304388 I handshake_manager->45.128.37.101: Adding incoming connection: fd:20.
rtorrent-rutorrent-logs  | 1704304388 I handshake_manager->45.128.37.101: Received error: message:6 unknown download.
rtorrent-rutorrent-logs  | 1704304388 I handshake_manager->45.128.37.101: Received error: message:6 unknown download.
rtorrent-rutorrent-logs  | 1704304455 I handshake_manager->45.128.37.101: Adding incoming connection: fd:19.
rtorrent-rutorrent-logs  | 1704304455 I handshake_manager->45.128.37.101: Received error: message:6 unknown download.
rtorrent-rutorrent-logs  | 1704304456 I handshake_manager->79.117.130.64: Adding incoming connection: fd:19.
rtorrent-rutorrent-logs  | 1704304456 I handshake_manager->79.117.130.64: Received error: message:6 unknown download.
rtorrent-rutorrent-logs  | 1704304469 I handshake_manager->189.177.61.64: Adding incoming connection: fd:19.
rtorrent-rutorrent-logs  | 1704304469 I handshake_manager->189.177.61.64: Received error: message:6 unknown download.
rtorrent-rutorrent-logs  | 1704304500 I handshake_manager->193.29.107.170: Adding incoming connection: fd:19.
rtorrent-rutorrent-logs  | 1704304500 I handshake_manager->193.29.107.170: Received error: message:6 unknown download.
rtorrent-rutorrent-logs  | 1704304517 I handshake_manager->79.117.130.64: Adding incoming connection: fd:19.
rtorrent-rutorrent-logs  | 1704304517 I handshake_manager->79.117.130.64: Received error: message:6 unknown download.
rtorrent-rutorrent-logs  | 1704304518 I handshake_manager->79.117.130.64: Adding incoming connection: fd:19.
rtorrent-rutorrent-logs  | 1704304518 I handshake_manager->79.117.130.64: Received error: message:6 unknown download.
rutorrent                | [s6-init] making user provided files available at /var/run/s6/etc...exited 0.
rutorrent                | [s6-init] ensuring user provided files have correct perms...exited 0.
rutorrent                | [fix-attrs.d] applying ownership & permissions fixes...
rutorrent                | [fix-attrs.d] done.
rutorrent                | [cont-init.d] executing container initialization scripts...
rutorrent                | [cont-init.d] 00-fix-logs.sh: executing... 
rutorrent                | [cont-init.d] 00-fix-logs.sh: exited 0.
rutorrent                | [cont-init.d] 01-fix-uidgid.sh: executing... 
rutorrent                | [cont-init.d] 01-fix-uidgid.sh: exited 0.
rutorrent                | [cont-init.d] 02-fix-perms.sh: executing... 
rutorrent                | Fixing perms...
rutorrent                | [cont-init.d] 02-fix-perms.sh: exited 0.
rutorrent                | [cont-init.d] 03-config.sh: executing... 
rutorrent                | Setting timezone to Europe/Rome...
rutorrent                | Setting PHP-FPM configuration...
rutorrent                | Setting PHP INI configuration...
rutorrent                | Setting OpCache configuration...
rutorrent                | Setting Nginx configuration...
rutorrent                | Setting Nginx XMLRPC over SCGI configuration...
rutorrent                | Setting Nginx ruTorrent configuration...
rutorrent                | Setting Nginx WebDAV configuration...
rutorrent                | Update healthcheck script...
rutorrent                | Initializing files and folders...
rutorrent                | rpc.htpasswd is empty, removing authentication...
rutorrent                | rutorrent.htpasswd is empty, removing authentication...
rutorrent                | webdav.htpasswd is empty, removing authentication...
rutorrent                | Checking rTorrent local configuration...
rutorrent                | Checking rTorrent configuration...
rutorrent                | Bootstrapping ruTorrent configuration...
rutorrent                | Removing core plugin httprpc...
rutorrent                | Removing core plugin ipad...
rutorrent                | Removing core plugin xmpp...
rutorrent                | Setting custom config for create plugin...
rutorrent                | Checking ruTorrent custom plugins...
rutorrent                | Checking ruTorrent plugins configuration...
rutorrent                | Checking ruTorrent custom themes...
rutorrent                | Setting GeoIP2 databases for geoip2 plugin...
rutorrent                | Fixing perms...
rutorrent                | [cont-init.d] 03-config.sh: exited 0.
rutorrent                | [cont-init.d] 04-create-services.sh: executing... 
rutorrent                | [cont-init.d] 04-create-services.sh: exited 0.
rutorrent                | [cont-init.d] ~-socklog: executing... 
rutorrent                | [cont-init.d] ~-socklog: exited 0.
rutorrent                | [cont-init.d] done.
rutorrent                | [services.d] starting services
rutorrent                | 2024/01/03 18:52:42 [notice] 512#512: using the "epoll" event method
rutorrent                | 2024/01/03 18:52:42 [notice] 512#512: nginx/1.24.0
rutorrent                | 2024/01/03 18:52:42 [notice] 512#512: OS: Linux 5.15.0-91-generic
rutorrent                | 2024/01/03 18:52:42 [notice] 512#512: getrlimit(RLIMIT_NOFILE): 32000:40000
rutorrent                | 2024/01/03 18:52:42 [notice] 512#512: start worker processes
rutorrent                | 2024/01/03 18:52:42 [notice] 512#512: start worker process 539
rutorrent                | 2024/01/03 18:52:42 [notice] 512#512: start worker process 540
rutorrent                | [services.d] done.
rutorrent                | [03-Jan-2024 18:52:42] NOTICE: fpm is running, pid 514
rutorrent                | [03-Jan-2024 18:52:42] NOTICE: ready to handle connections
rutorrent                | [03-Jan-2024 18:55:25] NOTICE: Terminating ...
rutorrent                | 2024/01/03 18:55:25 [notice] 512#512: signal 15 (SIGTERM) received from 506, exiting
rutorrent                | 2024/01/03 18:55:25 [notice] 539#539: exiting
rutorrent                | 2024/01/03 18:55:25 [notice] 540#540: exiting
rutorrent                | [03-Jan-2024 18:55:25] NOTICE: exiting, bye-bye!
rutorrent                | 2024/01/03 18:55:25 [notice] 539#539: exit
rutorrent                | 2024/01/03 18:55:25 [notice] 540#540: exit
rutorrent                | 2024/01/03 18:55:25 [notice] 512#512: signal 17 (SIGCHLD) received from 539
rutorrent                | 2024/01/03 18:55:25 [notice] 512#512: worker process 539 exited with code 0
rutorrent                | 2024/01/03 18:55:25 [notice] 512#512: signal 29 (SIGIO) received
rutorrent                | 2024/01/03 18:55:25 [notice] 512#512: signal 17 (SIGCHLD) received from 540
rutorrent                | 2024/01/03 18:55:25 [notice] 512#512: worker process 540 exited with code 0
rutorrent                | 2024/01/03 18:55:25 [notice] 512#512: exit

Additional info

see #311 (comment)

@stickz
Copy link
Contributor

stickz commented Jan 3, 2024

Could you try to override system.daemon.set = true in your .rtorrent.rc file? Does it resolve your problem by setting the value to false? Please keep in mind, this may not work.

@boomboxnation
Copy link

boomboxnation commented Jan 3, 2024

I always assumed that was a docker setting, see stop_grace_period and stop_signal in the docker references. Those may be what you are looking/needing to configure. (I only set a grace_period for mine...I'm just not willing to wait more than the 5m I set it to, as long as I am not repeatedly restarting rtorrent, my trackers sluff off the ungraceful disconnect)

Edit...now that I re-read the documentation. It's confusing. stop_signal says the default is SIGTERM (Nice/Graceful) and stop_grace_period says time before SIGKILL. In my experience...stop sends SIGKILL, not SIGTERM by default.

@de666
Copy link
Author

de666 commented Jan 4, 2024

Could you try to override system.daemon.set = true in your .rtorrent.rc file? Does it resolve your problem by setting the value to false? Please keep in mind, this may not work.

it is the first thing I tried, it was set to true and when I set it to false it didn't work because rtorrent tries to find some terminal (if launched in foreground it is supposed that shuold be something to output) and it doesn't find it so it fills the log with errors.

I always assumed that was a docker setting, see stop_grace_period and stop_signal in the docker references. Those may be what you are looking/needing to configure. (I only set a grace_period for mine...I'm just not willing to wait more than the 5m I set it to, as long as I am not repeatedly restarting rtorrent, my trackers sluff off the ungraceful disconnect)

Edit...now that I re-read the documentation. It's confusing. stop_signal says the default is SIGTERM (Nice/Graceful) and stop_grace_period says time before SIGKILL. In my experience...stop sends SIGKILL, not SIGTERM by default.

rtorrent shuold receive SIGINT to gracefully stop but also SIGTERM is ok.
From what I know, docker sends SIGTERM first and the waits stop_grace_period for the process to stop, if it doesn't stop, it force the process to stop killing it with SIGKILL (which obviously works always).
As I said before, If I set 60s of stop_grace_period, docker waits for 60s and it is not possbile rtorrent process has to wait so long to shutdown properly without any torrent in the torrent list, it shuold terminate almost immediately (a couple of seconds).
It is possible that is something related to s6 overlay which doesn't understand the process was terminated and so it doesn't terminate itself?
If you look at this old docker image from linuxserver.io (the one I used before switching to this updated image), you see they execute rtorrent inside a screen session and the trap the TERM signal to forward the TERM signal to rtorrent process.
https://github.com/linuxserver/docker-rutorrent/blob/78113ffc2069d63fb4e1e012d4c4d4dba665f05c/root/etc/services.d/rutorrent/run

@stickz
Copy link
Contributor

stickz commented Jan 4, 2024

@de666 Thanks so much for the additional information. I will make a pull request for this docker container with the environment variable you can configure to run rTorrent in the foreground. It will use a Linux screen session.

@de666
Copy link
Author

de666 commented Jan 4, 2024

... It will use a Linux screen session.

Thanks!
Just for the record, I slightly modified the service file I linked before the following way:

#!/usr/bin/with-contenv bash

_term() {
  echo "[rtorrent] Caught SIGTERM signal!"
  screen -S rtorrent -X xon
  sleep 5s
  killall -TERM rtorrent 2>/dev/null
}

trap _term SIGTERM

screen -D -m -S \
rtorrent s6-setuidgid abc /usr/bin/rtorrent \
-n -o import=/config/rtorrent/rtorrent.rc &

wait

This tries to stop rtorrent process gracefully sending control-q to the screen session before sending the SIGTERM
Obviously it has to be adapted with the correct config file and user

stickz added a commit to stickz/docker-rtorrent-rutorrent that referenced this issue Jan 4, 2024
This pull requests add the `RT_DAEMON_MODE` environment variable which allows the user to configure how rTorrent is launched.

rTorrent is still launched in daemon mode by default unless the user changes it. When daemon mode is disabled, rTorrent will launch in the foreground on a linux screen instead.

This has some graceful exit benefits in limited circumstances. See crazy-max#311
stickz added a commit to stickz/docker-rtorrent-rutorrent that referenced this issue Jan 4, 2024
This pull requests add the `RT_DAEMON_MODE` environment variable which allows the user to configure how rTorrent is launched.

rTorrent is still launched in daemon mode by default unless the user changes it. When daemon mode is disabled, rTorrent will launch in the foreground on a linux screen instead.

This has some graceful exit benefits in limited circumstances. See crazy-max#311
@de666
Copy link
Author

de666 commented Jan 9, 2024

@stickz in the meanwhile I've made some other test running the container totally disabling the rtorrent service and I realize the problem it is not related to that service. It seems all the services are shutted down properly by s6 and s6-svscan knows it (if you put a finish script for each service it is correctly executed).
So it seems the problem is that for some reason s6-svscan (PID 1) doesn't exit and so it has to be killed by docker.
Maybe is a bug of that s6 overlay version (I saw it is not the last one)?
This is just to tell you I'm not yet so sure the pull reuqest is still needed or usefull in some way (didn't try that yet but I don't think it solves the issue).

@stickz
Copy link
Contributor

stickz commented Jan 9, 2024

Thanks, I will close off that PR.

@de666 de666 changed the title Graceful stop of rtorrent process is not handled correctly Graceful stop of the container is not handled correctly Jan 10, 2024
@crazy-max
Copy link
Owner

I will look to update to latest v3 of s6-overlay

@tsjk
Copy link

tsjk commented Jun 22, 2024

Any progress here? :)

@andreibsk
Copy link

For me all services exit within 1-2 seconds on a fresh image (0 torrents). It's some socklog processes that remain active until Docker forces the container down, tested with a stop_grace_period of 20s.

Before stopping the container:

Mem: 8814528K used, 7207400K free, 1696K shrd, 122816K buff, 6368400K cached
CPU:   0% usr   0% sys   0% nic  99% idle   0% io   0% irq   0% sirq
Load average: 0.01 0.01 0.00 2/564 1088
  PID  PPID USER     STAT   VSZ %VSZ CPU %CPU COMMAND
 1065   882 rtorrent S    29596   0%   1   0% {php-fpm82} php-fpm: pool www
 1066   882 rtorrent S    29596   0%   3   0% {php-fpm82} php-fpm: pool www
  882   876 rtorrent S    29588   0%   0   0% {php-fpm82} php-fpm: master process (/etc/php82/php-fpm.conf)
  884   877 rtorrent S    16672   0%   3   0% {rtorrent main} rtorrent -D -o import=/etc/rtorrent/.rtlocal.rc
  915   892 rtorrent S    14488   0%   0   0% nginx: worker process
  919   892 rtorrent S    14488   0%   1   0% nginx: worker process
  916   892 rtorrent S    14488   0%   3   0% nginx: worker process
  917   892 rtorrent S    14488   0%   1   0% nginx: worker process
  892   875 rtorrent S     9120   0%   2   0% nginx: master process nginx -g daemon off;
 1080     0 root     S     1696   0%   3   0% sh
 1088  1080 root     R     1624   0%   2   0% top
  886   878 nobody   S     1184   0%   0   0% s6-log -bp - +^cron\. T /var/log/socklog/cron - +^daemon\. T /var/log/socklog/dae  879     1 root     S     1128   0%   2   0% s6-supervise socklog
  878     1 root     S     1128   0%   1   0% s6-supervise socklog/log
  876     1 root     S     1128   0%   3   0% s6-supervise php-fpm
   38     1 root     S     1128   0%   0   0% s6-supervise s6-fdholderd
  875     1 root     S     1128   0%   2   0% s6-supervise nginx
  877     1 root     S     1128   0%   3   0% s6-supervise rtorrent
    1     0 root     S     1120   0%   1   0% s6-svscan -t0 /var/run/s6/services
  889   879 nobody   S     1112   0%   3   0% socklog unix /dev/log

After about 2-3 seconds:
It stays like this until the 20s mark:

Mem: 8831992K used, 7189936K free, 1696K shrd, 122816K buff, 6374320K cached
CPU:   0% usr   0% sys   0% nic  99% idle   0% io   0% irq   0% sirq
Load average: 0.04 0.02 0.00 1/428 1127
  PID  PPID USER     STAT   VSZ %VSZ CPU %CPU COMMAND
 1080     0 root     S     1696   0%   3   0% sh
 1088  1080 root     R     1624   0%   2   0% top
  886   878 nobody   S     1184   0%   0   0% s6-log -bp - +^cron\. T /var/log/socklog/cron - +^daemon\. T /var/log/socklog/dae
  878     1 root     S     1128   0%   1   0% s6-supervise socklog/log
  879     1 root     S     1128   0%   0   0% s6-supervise socklog
    1     0 root     S     1120   0%   0   0% s6-svscan -t0 /var/run/s6/services
  889   879 nobody   S     1112   0%   0   0% socklog unix /dev/log

Logs

...
Setting GeoIP2 databases for geoip2 plugin...
Fixing perms...
[cont-init.d] 03-config.sh: exited 0.
[cont-init.d] 04-create-services.sh: executing...
[cont-init.d] 04-create-services.sh: exited 0.
[cont-init.d] ~-socklog: executing...
[cont-init.d] ~-socklog: exited 0.
[cont-init.d] done.
[services.d] starting services
2024/06/24 16:50:39 [notice] 892#892: using the "epoll" event method
2024/06/24 16:50:39 [notice] 892#892: nginx/1.24.0
2024/06/24 16:50:39 [notice] 892#892: OS: Linux 6.8.9-arch1-2
2024/06/24 16:50:39 [notice] 892#892: getrlimit(RLIMIT_NOFILE): 32000:40000
2024/06/24 16:50:39 [notice] 892#892: start worker processes
2024/06/24 16:50:39 [notice] 892#892: start worker process 915
2024/06/24 16:50:39 [notice] 892#892: start worker process 916
2024/06/24 16:50:39 [notice] 892#892: start worker process 917
2024/06/24 16:50:39 [notice] 892#892: start worker process 919
[services.d] done.
[24-Jun-2024 16:50:39] NOTICE: fpm is running, pid 882
[24-Jun-2024 16:50:39] NOTICE: ready to handle connections
2024/06/24 16:52:11 [notice] 892#892: signal 15 (SIGTERM) received from 875, exiting
[24-Jun-2024 16:52:11] NOTICE: Terminating ...
2024/06/24 16:52:11 [notice] 916#916: exiting
2024/06/24 16:52:11 [notice] 919#919: exiting
2024/06/24 16:52:11 [notice] 915#915: exiting
[24-Jun-2024 16:52:11] NOTICE: exiting, bye-bye!
2024/06/24 16:52:11 [notice] 917#917: exiting
2024/06/24 16:52:11 [notice] 916#916: exit
2024/06/24 16:52:11 [notice] 917#917: exit
2024/06/24 16:52:11 [notice] 915#915: exit
2024/06/24 16:52:11 [notice] 919#919: exit
2024/06/24 16:52:11 [notice] 892#892: signal 17 (SIGCHLD) received from 915
2024/06/24 16:52:11 [notice] 892#892: worker process 915 exited with code 0
2024/06/24 16:52:11 [notice] 892#892: signal 29 (SIGIO) received
2024/06/24 16:52:11 [notice] 892#892: signal 17 (SIGCHLD) received from 916
2024/06/24 16:52:11 [notice] 892#892: worker process 916 exited with code 0
2024/06/24 16:52:11 [notice] 892#892: signal 29 (SIGIO) received
2024/06/24 16:52:11 [notice] 892#892: signal 17 (SIGCHLD) received from 917
2024/06/24 16:52:11 [notice] 892#892: worker process 917 exited with code 0
2024/06/24 16:52:11 [notice] 892#892: worker process 919 exited with code 0
2024/06/24 16:52:11 [notice] 892#892: exit

@tsjk
Copy link

tsjk commented Jun 24, 2024

The -p in s6-log -bp - ... actually tells s6-log to not exit on SIGTERM.
Not sure about how socklog works here...

@de666
Copy link
Author

de666 commented Sep 17, 2024

The -p in s6-log -bp - ... actually tells s6-log to not exit on SIGTERM. Not sure about how socklog works here...

@crazy-max only knows...

@tsjk
Copy link

tsjk commented Sep 17, 2024

Currently I have a script doing something like backgrounding podman-compose down, waitng for rtorrent, nginx and php-fpm to terminate, killing socklog and s6log with -KILL, and finally waiting for podman-compose down to finish. Seems to work ok.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

6 participants