Make supervisor stop Celery workers correctly
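I have run into a lot of strange behavior while using Celery. For example, I update tasks.py and run supervisorctl reload (restart), but the tasks still go wrong, and some tasks seem to disappear.
Today I found the cause: supervisorctl stop all does not stop all of the Celery workers. Only kill -9 $(pgrep python) kills them all.
Situation: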
root@ubuntu12:/data/www/article_fetcher# supervisorctl
celery_beat RUNNING pid 29597, uptime 0:52:18
celery_worker1 RUNNING pid 29556, uptime 0:52:20
celery_worker2 RUNNING pid 29570, uptime 0:52:19
celery_worker3 RUNNING pid 29557, uptime 0:52:20
celery_worker4 RUNNING pid 29586, uptime 0:52:18
uwsgi RUNNING pid 29604, uptime 0:52:18
supervisor> stop all
celery_beat: stopped
celery_worker2: stopped
celery_worker4: stopped
celery_worker3: stopped
uwsgi: stopped
celery_worker1: stopped
supervisor> status
celery_beat STOPPED Aug 04 11:05 AM
celery_worker1 STOPPED Aug 04 11:05 AM
celery_worker2 STOPPED Aug 04 11:05 AM
celery_worker3 STOPPED Aug 04 11:05 AM
celery_worker4 STOPPED Aug 04 11:05 AM
uwsgi STOPPED Aug 04 11:05 AM
Processes:
root@ubuntu12:~# ps -aux|grep 'python'
Warning: bad ps syntax, perhaps a bogus '-'? See http://procps.sf.net/faq.html
root 8683 0.0 0.1 61420 11768 ? Ss Aug03 0:27 /usr/bin/python /usr/bin/supervisord
root 29310 0.1 0.1 57120 11344 pts/2 S+ 11:05 0:00 /usr/bin/python /usr/bin/supervisorctl
nobody 29556 2.2 0.5 132484 45988 ? S 11:06 0:00 /data/www/article_fetcher/venv/bin/python /data/www/article_fetcher/manage.py celery worker -n W1 -Ofair --app=celery_worker:app
nobody 29557 2.2 0.5 132480 45996 ? S 11:06 0:00 /data/www/article_fetcher/venv/bin/python /data/www/article_fetcher/manage.py celery worker -n W3 -Ofair --app=celery_worker:app
nobody 29570 2.4 0.5 132740 45996 ? S 11:06 0:00 /data/www/article_fetcher/venv/bin/python /data/www/article_fetcher/manage.py celery worker -n W2 -Ofair --app=celery_worker:app
nobody 29571 26.9 1.4 217688 115804 ? R 11:06 0:09 /data/www/article_fetcher/venv/bin/python /data/www/article_fetcher/manage.py celery worker -n W3 -Ofair --app=celery_worker:app
nobody 29572 33.7 0.7 158396 59808 ? R 11:06 0:12 /data/www/article_fetcher/venv/bin/python /data/www/article_fetcher/manage.py celery worker -n W3 -Ofair --app=celery_worker:app
nobody 29573 29.6 1.4 215176 115928 ? R 11:06 0:10 /data/www/article_fetcher/venv/bin/python /data/www/article_fetcher/manage.py celery worker -n W1 -Ofair --app=celery_worker:app
nobody 29574 27.2 1.4 218244 118180 ? R 11:06 0:09 /data/www/article_fetcher/venv/bin/python /data/www/article_fetcher/manage.py celery worker -n W3 -Ofair --app=celery_worker:app
......
......
......
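I found this question: Stopping Supervisor doesn't stop Celery workers. It asks about something different, though, and the accepted answer, supervisorctl stop all, does not actually work. So I decided to find the right way.
I checked the supervisor docs and found: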
killasgroup
If true, when resorting to send SIGKILL to the program to terminate it
send it to its whole process group instead, taking care of its
children as well, useful e.g with Python programs using
multiprocessing.
Default: false
Required: No.
Introduced: 3.0a11
Then I figured that each worker creates 4 child processes (one per CPU core) that together form a process group, which is why supervisorctl stop all does not work.
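To check that reasoning outside of supervisor, here is a minimal sketch of the difference between signalling one process and signalling its whole process group. It is not supervisor's or Celery's code: PARENT_SRC, CHILD_SRC and descendants_survive are hypothetical names made up for the illustration, the "worker" is just a script that starts two sleeping children, and it assumes a POSIX system where the worker runs as the leader of its own process group.

```python
import os
import select
import signal
import subprocess
import sys

# Hypothetical stand-in for a worker's pool process: it just sleeps.
CHILD_SRC = "import time; time.sleep(60)"

# Hypothetical stand-in for a worker: starts two children that inherit the
# pipe descriptor given in argv[1], reports readiness, then sleeps.
PARENT_SRC = f"""
import subprocess, sys, time
fd = int(sys.argv[1])
for _ in range(2):
    subprocess.Popen([sys.executable, "-c", {CHILD_SRC!r}], pass_fds=[fd])
print("ready", flush=True)
time.sleep(60)
"""


def descendants_survive(kill_group, timeout=2.0):
    """Kill the fake worker (or its whole group); report whether children live on."""
    r, w = os.pipe()
    parent = subprocess.Popen(
        [sys.executable, "-c", PARENT_SRC, str(w)],
        pass_fds=[w],
        stdout=subprocess.PIPE,
        start_new_session=True,  # parent becomes leader of its own process group
    )
    os.close(w)  # from here on only the parent and its children hold the write end
    parent.stdout.readline()  # wait until the children have been started
    try:
        if kill_group:
            os.killpg(parent.pid, signal.SIGKILL)  # signal every group member
        else:
            os.kill(parent.pid, signal.SIGKILL)  # signal the parent only
        parent.wait()
        # The read end reports EOF only after every holder of the write end
        # has exited, so a timeout here means some child is still running.
        readable, _, _ = select.select([r], [], [], timeout)
        return not readable
    finally:
        try:
            os.killpg(parent.pid, signal.SIGKILL)  # clean up any survivors
        except ProcessLookupError:
            pass
        os.close(r)
        parent.stdout.close()
```

On a POSIX system I would expect descendants_survive(kill_group=False) to return True (the children outlive the parent) and descendants_survive(kill_group=True) to return False. The supervisor docs also describe a related stopasgroup option that applies the same idea to the stop signal rather than only to SIGKILL.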
So I added killasgroup to supervisord.conf:
[program:celery_worker1]
; Set full path to celery program if using virtualenv
directory=/data/www/article_fetcher
command=/data/www/article_fetcher/venv/bin/python /data/www/article_fetcher/manage.py celery worker -n W2 -Ofair --app=celery_worker:app
user=nobody
numprocs=1
stdout_logfile=/data/www/article_fetcher/logs/celery.log
stderr_logfile=/data/www/article_fetcher/logs/celery.log
autostart=true
autorestart=true
startsecs=5
killasgroup=true
.....
.....
Now supervisorctl stop all really stops the Celery workers. Nice!