Upstart task hangs after it finishes successfully
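I have an Upstart task that starts multiple instances of a service, based on "Starting multiple upstart instances automatically" and "Restarting Upstart instance processes". It works and starts all the instances, but after starting them successfully it just hangs. If I Ctrl-C out and then check the instances with service status or ps, they have all started successfully, so I don't know what it is doing while it hangs.
Here is my script: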
description "all-my-workers"
start on runlevel [2345]
task
console log
env NUM_INSTANCES=1
env STARTING_PORT=42002
pre-start script
    for i in `seq 1 $NUM_INSTANCES`;
    do
        start my-worker N=$i PORT=$(($STARTING_PORT + $i))
    done
end script
When I run service all-my-workers start, I get this:
vagrant@vagrant-service:/etc/init$ sudo service all-my-workers start
Then it just hangs there and never returns me to the prompt. As I said, I can Ctrl-C out and see the running workers:
vagrant@vagrant-service:/etc/init$ sudo service all-my-workers status
all-my-workers start/running
vagrant@vagrant-service:/etc/init$ sudo service my-worker status N=1
my-worker (1) start/running, process 21938
And in ps:
worker 21938 0.0 0.1 4392 612 ? Ss 21:46 0:00 /bin/sh -e /proc/self/fd/9
worker 21941 0.2 7.3 174076 27616 ? Sl 21:46 0:00 python /var/lib/my-system/script/start_worker.py
I don't think the problem is in my-worker.conf, but just in case:
description "my-worker"
stop on stopping all-my-workers
setuid worker
setgid worker
respawn
instance $N
console log
env SCRIPT_PATH="/var/lib/my-system/script/"
script
export PROVIDER=vagrant
export REGION=all
export ENVIRONMENT=cert
. /var/lib/my-system/.virtualenvs/my-system/bin/activate
python $SCRIPT_PATH/start_worker.py
END
end script
Thanks a lot!
How do I fix it?
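I'm assuming that my-worker is a long-lived process and that you want an easy way to start and tear down multiple parallel instances of my-worker.
If that is the case, you probably don't want all-my-workers to be a task. You want the following instead: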
description "all-my-workers"
start on runlevel [2345]
console log
env NUM_INSTANCES=1
env STARTING_PORT=42002
pre-start script
    for i in `seq 1 $NUM_INSTANCES`;
    do
        start my-worker N=$i PORT=$(($STARTING_PORT + $i))
    done
end script
pre-stop script
    for i in `seq 1 $NUM_INSTANCES`;
    do
        stop my-worker N=$i PORT=$(($STARTING_PORT + $i)) || true
    done
end script
You can then run "start all-my-workers" to start all the my-worker instances, and "stop all-my-workers" to stop them. In effect, all-my-workers becomes a parent job that manages the starting and stopping of its child jobs.
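To check what the pre-start and pre-stop loops will actually ask Upstart to do, it can help to dry-run them outside Upstart. The sketch below only prints the commands instead of executing them; plan_workers is a hypothetical helper name used for illustration, and it assumes seq is available, as in the job file above.

```shell
#!/bin/sh
# Dry run of the pre-start/pre-stop loops: print the commands that would be
# issued, without calling Upstart's start/stop.
# plan_workers is a hypothetical helper, not part of Upstart.
plan_workers() {
    action=$1          # "start" or "stop"
    num_instances=$2   # plays the role of NUM_INSTANCES in the job file
    starting_port=$3   # plays the role of STARTING_PORT in the job file
    for i in $(seq 1 "$num_instances"); do
        echo "$action my-worker N=$i PORT=$((starting_port + i))"
    done
}

plan_workers start 3 42002
# start my-worker N=1 PORT=42003
# start my-worker N=2 PORT=42004
# start my-worker N=3 PORT=42005
```

Note that because the loop counts from 1, the first instance gets STARTING_PORT + 1 (42003 here), not STARTING_PORT itself.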
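Why?
You cited two SO answers that demonstrate the idea of a parent job managing child jobs. They show:
- a task with a script stanza
- a job with a pre-start stanza
Your parent job is a task with a pre-start stanza, which is why you are running into this odd behaviour.
script vs pre-start
From this Ask Ubuntu answer, which cites this deprecated documentation, there are two very important statements (emphasis mine):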
All job files must have either an exec or script stanza. This specifies what will be run for the job.
Additional shell code can be given to be run before or after the binary or script specified with exec or script. These are not expected to start the process, in fact, they can't. They are intended for preparing the environment and cleaning up afterwards.
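In summary, any background process spawned by the pre-start stanza is ignored (i.e. not monitored) by Upstart. Instead, you must use exec or script to spawn the process that Upstart will monitor.
What happens if you omit the exec/script stanza? Upstart will sit and wait for a process to be spawned. You might as well have written a while-true loop: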
script
    while true; do
        true
    done
end script
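The only difference is that the while-true loop is a live-lock, whereas the missing stanza results in a dead-lock.
Job vs task
With the above in mind, the Upstart documentation for tasks finally points us to what is going on: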
Without the 'task' keyword, the events that cause the job to start will be unblocked as soon as the job is started. This means the job has emitted a starting(7) event, run its pre-start, begun its script/exec, and post-start, and emitted its started(7) event.
With task, the events that lead to this job starting will be blocked until the job has completely transitioned back to stopped. This means that the job has run up to the previously mentioned started(7) event, and has also completed its post-stop, and emitted its stopped(7) event.
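(Some of the details about events and states will make more sense if you read the documentation on starting and stopping jobs.)
In simple terms:
- For a normal Upstart job, the exec/script stanza is expected to block indefinitely, because it launches a long-lived process. Upstart therefore stops blocking once the pre-start stanza has completed.
- For a task, the exec/script stanza is expected to block for a "finite" period of time, because it launches a short-lived process. Upstart therefore blocks until after the exec/script stanza has completed.
But what happens if there is no exec/script stanza? Upstart waits indefinitely for something to be launched, and that is never going to happen.
- In the case of a job, this is fine, because Upstart does not block while waiting for a process to be spawned, and calling stop is apparently enough to make it stop waiting.
- In the case of a task, however, Upstart will sit and hang forever, or until you interrupt it. Because it still has not found a spawned process, it is technically still running. That is why you can query the status after interrupting and see all-my-workers start/running.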
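To read a status line such as all-my-workers start/running, it helps to split it into its parts: as far as I understand the status output format, the word before the slash is the job's goal and the word after it is the job's current state. The sketch below does that split for the two status lines shown in the question; parse_status is a hypothetical helper written for illustration, and it assumes the "job [(instance)] goal/state[, process PID]" layout seen in those lines.

```shell
#!/bin/sh
# parse_status is a hypothetical helper (not part of Upstart): it splits one
# line of `status` output into job name, goal and state.
parse_status() {
    line=$1
    job=${line%% *}                 # text before the first space
    rest=${line#* }                 # everything after the job name
    case $rest in
        "("*) rest=${rest#*") "} ;; # skip an optional "(instance)" field
    esac
    goal_state=${rest%%,*}          # drop ", process PID" if present
    echo "job=$job goal=${goal_state%%/*} state=${goal_state#*/}"
}

parse_status 'all-my-workers start/running'
# job=all-my-workers goal=start state=running
parse_status 'my-worker (1) start/running, process 21938'
# job=my-worker goal=start state=running
```

For the hung task, the interesting part is that all-my-workers reports the running state with no process attached, which matches the explanation above: nothing is ever going to exit and move it on to stopped.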
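For interest's sake
If, for some reason, you really do want to make your parent job a task, you actually need two tasks: one to start the my-worker instances and one to stop them. You would also need to remove the stop on stopping all-my-workers stanza from my-worker.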
start-all-my-workers:
description "starts all-my-workers"
start on runlevel [2345]
task
console log
env NUM_INSTANCES=1
env STARTING_PORT=42002
script
    for i in `seq 1 $NUM_INSTANCES`;
    do
        start my-worker N=$i PORT=$(($STARTING_PORT + $i))
    done
end script
stop-all-my-workers:
description "stops all-my-workers"
start on runlevel [!2345]
task
console log
env NUM_INSTANCES=1
env STARTING_PORT=42002
script
    for i in `seq 1 $NUM_INSTANCES`;
    do
        stop my-worker N=$i PORT=$(($STARTING_PORT + $i)) || true
    done
end script