celery: daemonic processes are not allowed to have children
In Python (2.7) I'm trying to create processes (using multiprocessing) within a celery task (celery 3.1.17), but it gives the error:
daemonic processes are not allowed to have children
Googling it, I found that the most recent versions of billiard fix the "bug", but I have the most recent version (3.3.0.20) and the error is still happening. I also tried to implement this workaround in my celery task, but it gives the same error.
Does anybody know how to do it?
Any help is appreciated,
Patrick
EDIT: code snippets
Task:
from __future__ import absolute_import
from celery import shared_task
from embedder.models import Embedder

@shared_task
def embedder_update_task(embedder_id):
    embedder = Embedder.objects.get(pk=embedder_id)
    embedder.test()
Artificial test function (from here):
import time
import multiprocessing as mp
from random import randint

def sleepawhile(t):
    print("Sleeping %i seconds..." % t)
    time.sleep(t)
    return t

def work(num_procs):
    print("Creating %i (daemon) workers and jobs in child." % num_procs)
    pool = mp.Pool(num_procs)
    result = pool.map(sleepawhile,
                      [randint(1, 5) for x in range(num_procs)])
    # The following is not really needed, since the (daemon) workers of the
    # child's pool are killed when the child is terminated, but it's good
    # practice to cleanup after ourselves anyway.
    pool.close()
    pool.join()
    return result

def test(self):
    print("Creating 5 (non-daemon) workers and jobs in main process.")
    pool = MyPool(5)
    result = pool.map(work, [randint(1, 5) for x in range(5)])
    pool.close()
    pool.join()
    print(result)
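The MyPool used above is not part of the standard library; it is presumably the non-daemonic Pool subclass from the linked workaround. For reference, a Python 2.7 sketch of that pattern:

import multiprocessing
import multiprocessing.pool

class NoDaemonProcess(multiprocessing.Process):
    # make 'daemon' always report False so this process is
    # allowed to spawn children of its own
    def _get_daemon(self):
        return False
    def _set_daemon(self, value):
        pass
    daemon = property(_get_daemon, _set_daemon)

class MyPool(multiprocessing.pool.Pool):
    # a Pool whose workers are non-daemonic and may therefore have children
    Process = NoDaemonProcess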
My real function:
import multiprocessing as mp

def test(self):
    self.init()
    for saveindex in range(self.start_index, self.start_index + self.nsaves):
        self.create_storage(saveindex)
        # process creation:
        procs = [mp.Process(name="Process-" + str(i),
                            target=getattr(self, self.training_method),
                            args=(saveindex,))
                 for i in range(self.nproc)]
        for p in procs: p.start()
        for p in procs: p.join()
    print("End of task")
The init function defines a multiprocessing array and an object that shares the same memory, so that all my processes can update the same array at the same time:
import ctypes as c
import numpy as np

mp_arr = mp.Array(c.c_double, np.random.rand(1000000))  # example
self.V = np.frombuffer(mp_arr.get_obj())  # all the processes can update V
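A self-contained sketch of this shared-array pattern outside of Celery (the worker function and slice sizes are illustrative, not from the original code):

import ctypes as c
import multiprocessing as mp
import numpy as np

def fill_slice(arr, start, stop):
    # re-wrap the shared buffer inside the child process; writes here
    # are visible to the parent and to sibling processes
    V = np.frombuffer(arr.get_obj())
    V[start:stop] += 1.0

if __name__ == '__main__':
    n = 1000000
    mp_arr = mp.Array(c.c_double, np.random.rand(n))  # copies data into shared memory
    procs = [mp.Process(target=fill_slice, args=(mp_arr, i * n // 4, (i + 1) * n // 4))
             for i in range(4)]
    for p in procs: p.start()
    for p in procs: p.join()
    print(np.frombuffer(mp_arr.get_obj())[:5])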
The error generated when the task is called:
[2015-06-04 09:47:46,659: INFO/MainProcess] Received task: embedder.tasks.embedder_update_task[09f8abae-649a-4abc-8381-bdf258d33dda]
[2015-06-04 09:47:47,674: WARNING/Worker-5] Creating 5 (non-daemon) workers and jobs in main process.
[2015-06-04 09:47:47,789: ERROR/MainProcess] Task embedder.tasks.embedder_update_task[09f8abae-649a-4abc-8381-bdf258d33dda] raised unexpected: AssertionError('daemonic processes are not allowed to have children',)
Traceback (most recent call last):
File "/usr/local/lib/python2.7/dist-packages/celery/app/trace.py", line 240, in trace_task
R = retval = fun(*args, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/celery/app/trace.py", line 438, in __protected_call__
return self.run(*args, **kwargs)
File "/home/patrick/django/entite-tracker-master/entitetracker/embedder/tasks.py", line 21, in embedder_update_task
embedder.test()
File "/home/patrick/django/entite-tracker-master/entitetracker/embedder/models.py", line 475, in test
pool = MyPool(5)
File "/usr/lib/python2.7/multiprocessing/pool.py", line 159, in __init__
self._repopulate_pool()
File "/usr/lib/python2.7/multiprocessing/pool.py", line 223, in _repopulate_pool
w.start()
File "/usr/lib/python2.7/multiprocessing/process.py", line 124, in start
'daemonic processes are not allowed to have children'
AssertionError: daemonic processes are not allowed to have children
billiard and multiprocessing are different libraries - billiard is the Celery project's own fork of multiprocessing. You will need to import billiard and use it instead of multiprocessing.
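For example, a minimal sketch of the substitution inside a task (the task and child function names are illustrative):

from celery import shared_task
import billiard as mp  # Celery's fork of multiprocessing

def child_work(i):  # stand-in for the real per-process work
    print("child %d running" % i)

@shared_task
def spawn_children_task():
    procs = [mp.Process(target=child_work, args=(i,)) for i in range(4)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()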
However, a better answer is probably that you should refactor your code so that you spawn more Celery tasks instead of using two different ways of distributing your work.
You can do this using Celery canvas:
import time
from random import randint

from celery import group

@app.task
def sleepawhile(t):
    print("Sleeping %i seconds..." % t)
    time.sleep(t)
    return t

def work(num_procs):
    return group(sleepawhile.s(randint(1, 5)) for x in range(num_procs))

def test(self):
    my_group = group(work(randint(1, 5)) for x in range(5))
    result = my_group.apply_async()
    result.get()
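Note that result.get() needs a result backend to be configured, and calling get() from inside another task is discouraged in Celery because it can deadlock the worker, so in practice you would wait on the group from calling code outside the worker.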
I tried to make a working version of your code that uses canvas primitives instead of multiprocessing. However, since your example was quite artificial, it was not easy to come up with something that makes sense.
Update:
Here is a translation of your real code that uses Celery canvas:
tasks.py:
@shared_task
def run_training_method(saveindex, embedder_id):
    embedder = Embedder.objects.get(pk=embedder_id)
    embedder.training_method(saveindex)
models.py:
from tasks import run_training_method
from celery import group

class Embedder(Model):

    def embedder_update_task(self):
        my_group = []
        for saveindex in range(self.start_index, self.start_index + self.nsaves):
            self.create_storage(saveindex)
            # Add to list
            my_group.extend([run_training_method.subtask((saveindex, self.id))
                             for i in range(self.nproc)])
        result = group(my_group).apply_async()
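If the caller needs to block until every subtask has finished, the method could return the GroupResult so that code outside the worker can wait on it (a sketch; this assumes a result backend is configured):

result = group(my_group).apply_async()
return result  # the caller can then call result.join() or result.get()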
I got this when using multiprocessing with Celery 4.2.0 and Python 3.6. Solved it by using billiard.
I changed my source code from
from multiprocessing import Process
to
from billiard.context import Process
which resolved the error.
Note that the import source is billiard.context, not billiard.process.
I got a similar error when trying to call a multiprocessing method from a Celery task in Django. I solved it by using billiard instead of multiprocessing:
import billiard as multiprocessing
Hope it helps.
If you are using a submodule/library that already has multiprocessing baked in, it may make more sense to set the worker's -P threads argument instead:
celery worker -P threads
https://github.com/celery/celery/issues/4525#issuecomment-566503932
Update: there was a bug in the command-line parsing in celery < v5.1.1 that did not allow -P threads even though it was supported. It is fixed in >= v5.1.1. It has been officially supported since v4.4.
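For example, with the Celery 5 CLI (the app module name proj is a placeholder):

celery -A proj worker -P threads --concurrency=8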