Django with Celery - existing object not found

Question

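I'm having a problem executing a Celery task from another Celery task.

Here is the problematic snippet (the data object already exists in the database; its attributes are only updated inside the finalize_data function):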

def finalize_data(data):
    data = update_statistics(data)
    data.save()
    from apps.datas.tasks import optimize_data
    optimize_data.delay(data.pk)

from celery import shared_task

@shared_task
def optimize_data(data_pk):
    data = Data.objects.get(pk=data_pk)  # this lookup raises the error below
    # Do something with data

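The get call in the optimize_data function fails with "Data matching query does not exist."

If I do the same lookup by pk inside the finalize_data function, it works fine. It also works fine if I delay the Celery task call by some time.

This line: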

optimize_data.apply_async((data.pk,), countdown=10)

instead of

optimize_data.delay(data.pk)

works fine. But I don't want to use hacks in my code. Is it possible that the .save() call is asynchronously blocking access to that row/object?

Answer 1

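My guess is that your caller is inside a transaction that has not yet been committed when Celery starts processing the task, so Celery cannot find the record. That is why adding a countdown makes it work.

A 1-second countdown would probably work just as well as the 10-second countdown in your example. I have used 1-second countdowns throughout my code to deal with this issue.

Another solution is to stop using transactions.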
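The visibility problem described in this answer can be reproduced without Django or Celery. The sketch below is only an illustration under stated assumptions: it uses the standard-library sqlite3 module as a stand-in for the real database, a hypothetical `data` table, and two connections playing the roles of the calling process ("writer") and the Celery worker ("reader"). It inserts a row rather than updating one, which is the simplest way to show that another connection cannot see uncommitted work.

```python
import os
import sqlite3
import tempfile


def visible_before_and_after_commit():
    """Return (seen_before_commit, seen_after_commit) for a second connection."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.sqlite3")

        # "writer" stands in for the caller, "reader" for the Celery worker.
        writer = sqlite3.connect(path)
        reader = sqlite3.connect(path)
        try:
            writer.execute("CREATE TABLE data (id INTEGER PRIMARY KEY, stats TEXT)")
            writer.commit()

            # sqlite3 implicitly opens a transaction before the INSERT;
            # the row is not committed yet.
            writer.execute("INSERT INTO data (id, stats) VALUES (1, 'updated')")

            query = "SELECT id FROM data WHERE id = 1"
            # The reader looks the row up while the writer's transaction is open.
            seen_before = bool(reader.execute(query).fetchall())

            writer.commit()
            # Same lookup after the commit.
            seen_after = bool(reader.execute(query).fetchall())
        finally:
            reader.close()
            writer.close()
        return seen_before, seen_after
```

A countdown only makes it likely that the commit happens before the worker's lookup; it does not guarantee that ordering.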

Answer 2

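I know this is an old post, but I stumbled on this problem today. Lee's answer pointed me in the right direction, but I think there is a better solution today.

Using the on_commit handler provided by Django solves this problem without a hacky countdown in the code, whose purpose may not be obvious to whoever reads it later.

I'm not sure whether this option existed when the question was posted, but I'm posting this answer so that people who come here in the future know about the alternative.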

Answer 3

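You can use an on_commit hook to make sure the Celery task is not triggered until after the transaction has been committed.

See "Performing actions after commit" in the Django documentation.

This feature was added in Django 1.9.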

from django.db import transaction

def do_something():
    pass  # send a mail, invalidate a cache, fire off a Celery task, etc.

transaction.on_commit(do_something)

You can also wrap the function in a lambda:

transaction.on_commit(lambda: some_celery_task.delay('arg1'))
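One thing to watch for with the lambda form: a lambda looks up its variables when it is called, not when it is created. If the callback is registered inside a loop, or the variable is reassigned before the commit, every callback may see the last value. The sketch below illustrates this with plain Python only; `calls` and the pk values are hypothetical stand-ins, and `functools.partial` is shown as one way to bind the argument at registration time.

```python
from functools import partial


def collect_callbacks():
    """Build callbacks in a loop, run them afterwards, and record what they saw."""
    calls = []
    callbacks = []
    for pk in (1, 2, 3):
        # The lambda reads `pk` only when it is finally called.
        callbacks.append(lambda: calls.append(("lambda", pk)))
        # partial stores the current value of `pk` immediately.
        callbacks.append(partial(calls.append, ("partial", pk)))

    # Simulate the commit: the callbacks run after the loop has finished.
    for callback in callbacks:
        callback()
    return calls
```

With a single call such as the one in the question, where the local variable is not reassigned afterwards, the lambda form behaves as expected.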

The function you pass in will be called immediately after a hypothetical database write made where on_commit() is called would be successfully committed.

If you call on_commit() while there isn’t an active transaction, the callback will be executed immediately.

If that hypothetical database write is instead rolled back (typically when an unhandled exception is raised in an atomic() block), your function will be discarded and never called.
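The three behaviours quoted above can be pictured with a small toy model. This is not Django's implementation; `ToyTransaction` and `demo` are hypothetical names used only to illustrate the documented semantics: callbacks are queued while a transaction is active, run on commit, are dropped on rollback, and run immediately when no transaction is active.

```python
class ToyTransaction:
    """Toy stand-in for a transaction with commit hooks (not Django's code)."""

    def __init__(self):
        self.active = False
        self._callbacks = []

    def begin(self):
        self.active = True

    def on_commit(self, func):
        if self.active:
            # Defer the callback until the transaction commits.
            self._callbacks.append(func)
        else:
            # No active transaction: run the callback right away.
            func()

    def commit(self):
        self.active = False
        callbacks, self._callbacks = self._callbacks, []
        for func in callbacks:
            func()

    def rollback(self):
        # Rolled back: queued callbacks are discarded and never called.
        self.active = False
        self._callbacks.clear()


def demo():
    log = []
    tx = ToyTransaction()

    tx.on_commit(lambda: log.append("no transaction"))

    tx.begin()
    tx.on_commit(lambda: log.append("committed"))
    log.append("before commit")
    tx.commit()

    tx.begin()
    tx.on_commit(lambda: log.append("rolled back"))
    tx.rollback()
    return log
```

In this model, a task queued through on_commit cannot start before the commit, which is the ordering the countdown workaround only approximates.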

