What is the most efficient way to iterate over Django objects and update them?

Question

So I have a queryset to update:

stories = Story.objects.filter(introtext="")
for story in stories:
    # Just set it to the first "sentence".
    story.introtext = story.content[:story.content.find('.')] + ".</p>"
    story.save()

The save() call completely kills performance. The process list shows multiple "./manage.py shell" entries, and yes, I ran this through the Django shell.

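In the past, however, I have run scripts that did not need to call save() because they were changing a many-to-many field, and those scripts were very fast. My project contains the following code, which may have something to do with why those scripts performed so well.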

@receiver(signals.m2m_changed, sender=Story.tags.through)
def save_story(sender, instance, action, reverse, model, pk_set, **kwargs):
    instance.save()

What is the best way to efficiently update a large queryset (10000+ objects)?

Answer 1

You can use the built-in update() method on the queryset.

Example:

MyModel.objects.all().update(color='red')

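In your case, the new value depends on the instance's own fields, so you need F() expressions (see the Django documentation on F() for more) to refer to them inside the update: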

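from django.db.models import F, Value
from django.db.models.functions import Concat, StrIndex, Substr

stories = Story.objects.filter(introtext__exact='')
# Text before the first '.' in content, then ".</p>" (assumes content contains a '.')
first_sentence = Substr(F('content'), 1, StrIndex(F('content'), Value('.')) - 1)
stories.update(introtext=Concat(first_sentence, Value('.</p>')))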
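To see what a queryset update() boils down to, here is a rough sketch using Python's standard sqlite3 module instead of Django. The table name, columns, and sample rows are made up for illustration (the real Story schema is not shown in the question), and the SQL is SQLite-specific. The point is only that a single UPDATE statement does the work inside the database, with no per-row save().

```python
import sqlite3

# Hypothetical stand-in for the Story table; the real schema is not shown.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE story (id INTEGER PRIMARY KEY, content TEXT, introtext TEXT)"
)
conn.executemany(
    "INSERT INTO story (content, introtext) VALUES (?, ?)",
    [
        ("<p>First sentence. Second sentence.</p>", ""),
        ("<p>Another story. With more text.</p>", ""),
        ("<p>Already done. Nothing to do.</p>", "<p>Already done.</p>"),
    ],
)

# One statement updates every matching row inside the database,
# so no rows are loaded into Python and nothing is saved one by one.
cursor = conn.execute(
    """
    UPDATE story
    SET introtext = substr(content, 1, instr(content, '.') - 1) || '.</p>'
    WHERE introtext = ''
    """
)
conn.commit()
updated_rows = cursor.rowcount

introtexts = [
    row[0] for row in conn.execute("SELECT introtext FROM story ORDER BY id")
]
```

Rows whose content has no period are not handled here; that case would need its own rule before running this against real data.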

Answer 2

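Since the new introtext value depends on each object's content field, you can't do a bulk update. But you can speed up saving the objects one by one by wrapping the loop in a transaction: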

from django.db import transaction

with transaction.atomic():
    stories = Story.objects.filter(introtext='')
    for story in stories:
        introtext = story.content[0:(story.content.find('.'))] + ".</p>" 
        Story.objects.filter(pk=story.pk).update(introtext=introtext)

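Wrapping the loop in transaction.atomic() can speed it up by an order of magnitude, since in Django's default autocommit mode each update would otherwise be committed on its own.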

The filter(pk=story.pk).update() trick avoids the pre_save/post_save signals that a plain save() would send. This is the officially recommended method for updating a single field of an object.
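Another option along the same lines, assuming Django 2.2 or later, is QuerySet.bulk_update(), which writes many objects per query and also skips save() and its signals. The sketch below is not from the answer above: first_sentence_intro and fill_introtexts are hypothetical helper names, and the model class is passed in as a parameter so the snippet does not depend on where Story is defined.

```python
def first_sentence_intro(content):
    """Return the text up to the first '.', closed with '.</p>'.

    Hypothetical helper: unlike the bare slice, it keeps the whole text
    when there is no period instead of dropping the last character.
    """
    end = content.find(".")
    if end == -1:
        end = len(content)
    return content[:end] + ".</p>"


def fill_introtexts(story_model, batch_size=500):
    """Fill empty introtext fields using QuerySet.bulk_update (Django 2.2+).

    story_model is expected to be the Story model class from the question.
    """
    stories = list(story_model.objects.filter(introtext=""))
    for story in stories:
        story.introtext = first_sentence_intro(story.content)
    # Writes in batches of batch_size objects per query; save() is not
    # called, so pre_save/post_save signals are not sent.
    story_model.objects.bulk_update(stories, ["introtext"], batch_size=batch_size)
    return len(stories)
```

In a Django shell this would be called as fill_introtexts(Story). The batch size of 500 is an arbitrary starting point, not a tuned value.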

python

django

django-database