Using prefetch_related and aggregations to avoid the n+1 issue with Django database queries for a model with time series data

Question

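I'm trying to avoid making a large number of database queries in my Django app. In the app I'm monitoring a number of suggestions (model: Suggestion) that can be voted on (model: Vote).

The Vote model does not store each individual vote. Instead, the running total of votes for a suggestion is stored at regular intervals. A suggestion for "Better ice cream" could have "8:10: 10 votes", "8:20: 12 votes", "8:30: 25 votes", etc.

I've created a very inefficient loop with some major n+1 issues to calculate the number of new votes per day per suggestion.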
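To make the calculation concrete, here is a minimal plain-Python sketch of "new votes per day" for one suggestion, assuming the daily figure is the highest running total minus the lowest running total recorded that day (the same Max minus Min idea used in the view further down). The helper name `new_votes_per_day`, the dates, and the second day's totals are made up for illustration; only the 10, 12 and 25 totals come from the example above.

```python
from collections import defaultdict
from datetime import datetime


def new_votes_per_day(snapshots):
    """Return {day: new_votes} from (timestamp, running_total) pairs.

    Hypothetical helper: "new votes" on a day is taken to be the highest
    running total minus the lowest running total recorded on that day.
    """
    totals_by_day = defaultdict(list)
    for timestamp, total in snapshots:
        totals_by_day[timestamp.date()].append(total)
    return {
        day: max(totals) - min(totals)
        for day, totals in sorted(totals_by_day.items())
    }


# Illustrative snapshots for a single suggestion.
snapshots = [
    (datetime(2021, 10, 19, 8, 10), 10),
    (datetime(2021, 10, 19, 8, 20), 12),
    (datetime(2021, 10, 19, 8, 30), 25),
    (datetime(2021, 10, 20, 8, 10), 30),
    (datetime(2021, 10, 20, 8, 20), 41),
]
daily = new_votes_per_day(snapshots)
```

With this definition, votes cast between the last snapshot of one day and the first snapshot of the next are not attributed to either day.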

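I'm looking for a more efficient (probably single) queryset than the current one that achieves the same result. I know I should probably annotate the suggestions in views.py with the dates of their votes, and then annotate that with an aggregate function that calculates the number of votes on each day, but I can't figure out how to actually chain this together.

Here's my current working but very inefficient code: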

models.py:

from django.db import models


class Suggestion(models.Model):
    unique_id = models.CharField(max_length=10, unique=True)
    title = models.CharField(max_length=500)
    suggested_date = models.DateField()

class Vote(models.Model):
    suggestion = models.ForeignKey('Suggestion', on_delete=models.CASCADE)
    timestamp = models.DateTimeField()
    votes = models.IntegerField()

views.py:

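from django.db.models import Max, Min
from django.shortcuts import render

from .models import Suggestion


def index(request):
    context = {}  # not defined in the original snippet; assumed to start empty
    # Process votes per day per suggestion
    suggestions = Suggestion.objects.prefetch_related('vote_set')
    votes_per_day_per_suggestion = {}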
    for suggestion in suggestions:
        votes_per_day_per_suggestion[suggestion.title] = {}
        votes = suggestion.vote_set
        suggestion_dates = votes.dates('timestamp', 'day') # n+1 issue
        for date in suggestion_dates:
            date_min_max = votes.filter(timestamp__date=date).aggregate(votes_on_date=(Max('votes') - Min('votes'))) # n+1 issue
            votes_per_day_per_suggestion[suggestion.title][date] = date_min_max['votes_on_date']
    context['votes_per_day_per_suggestion'] = votes_per_day_per_suggestion
    return render(request, 'borgerforslag/index.html', context)

Template output:

Better toilet paper (number of votes per day):
19. october 2021: 23
20. october 2021: 19
21. october 2021: 18
22. october 2021: 9
23. october 2021: 25
24. october 2021: 34
25. october 2021: 216

Answer 1

You only need values(), annotate() and order_by() to get the number of votes per day per suggestion. This should work:

from django.db.models import Count

Vote.objects.all() \
    .values('timestamp__date', 'suggestion') \
    .annotate(num_votes=Count('votes')) \
    .order_by('timestamp__date')

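However, your example output isn't the number of votes per day per suggestion; it looks like the number of votes per day. That can be achieved by dropping suggestion from the query, like this: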

Vote.objects.all() \
    .values('timestamp__date') \
    .annotate(num_votes=Count('votes')) \
    .order_by('timestamp__date')
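One caveat: because each Vote row stores a running total rather than a single vote, Count('votes') counts how many snapshots were recorded per day, not how many new votes arrived. A possible adaptation, sketched below under assumptions, keeps the same values()/annotate() shape but uses Max minus Min as in the question. The query appears only as a comment because it needs a configured Django project; the runnable part shows how the flat rows could be nested into the dictionary the question builds. The helper `nest_rows`, the key names `day` and `new_votes`, and the ice cream figures are hypothetical; the toilet paper figures are taken from the question's template output.

```python
# Assumed query (not executed here; it needs a configured Django project):
#
#   from django.db.models import Max, Min
#   from django.db.models.functions import TruncDate
#
#   rows = (
#       Vote.objects
#       .values('suggestion__title', day=TruncDate('timestamp'))
#       .annotate(new_votes=Max('votes') - Min('votes'))
#       .order_by('suggestion__title', 'day')
#   )
#
# Each row would then be a dict shaped like the hand-written samples below.
from datetime import date


def nest_rows(rows):
    """Turn flat rows into {title: {day: new_votes}} (hypothetical helper)."""
    nested = {}
    for row in rows:
        per_day = nested.setdefault(row['suggestion__title'], {})
        per_day[row['day']] = row['new_votes']
    return nested


# Hand-written sample rows standing in for the queryset result.
sample_rows = [
    {'suggestion__title': 'Better ice cream', 'day': date(2021, 10, 19), 'new_votes': 15},
    {'suggestion__title': 'Better toilet paper', 'day': date(2021, 10, 19), 'new_votes': 23},
    {'suggestion__title': 'Better toilet paper', 'day': date(2021, 10, 20), 'new_votes': 19},
]
votes_per_day_per_suggestion = nest_rows(sample_rows)
```

Grouping in the database this way would replace the per-suggestion and per-date queries with a single one, and the nesting loop runs in memory.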

Answer 2

The following should give you every suggestion, date and vote count in a single annotated queryset:

from django.db.models import Max, Min
from django.db.models.functions import TruncDate


def index(request):
    suggestions = Suggestion.objects.annotate(
        date=TruncDate('vote__timestamp')
    ).order_by(
        'id', 'date'
    ).annotate(
        sum=Max('vote__votes') - Min('vote__votes')
    )
    return render(request, 'borgerforslag/index.html', {'suggestions': suggestions})

Then use regroup in the template to group all the results by suggestion:

{% regroup suggestions by title as suggestions_grouped %}

<ul>
{% for suggestion in suggestions_grouped %}
    <li>{{ suggestion.grouper }}
    <ul>
        {% for date in suggestion.list %}
          <li>{{ date.date }}: {{ date.sum }}</li>
        {% endfor %}
    </ul>
    </li>
{% endfor %}
</ul>
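A note on why the ordering in the queryset matters: {% regroup %} does not sort its input, it only groups consecutive items, much like itertools.groupby. The plain-Python sketch below illustrates that behaviour with hand-written rows. The helper `regroup_by_title`, the titles 'A' and 'B', and the figure 4 are made up for illustration.

```python
from itertools import groupby


def regroup_by_title(rows):
    """Group consecutive rows by 'title', returning (grouper, items) pairs."""
    return [
        (title, list(items))
        for title, items in groupby(rows, key=lambda row: row['title'])
    ]


# Rows ordered so that each suggestion's dates sit next to each other.
ordered = [
    {'title': 'A', 'date': '2021-10-19', 'sum': 23},
    {'title': 'A', 'date': '2021-10-20', 'sum': 19},
    {'title': 'B', 'date': '2021-10-19', 'sum': 4},
]
groups = regroup_by_title(ordered)

# The same rows interleaved: 'A' is split into two separate groups.
interleaved = [ordered[0], ordered[2], ordered[1]]
split_groups = regroup_by_title(interleaved)
```

Ordering by 'id' first, as the view does, keeps each suggestion's rows adjacent, so each suggestion ends up as one group as long as no other suggestion with the same title falls next to it.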
