Wrong GROUP BY field in Django annotate query
The original problem is caused by a rather awkward circular model reference:
from django.db import models

# A -> B -> A
class A(models.Model):
    b = models.ForeignKey('B', null=True, blank=True)

class B(models.Model):
    a = models.ForeignKey('A')
Now, when I try to annotate a query, it always groups by the id of the `a` table that comes from the LEFT OUTER JOIN (T3.id in the example below) instead of a.id.
Example:
A.objects.select_related('b', 'b__a').annotate(reviews=Count('reviews'))
Generated SQL:
SELECT
    `a`.`id`,
    `b`.`id`,
    T3.`id`
FROM
    `a`
        LEFT OUTER JOIN
    `b` ON (`a`.`b_id` = `b`.`id`)
        LEFT OUTER JOIN
    `a` T3 ON (`b`.`a_id` = T3.`id`)
WHERE
    `a`.`id` IN (1, 2, 3, 4, 5)
GROUP BY T3.`id`
ORDER BY NULL;
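I know I can do the following:
- Change the models so there is no circular reference (unfortunately I can't do that right now)
- Use .extra() instead of annotate (I'd rather avoid that)
- Remove the .select_related() call (not an option because of performance issues)
UPD: Grouping by T3.id excludes results where a.b == None. For those rows T3.id is NULL, so they all collapse into a single group.
The best solution for me would be to specify the correct field in the GROUP BY clause, but I don't know how. Is it possible? Is there any other way to work around this? Thanks.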
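To illustrate the UPD note above, here is a minimal sketch of how grouping by the joined alias loses rows. It uses an in-memory SQLite database as a stand-in for MySQL, and the table contents are made up for the demonstration; only the table and column names follow the SQL in the question.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; the rows below are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (id INTEGER PRIMARY KEY, b_id INTEGER);
    CREATE TABLE b (id INTEGER PRIMARY KEY, a_id INTEGER);
    INSERT INTO b (id, a_id) VALUES (10, 1);
    -- a.1 points at a b row; a.2 and a.3 have no b (a.b is None).
    INSERT INTO a (id, b_id) VALUES (1, 10), (2, NULL), (3, NULL);
""")

# Same join shape as the generated SQL in the question.
JOINS = """
    FROM a
    LEFT OUTER JOIN b ON (a.b_id = b.id)
    LEFT OUTER JOIN a T3 ON (b.a_id = T3.id)
"""

# Grouping by the joined alias: rows with T3.id NULL fall into one group.
by_t3 = conn.execute("SELECT COUNT(*) " + JOINS + " GROUP BY T3.id").fetchall()
# Grouping by the main table's primary key: one group per row of a.
by_a = conn.execute("SELECT COUNT(*) " + JOINS + " GROUP BY a.id").fetchall()

print(len(by_t3))  # 2
print(len(by_a))   # 3
```

With three rows in `a`, grouping by T3.id returns only two groups, because the two rows without a `b` share the NULL key.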
I looked into Django's SQL compiler:
def collapse_group_by(self, expressions, having):
    # If the DB can group by primary key, then group by the primary key of
    # query's main model. Note that for PostgreSQL the GROUP BY clause must
    # include the primary key of every table, but for MySQL it is enough to
    # have the main table's primary key. Currently only the MySQL form is
    # implemented.
    # MySQLism: however, columns in HAVING clause must be added to the
    # GROUP BY.
    if self.connection.features.allows_group_by_pk:
        # The logic here is: if the main model's primary key is in the
        # query, then set new_expressions to that field. If that happens,
        # then also add having expressions to group by.
        pk = None
        for expr in expressions:
            if (expr.output_field.primary_key and
                    getattr(expr.output_field, 'model') == self.query.model):
                pk = expr
                # HERE A break STATEMENT IS REQUIRED
        if pk:
            expressions = [pk] + [expr for expr in expressions if expr in having]
    return expressions
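So collapse_group_by does not stop looking for the pk even after it has found one. Both a.id and T3.id are primary keys of model A, so the loop ends up keeping the last match. That is why the grouping is done by T3.id instead of a.id (and why I lose results).
To fix this, a break statement is needed in the for loop (at the place marked in the comment).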
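The effect of the missing break can be reproduced outside Django with a small sketch. `Field`, `Col` and `pick_pk` are hypothetical stand-ins that model only the attributes the loop inspects; this is not Django's code, and not necessarily how the ticket was eventually resolved.

```python
from collections import namedtuple

# Hypothetical stand-ins for Django's field and column expression objects.
Field = namedtuple("Field", "primary_key model")
Col = namedtuple("Col", "alias output_field")


def pick_pk(expressions, main_model, stop_at_first):
    """Mimic the pk search loop, optionally breaking on the first match."""
    pk = None
    for expr in expressions:
        if expr.output_field.primary_key and expr.output_field.model == main_model:
            pk = expr
            if stop_at_first:
                break
    return pk


a_pk = Field(primary_key=True, model="A")
b_pk = Field(primary_key=True, model="B")
# Same column order as the SELECT list in the question: a.id, b.id, T3.id.
expressions = [Col("a", a_pk), Col("b", b_pk), Col("T3", a_pk)]

last_match = pick_pk(expressions, "A", stop_at_first=False).alias
first_match = pick_pk(expressions, "A", stop_at_first=True).alias
print(last_match, first_match)  # T3 a
```

Without the break, the loop keeps overwriting `pk` and ends on the T3 column. With it, the search stops at the main table's own primary key.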
UPD: This issue was fixed in Django 1.8.2: https://code.djangoproject.com/ticket/24748