Django ORM messes up related objects with prefetch_related
I ran into very strange behavior of the prefetch_related call. Here is an illustration:
from django.db import models

# First define two sketch models, just for convenience of the further talk.
class Secondary(models.Model):
    pass

class Primary(models.Model):
    secondaries = models.ManyToManyField(Secondary)

# Just to make clear, EVERY Primary object in my system has at least one
# related Secondary object.

# Now prepare a query. The prefetch lookup must name the field, 'secondaries'.
primaries = Primary.objects.filter(...)\
    .order_by(...)\
    .prefetch_related('secondaries')

# Iterating:
for primary in primaries:
    if not primary.secondaries.all():
        # So we have found an object that is said to not have
        # any relatives. Re-query this particular object.
        # This part is hit in my code, although it should not.
        primary = Primary.objects.get(pk=primary.pk)
    for secondary in primary.secondaries.all():
        # Voila, there are relatives!
        # This part was not hit for some objects until I added
        # the re-query part above.
        pass
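To be clear, there are no Primary objects in my system without related Secondary objects, yet the code above still hits the re-query branch for some objects (always the same ones), and the re-query then retrieves the missing secondaries. Even stranger, I can see that some Primaries get duplicate Secondaries in their secondaries.all(). The overall impression is that the ORM mistakenly attaches some sets of Secondaries to the wrong Primaries.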
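For context, prefetch_related runs a separate query for the relation and does the joining in Python. The snippet below is a minimal pure-Python sketch of that grouping step, not Django's actual implementation; the row tuples and the function name attach_prefetched are made up for illustration.

```python
from collections import defaultdict


def attach_prefetched(primary_ids, through_rows):
    """Group (primary_id, secondary_id) rows by primary id.

    Mimics, in spirit only, the Python-side join that follows a prefetch query.
    """
    grouped = defaultdict(list)
    for primary_id, secondary_id in through_rows:
        grouped[primary_id].append(secondary_id)
    # Every primary gets an entry, even if no fetched row carried its key.
    return {pk: grouped.get(pk, []) for pk in primary_ids}


# Hypothetical rows, as they might come back from the many-to-many through table.
rows = [(1, 10), (1, 11), (2, 12)]
result = attach_prefetched([1, 2, 3], rows)
print(result)
```

In this sketch a primary ends up with an empty list whenever no fetched row carries its key, which matches the symptom described, though it does not explain the cause.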
What is going wrong? Is it a bug in Django or in the database?
I use Django 1.10.5, psycopg2 2.7.3 and Postgres 9.6.
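UPDATE: I found that the problem is even worse: sometimes the ORM returns an incomplete list of related objects, so the workaround I described above does not help. We had to remove the prefetch_related call, because apparently we cannot rely on the data it returns.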
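One way to pin down which objects are affected is to compare what the prefetch attached against a fresh per-object query. The helper below is only a sketch under assumptions: find_prefetch_mismatches and its arguments are hypothetical names, and the data is a stand-in rather than output from the real system.

```python
def find_prefetch_mismatches(prefetched, fetch_fresh):
    """Return {pk: (prefetched_ids, fresh_ids)} for every pk whose lists differ.

    prefetched:  dict mapping primary pk -> iterable of related ids from the prefetch
    fetch_fresh: callable taking a pk and returning related ids from a fresh query
    """
    mismatches = {}
    for pk, related_ids in prefetched.items():
        got = sorted(related_ids)
        expected = sorted(fetch_fresh(pk))
        # Sorted lists (not sets) so that duplicates also count as a mismatch.
        if got != expected:
            mismatches[pk] = (got, expected)
    return mismatches


# Stand-in data. With Django, fetch_fresh could wrap something like
# Primary.objects.get(pk=pk).secondaries.values_list('pk', flat=True).
fresh = {1: [10, 11], 2: [12], 3: [13]}
prefetched = {1: [10, 11], 2: [12, 12], 3: []}
report = find_prefetch_mismatches(prefetched, fresh.__getitem__)
print(report)
```

Here pk 2 is reported because of the duplicate and pk 3 because its list is incomplete, the two kinds of discrepancy mentioned above.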
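UPDATE 2: As Daniel asked in the comments, here are some real SQL queries (though not from the system where we hit the problem). backend_build is the "primary" model, and there are several "secondary" models: backend_buildproblem, backend_sanityproblem and backend_runproblem. We use django_polymorphic for them, and the base model is backend_problem.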
The Python code looks like this:
from datetime import timedelta
from django.utils import timezone

builds = Build.objects.filter(
        branch__active=True,
        type__active=True,
        finish_timestamp__gt=timezone.now() - timedelta(days=10))\
    .order_by('-finish_timestamp')\
    .prefetch_related('problems')
for build in builds:
    for problem in build.problems.all():
        print(problem.id)  # just stub code to use the results of the query.
Here are the resulting SQL queries:
SELECT "backend_build"."teamcity_id", "backend_build"."status", "backend_build"."finish_timestamp", "backend_build"."type_id", "backend_build"."branch_id", "backend_build"."revision"
FROM "backend_build"
INNER JOIN "backend_buildtype" ON ("backend_build"."type_id" = "backend_buildtype"."id")
INNER JOIN "backend_branch" ON ("backend_build"."branch_id" = "backend_branch"."id")
WHERE ("backend_build"."finish_timestamp" > '2017-08-18T06:35:21.322000+00:00'::timestamptz AND "backend_buildtype"."active" = true AND "backend_branch"."active" = true)
ORDER BY "backend_build"."finish_timestamp" DESC
SELECT "backend_problem"."id", "backend_problem"."polymorphic_ctype_id", "backend_problem"."generic_type", "backend_problem"."startrack_id", "backend_problem"."useful", "backend_problem"."status", "backend_problem"."summary"
FROM "backend_problem"
INNER JOIN "backend_build_problems" ON ("backend_problem"."id" = "backend_build_problems"."problem_id")
WHERE "backend_build_problems"."build_id" = 18984809
SELECT "backend_problem"."id", "backend_problem"."polymorphic_ctype_id", "backend_problem"."generic_type", "backend_problem"."startrack_id", "backend_problem"."useful", "backend_problem"."status", "backend_problem"."summary", "backend_sanityproblem"."problem_ptr_id", "backend_sanityproblem"."code", "backend_sanityproblem"."latest_occurred"
FROM "backend_sanityproblem"
INNER JOIN "backend_problem" ON ("backend_sanityproblem"."problem_ptr_id" = "backend_problem"."id")
WHERE "backend_sanityproblem"."problem_ptr_id" IN (9251, 9252, 9253, 9254, 9255, 9256, 9257, 9259, 9261, 9262, 9263, 9264, 9268, 9269, 9270, 9271, 9272, 9273, 9274, 9275, 9276, 9277, 9280, 9283, 9285, 9287, 9290, 9293, 9294, 9295, 9297, 9302, 9303, 9304, 9306, 9307, 9309, 9312, 9313, 9314, 9316, 9317, 9319, 9321, 9322, 9062, 9063, 9066, 9068, 9092, 9107, 9109, 9112, 9648, 9649, 9650, 9651, 9652, 9653, 9654, 9655, 9656, 9657, 9658, 9659, 9660, 9661, 9662, 9663, 9664, 9665, 9666, 9667, 9668, 9669, 9670, 9671, 9672, 9673, 9674, 9675, 9676, 9677, 9678, 9679, 9680, 9681, 9682, 9683, 9684, 9685, 9686, 9687, 9688, 9689, 9690, 9691, 9692, 9693, 9694)
SELECT "backend_problem"."id", "backend_problem"."polymorphic_ctype_id", "backend_problem"."generic_type", "backend_problem"."startrack_id", "backend_problem"."useful", "backend_problem"."status", "backend_problem"."summary", "backend_sanityproblem"."problem_ptr_id", "backend_sanityproblem"."code", "backend_sanityproblem"."latest_occurred"
FROM "backend_sanityproblem"
INNER JOIN "backend_problem" ON ("backend_sanityproblem"."problem_ptr_id" = "backend_problem"."id")
WHERE "backend_sanityproblem"."problem_ptr_id" IN (9344, 9345, 9488, 9489, 9508, 9509, 9510, 9511, 9512, 9513, 9399, 9401, 9402, 9403, 9426, 9436, 9572, 9573, 9574, 9575, 9330, 9337, 9338, 9339, 9340, 9341, 9342)
SELECT "backend_problem"."id", "backend_problem"."polymorphic_ctype_id", "backend_problem"."generic_type", "backend_problem"."startrack_id", "backend_problem"."useful", "backend_problem"."status", "backend_problem"."summary"
FROM "backend_problem"
INNER JOIN "backend_build_problems" ON ("backend_problem"."id" = "backend_build_problems"."problem_id")
WHERE "backend_build_problems"."build_id" = 18944441
SELECT "backend_problem"."id", "backend_problem"."polymorphic_ctype_id", "backend_problem"."generic_type", "backend_problem"."startrack_id", "backend_problem"."useful", "backend_problem"."status", "backend_problem"."summary", "backend_buildproblem"."problem_ptr_id", "backend_buildproblem"."stage"
FROM "backend_buildproblem"
INNER JOIN "backend_problem" ON ("backend_buildproblem"."problem_ptr_id" = "backend_problem"."id")
WHERE "backend_buildproblem"."problem_ptr_id" IN (9600)
SELECT "backend_problem"."id", "backend_problem"."polymorphic_ctype_id", "backend_problem"."generic_type", "backend_problem"."startrack_id", "backend_problem"."useful", "backend_problem"."status", "backend_problem"."summary"
FROM "backend_problem"
INNER JOIN "backend_build_problems" ON ("backend_problem"."id" = "backend_build_problems"."problem_id")
WHERE "backend_build_problems"."build_id" = 18944330
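There are many more similar queries, omitted here. As is clear from the above, the system first queries the primary model, then requests the relations of each primary object, taking their polymorphic types into account.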
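To check how the relation is being fetched across a long log, the captured statements can be tallied by shape. This is a rough sketch: the regexes are tailored to the table names in the log above, the "batched" pattern is only a guess at what a single IN (...) lookup on the through table might look like, and the sample statements are abbreviated with "..." rather than copied in full.

```python
import re
from collections import Counter

# Through-table filter on a single build id, as seen in the log above.
PER_BUILD = re.compile(r'"backend_build_problems"\."build_id" = (\d+)')
# Assumed shape of a batched lookup over many build ids at once.
BATCHED = re.compile(r'"backend_build_problems"\."build_id" IN \(')


def classify_queries(sql_statements):
    """Count statements that fetch problems per build, in a batch, or neither."""
    counts = Counter()
    for sql in sql_statements:
        if PER_BUILD.search(sql):
            counts["per_build"] += 1
        elif BATCHED.search(sql):
            counts["batched"] += 1
        else:
            counts["other"] += 1
    return counts


# Abbreviated stand-ins for the logged statements. With Django and DEBUG=True,
# the real ones could be read from django.db.connection.queries.
sample = [
    'SELECT ... FROM "backend_build" ORDER BY "backend_build"."finish_timestamp" DESC',
    'SELECT ... WHERE "backend_build_problems"."build_id" = 18984809',
    'SELECT ... WHERE "backend_sanityproblem"."problem_ptr_id" IN (9251, 9252)',
    'SELECT ... WHERE "backend_build_problems"."build_id" = 18944441',
]
counts = classify_queries(sample)
print(dict(counts))
```

A high per_build count relative to the number of builds would confirm the one-query-per-primary pattern described above.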
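I suspect your problem arises because you have multiple secondary models that share the same base model. There may be an internal cache that is overwritten by each query. Try limiting the prefetch_related statement to the problems model:
    .prefetch_related('problems')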