Django ORM mixes up related objects with prefetch_related

I've run into very strange behavior with the prefetch_related call. Here is an illustration:

# First, define two sketch models to make the discussion easier.

from django.db import models

class Secondary(models.Model):
    pass

class Primary(models.Model):
    secondaries = models.ManyToManyField(Secondary)

# Just to make clear, EVERY Primary object in my system has at least one
# related Secondary object.

# Now prepare a query.

primaries = Primary.objects.filter(...)\
                           .order_by(...)\
                           .prefetch_related('secondaries')

# Iterating:

for primary in primaries:
    if not primary.secondaries.all():
        # So we have found an object that is said to not have
        # any relatives.  Re-query this particular object.
        # This part is hit in my code, although it should not.
        primary = Primary.objects.get(pk=primary.pk)
    for secondary in primary.secondaries.all():
        # Voila, there are relatives!
        # This part was not hit for some objects until I added
        # the re-query part above.
        pass

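To be clear, there are no Primary objects in my system without related Secondary objects, yet the code above still hits the re-query branch for some objects (always the same ones), and the re-query then fetches the missing secondaries. Stranger still, I can see that some Primaries get duplicate Secondary objects in their secondaries.all(). The overall impression is that the ORM attaches some sets of Secondaries to the wrong Primaries.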
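One way to pin down exactly which objects are affected might be to compare, per Primary, the secondary ids seen through the prefetched queryset with the ids returned by a fresh query. The sketch below is plain Python and uses made-up ids. The helper name `diff_related` is hypothetical, and in a real project the two dictionaries would be built from `primary.secondaries.all()` before and after the re-query.

```python
from collections import Counter


def diff_related(prefetched, requeried):
    """Compare related pks per primary pk from two sources.

    Both arguments map a primary pk to a list of secondary pks.
    Only primaries whose lists disagree, or whose prefetched list
    contains duplicates, appear in the returned report.
    """
    report = {}
    for pk in sorted(set(prefetched) | set(requeried)):
        got = Counter(prefetched.get(pk, []))
        want = set(requeried.get(pk, []))
        entry = {
            "missing": sorted(want - set(got)),      # only in the re-query
            "extra": sorted(set(got) - want),        # only in the prefetch
            "duplicates": sorted(k for k, n in got.items() if n > 1),
        }
        if any(entry.values()):
            report[pk] = entry
    return report


# Made-up ids, for illustration only.
prefetched = {1: [10, 11], 2: [], 3: [12, 12, 13]}
requeried = {1: [10, 11], 2: [14], 3: [12]}
report = diff_related(prefetched, requeried)
```

A report like this would show whether the "missing" secondaries of one Primary turn up as "extra" ones on another, which is what I would expect if sets are being attached to the wrong objects.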

What could be wrong? Is this a bug in Django or in the database?

I'm using Django 1.10.5, psycopg2 2.7.3, and Postgres 9.6.

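Update: I found that the problem is more serious: sometimes the ORM returns an incomplete list of related objects, so the workaround I described above does not help. We had to remove the prefetch_related call, because apparently we cannot rely on the data it returns.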

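Update 2: As Daniel asked in the comments, here are some real SQL queries (though not from the system where we hit the problem). backend_build is the "primary" model, and there are several "secondary" models: backend_buildproblem, backend_sanityproblem, and backend_runproblem. We use django_polymorphic for them, and their base model is backend_problem.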

The Python code looks like this:

from datetime import timedelta

from django.utils import timezone

builds = Build.objects.filter(
    branch__active=True,
    type__active=True,
    finish_timestamp__gt=timezone.now() - timedelta(days=10))\
    .order_by('-finish_timestamp')\
    .prefetch_related('problems')

for build in builds:
    for problem in build.problems.all():
        print(problem.id)  # just stub code to consume the query results

Below are the resulting SQL queries:

SELECT "backend_build"."teamcity_id", "backend_build"."status", "backend_build"."finish_timestamp", "backend_build"."type_id", "backend_build"."branch_id", "backend_build"."revision"
  FROM "backend_build"
  INNER JOIN "backend_buildtype" ON ("backend_build"."type_id" = "backend_buildtype"."id")
  INNER JOIN "backend_branch" ON ("backend_build"."branch_id" = "backend_branch"."id")
  WHERE ("backend_build"."finish_timestamp" > '2017-08-18T06:35:21.322000+00:00'::timestamptz AND "backend_buildtype"."active" = true AND "backend_branch"."active" = true)
  ORDER BY "backend_build"."finish_timestamp" DESC

SELECT "backend_problem"."id", "backend_problem"."polymorphic_ctype_id", "backend_problem"."generic_type", "backend_problem"."startrack_id", "backend_problem"."useful", "backend_problem"."status", "backend_problem"."summary"
  FROM "backend_problem"
  INNER JOIN "backend_build_problems" ON ("backend_problem"."id" = "backend_build_problems"."problem_id")
  WHERE "backend_build_problems"."build_id" = 18984809

SELECT "backend_problem"."id", "backend_problem"."polymorphic_ctype_id", "backend_problem"."generic_type", "backend_problem"."startrack_id", "backend_problem"."useful", "backend_problem"."status", "backend_problem"."summary", "backend_sanityproblem"."problem_ptr_id", "backend_sanityproblem"."code", "backend_sanityproblem"."latest_occurred"
  FROM "backend_sanityproblem"
  INNER JOIN "backend_problem" ON ("backend_sanityproblem"."problem_ptr_id" = "backend_problem"."id")
  WHERE "backend_sanityproblem"."problem_ptr_id" IN (9251, 9252, 9253, 9254, 9255, 9256, 9257, 9259, 9261, 9262, 9263, 9264, 9268, 9269, 9270, 9271, 9272, 9273, 9274, 9275, 9276, 9277, 9280, 9283, 9285, 9287, 9290, 9293, 9294, 9295, 9297, 9302, 9303, 9304, 9306, 9307, 9309, 9312, 9313, 9314, 9316, 9317, 9319, 9321, 9322, 9062, 9063, 9066, 9068, 9092, 9107, 9109, 9112, 9648, 9649, 9650, 9651, 9652, 9653, 9654, 9655, 9656, 9657, 9658, 9659, 9660, 9661, 9662, 9663, 9664, 9665, 9666, 9667, 9668, 9669, 9670, 9671, 9672, 9673, 9674, 9675, 9676, 9677, 9678, 9679, 9680, 9681, 9682, 9683, 9684, 9685, 9686, 9687, 9688, 9689, 9690, 9691, 9692, 9693, 9694)

SELECT "backend_problem"."id", "backend_problem"."polymorphic_ctype_id", "backend_problem"."generic_type", "backend_problem"."startrack_id", "backend_problem"."useful", "backend_problem"."status", "backend_problem"."summary", "backend_sanityproblem"."problem_ptr_id", "backend_sanityproblem"."code", "backend_sanityproblem"."latest_occurred"
  FROM "backend_sanityproblem"
  INNER JOIN "backend_problem" ON ("backend_sanityproblem"."problem_ptr_id" = "backend_problem"."id")
  WHERE "backend_sanityproblem"."problem_ptr_id" IN (9344, 9345, 9488, 9489, 9508, 9509, 9510, 9511, 9512, 9513, 9399, 9401, 9402, 9403, 9426, 9436, 9572, 9573, 9574, 9575, 9330, 9337, 9338, 9339, 9340, 9341, 9342)

SELECT "backend_problem"."id", "backend_problem"."polymorphic_ctype_id", "backend_problem"."generic_type", "backend_problem"."startrack_id", "backend_problem"."useful", "backend_problem"."status", "backend_problem"."summary"
  FROM "backend_problem"
  INNER JOIN "backend_build_problems" ON ("backend_problem"."id" = "backend_build_problems"."problem_id")
  WHERE "backend_build_problems"."build_id" = 18944441

SELECT "backend_problem"."id", "backend_problem"."polymorphic_ctype_id", "backend_problem"."generic_type", "backend_problem"."startrack_id", "backend_problem"."useful", "backend_problem"."status", "backend_problem"."summary", "backend_buildproblem"."problem_ptr_id", "backend_buildproblem"."stage"
  FROM "backend_buildproblem"
  INNER JOIN "backend_problem" ON ("backend_buildproblem"."problem_ptr_id" = "backend_problem"."id")
  WHERE "backend_buildproblem"."problem_ptr_id" IN (9600)

SELECT "backend_problem"."id", "backend_problem"."polymorphic_ctype_id", "backend_problem"."generic_type", "backend_problem"."startrack_id", "backend_problem"."useful", "backend_problem"."status", "backend_problem"."summary"
  FROM "backend_problem"
  INNER JOIN "backend_build_problems" ON ("backend_problem"."id" = "backend_build_problems"."problem_id")
  WHERE "backend_build_problems"."build_id" = 18944330

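There are many more similar queries, omitted here. As the queries above show, the system first queries the primary model, then requests the related objects for each primary object, taking their polymorphic types into account.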
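For reference, the Django documentation describes prefetch_related as doing a separate lookup for each relationship and then "joining" the results in Python. The following is only a rough pure-Python sketch of that joining step, not Django's actual implementation. The function name `attach_prefetched` and all ids are made up for illustration.

```python
from collections import defaultdict


def attach_prefetched(build_ids, through_rows):
    """Group (build_id, problem_id) rows by build id.

    build_ids is the ordered list of primary keys from the first query.
    through_rows stands in for the rows of the second, batched query.
    Builds without any matching row get an empty list.
    """
    by_build = defaultdict(list)
    for build_id, problem_id in through_rows:
        by_build[build_id].append(problem_id)
    # Preserve the ordering of the primary query.
    return {build_id: by_build.get(build_id, []) for build_id in build_ids}


# Made-up ids, for illustration only.
rows = [(1, 100), (2, 101), (1, 102)]
grouped = attach_prefetched([2, 1, 3], rows)
```

If the key used for grouping were wrong or ambiguous at this step, related objects would end up on the wrong primary object or be dropped, which matches the symptoms described in the question.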

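I suspect your problem arises because you have multiple secondary models that share the same base model. There may be an internal cache that gets overwritten by each query. Try restricting the prefetch_related statement to the problems model: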

.prefetch_related('problems')

Or could it be related to this issue with django-polymorphic?