Django custom queryset class slicing not working
We are using a custom manager (pas1_objects). After applying some filters, we get the queryset below.
all_users_queryset = (
    User.pas1_objects.select_related("userlogindetails", "usermeta", "tenant")
    .filter(data_filter)
    .exclude(
        userrolepermissions__role_id__in=["PG", "PD"], tenant_id__in=tenant_ids
    )
    .order_by(*sort_array)
)
The output of the above query is:
<KafkaPushQuerySet [<User: User object (bbc306a1-3a75-4e5e-a9b4-6dce0b31b4b3)>, <User: User object (8830ecc1-944a-43e3-a701-331122a26c2b)>, <User: User object (25255aa5-31f9-45d4-9158-5225fba24cb3)>, <User: User object (4a24f883-210b-43a6-9b57-f51fac624a09)>, <User: User object (b6b044c6-cf9b-46f9-91d9-3a31cd4f6d30)>]>
Slicing this queryset gives inconsistent results.
(Pdb) all_users_queryset[0:1] # first slice is giving correct output
<KafkaPushQuerySet [<User: User object (bbc306a1-3a75-4e5e-a9b4-6dce0b31b4b3)>]>
(Pdb) all_users_queryset[1:2] # this should return the 2nd object in the queryset but it returns the 1st object
<KafkaPushQuerySet [<User: User object (bbc306a1-3a75-4e5e-a9b4-6dce0b31b4b3)>]>
(Pdb) all_users_queryset[2:3] # this works fine
<KafkaPushQuerySet [<User: User object (25255aa5-31f9-45d4-9158-5225fba24cb3)>]>
(Pdb) all_users_queryset[0:2] # this is returning 1st two objects but in reversed order.
<KafkaPushQuerySet [<User: User object (8830ecc1-944a-43e3-a701-331122a26c2b)>, <User: User object (bbc306a1-3a75-4e5e-a9b4-6dce0b31b4b3)>]>
(Pdb) all_users_queryset[0:3] # this is returning the correct objects and in the correct order.
<KafkaPushQuerySet [<User: User object (bbc306a1-3a75-4e5e-a9b4-6dce0b31b4b3)>, <User: User object (8830ecc1-944a-43e3-a701-331122a26c2b)>, <User: User object (25255aa5-31f9-45d4-9158-5225fba24cb3)>]>
What is causing this inconsistency?
Context on pas1_objects: it is a custom manager created by subclassing models.Manager. We override get_queryset so that it returns the custom queryset below:
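from django.db import models


class KafkaPushQuerySet(models.query.QuerySet):
    """
    Overriding queryset function to push to kafka
    """

    def delete(self):
        # Publish first, while the rows to be deleted still exist.
        self.produce_messages_to_kafka(action="delete")
        # Return Django's (count, per-model dict) result to the caller.
        return super().delete()

    def update(self, **kwargs):
        # Keep the number of matched rows so callers still receive it.
        rows = super().update(**kwargs)
        self.produce_messages_to_kafka(action="update")
        return rows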
Thanks!!!
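Your queryset is lazy: it has not been evaluated before you slice it. Slicing an unevaluated queryset returns a new queryset, and each of those runs its own query (with its own LIMIT and OFFSET) when pdb prints it. In other words, every slice operation hits the database (see the Django documentation on this point).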
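To make the "one query per slice" point concrete, here is a minimal sketch that uses the standard-library sqlite3 module in place of the Django ORM. The users table, its rows, and the fetch_slice helper are hypothetical; the only thing it is meant to show is that a slice such as qs[1:2] corresponds to its own SELECT with LIMIT 1 OFFSET 1, independent of any earlier slice.

```python
import sqlite3

# Hypothetical stand-in for the User table; only the SQL shape matters here.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO users (id, name) VALUES (?, ?)",
    [(1, "ann"), (2, "bob"), (3, "cid"), (4, "dee"), (5, "eve")],
)


def fetch_slice(start, stop):
    """Run one independent query, roughly what qs[start:stop] does when evaluated."""
    rows = conn.execute(
        "SELECT id FROM users ORDER BY id LIMIT ? OFFSET ?",
        (stop - start, start),
    ).fetchall()
    return [row[0] for row in rows]


first = fetch_slice(0, 1)   # separate query: LIMIT 1 OFFSET 0
second = fetch_slice(1, 2)  # separate query: LIMIT 1 OFFSET 1
print(first, second)
```

Because each call is a fresh query, the two results only line up with each other if the database returns rows in the same order both times.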
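So the question becomes: why does your order_by give a different order on each evaluation? Depending on your User table and the contents of your sort_array variable, there are a few possible (non-exclusive) explanations:
- the database was updated between two evaluations of the queryset,
- the fields in *sort_array are not specific enough to define a fixed, unique ordering. From the documentation: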
A particular ordering is guaranteed only when ordering by a set of fields that uniquely identify each object in the results. For example, if a name field isn’t unique, ordering by it won’t guarantee objects with the same name always appear in the same order.
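If the second explanation applies, one common way to get a stable order is to append a unique field as a tie-breaker. The helper below is a small sketch of that idea, not part of the original code: with_tiebreaker is a hypothetical name, and it assumes sort_array is a list of plain field-name strings (optionally prefixed with "-" for descending order), not expression objects.

```python
def with_tiebreaker(sort_fields, tiebreaker="pk"):
    """Return sort_fields with a unique field appended so the ordering is total."""
    fields = list(sort_fields)
    # Ignore a leading "-" (descending order) when comparing field names.
    if any(field.lstrip("-") == tiebreaker for field in fields):
        return fields
    return fields + [tiebreaker]


print(with_tiebreaker(["first_name"]))
print(with_tiebreaker(["-pk"]))
```

It would then be used as .order_by(*with_tiebreaker(sort_array)) in the query from the question, so that rows with equal sort values are still returned in a fixed order by primary key.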
A simple solution is to force evaluation of the queryset before slicing, by calling list():
my_list = list(
    User.pas1_objects.select_related("userlogindetails", "usermeta", "tenant")
    .filter(data_filter)
    .exclude(userrolepermissions__role_id__in=["PG", "PD"], tenant_id__in=tenant_ids)
    .order_by(*sort_array)
)
# And then you can slice your list without any risk of inconsistency
my_list[0:1]