Django N+1 problem with serpy.MethodField

I'm using nplusone to detect N+1 queries.

I have a serpy serializer that serializes Order instances. An Order has a cart made up of OrderComponent instances, as shown below. The code is simplified:

class Order(models.Model):
    objects = OrderManager()
    listings = models.ManyToManyField(to=Listing, through="OrderComponent", related_name="orders")


class OrderComponent(models.Model):
    listing = models.ForeignKey(to=Listing, on_delete=models.PROTECT, related_name="order_components")
    order = models.ForeignKey(to=Order, on_delete=models.PROTECT, related_name="cart")
    nb_units = models.PositiveIntegerField()

The list of serialized Order instances is obtained through OrderListView:

class OrderListView(SimplePaginatedListView):  # Custom base class
    model = Order
    serializer_class = OrderSerializer
    deserializer_class = OrderDeserializer

    def get_queryset(self):
        return super().get_queryset().select_related(
            "fulfillment",  # Used for order.is_fulfilled()
            "cancellation",  # Used for order.is_cancelled()
        ).prefetch_related(
            "cart",  # I don't believe the first two are necessary, but added for testing purposes
            "cart__listing",
            "cart__listing__product",
            "cart__listing__service",
        )

    @http_get(required_permissions=ShopRolePermission.MANAGE_ORDERS)
    def get(self, *args, **kwargs):
        return super().get(*args, **kwargs)

    @http_post(required_permissions=ShopRolePermission.MANAGE_ORDERS)
    def post(self, *args, **kwargs):
        return super().post(*args, **kwargs)

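OrderSerializer is defined below. nplusone doesn't say where the N+1 queries were detected, so I commented out every possible culprit until I found the real ones. I've marked their locations in the comments.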

class OrderSerializer(BaseSerializer):
    class SimpleOrderComponentSerializer(BaseSerializer):
        id = serpy.IntField()
        listing_id = serpy.IntField()
        product_name = serpy.MethodField(required=False)  # No N+1
        service_name = serpy.MethodField(required=False)  # No N+1
        nb_units = serpy.IntField()

        def __init__(self, instance=None, many=False, data=None, context=None, **kwargs):
            # Should this be necessary, since I prefetched in get_queryset?
            instance = instance.select_related("listing", "listing__product", "listing__service")
            super().__init__(instance, many, data, context, **kwargs)

        @staticmethod
        def get_product_name(_serializer, obj: OrderComponent):
            # No N+1
            return obj.listing.product.product_name if obj.listing.product else None

        @staticmethod
        def get_service_name(_serializer, obj: OrderComponent):
            # No N+1
            return obj.listing.service.service_name if obj.listing.service else None

    id = serpy.IntField()
    cancelled = serpy.BoolField(attr="is_cancelled")  # Checks if Cancellation instance exists, no N+1
    fulfilled = serpy.BoolField(attr="is_fulfilled")  # Checks if Fulfillment instance exists, no N+1
    cart_products = serpy.MethodField(required=False)  # N+1 !!!
    cart_services = serpy.MethodField(required=False)  # N+1 !!!

    def get_cart_products(self, obj: Order):
        # N+1 detected here.
        return self.SimpleOrderComponentSerializer(obj.cart.filter(listing__product__isnull=False), many=True).data

    def get_cart_services(self, obj: Order):
        # N+1 detected here. Curiously, the number of N+1 queries detected differs between the two.
        return self.SimpleOrderComponentSerializer(obj.cart.filter(listing__service__isnull=False), many=True).data

For the life of me, I can't figure out why my prefetch isn't working here. Django Debug Toolbar confirms that this is not a false positive:

SELECT ••• FROM "shop_ordercomponent" INNER JOIN "shop_listing" ON ("shop_ordercomponent"."listing_id" = "shop_listing"."id") INNER JOIN "shop_product" ON ("shop_listing"."product_id" = "shop_product"."id") LEFT OUTER JOIN "shop_service" ON ("shop_listing"."service_id" = "shop_service"."id") WHERE ("shop_ordercomponent"."order_id" = 2 AND "shop_listing"."product_id" IS NOT NULL)
  25 similar queries. 

If I inspect obj in get_cart_services/get_cart_products, I see that obj._prefetched_objects_cache["cart"] is an OrderComponentQuerySet. If I inspect obj in, for example, SimpleOrderComponentSerializer.get_product_name, I see nothing in obj._prefetched_objects_cache["cart"], though I wouldn't expect to, since listing.product and listing.service are select_related-ed, not prefetch_related-ed.

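I admit I don't fully understand how this works, but I assume select_related eagerly populates to-one relations (foreign key, one-to-one) in a single query via a JOIN, rather than lazily waiting for the related object to be requested.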

From the looks of it, my initial prefetch_related does not "carry over" into the MethodField handlers of the inner serializer. The nplusone log:

Potential n+1 query detected on `Order.cart`
Potential n+1 query detected on `Order.cart`
Potential n+1 query detected on `Order.cart`
...  # Repeated many times. Multiple warnings printed for each object.

Could it be because cart is a "reverse" relation? Any help understanding why this doesn't work would be greatly appreciated.
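
One likely explanation, going by Django's documented behaviour: calling .filter() on a related manager builds a new queryset, and a new queryset ignores whatever prefetch_related cached, so obj.cart.filter(...) hits the database once per Order. A minimal sketch of one way around that is to iterate obj.cart.all() (which does use the cache) and split the components in Python. The helper name split_cart and the SimpleNamespace stand-ins below are hypothetical; they exist only so the sketch runs without a database.

```python
from types import SimpleNamespace


def split_cart(components):
    """Partition already-loaded cart components into (products, services).

    `components` is any iterable of OrderComponent-like objects, for example
    `order.cart.all()` once "cart__listing__product" and
    "cart__listing__service" have been prefetched. No queryset method is
    called here, so nothing forces a fresh query.
    """
    products, services = [], []
    for component in components:
        if component.listing.product is not None:
            products.append(component)
        if component.listing.service is not None:
            services.append(component)
    return products, services


# Hypothetical stand-ins for two OrderComponent rows (no database needed).
widget = SimpleNamespace(
    listing=SimpleNamespace(product=SimpleNamespace(product_name="Widget"), service=None)
)
repair = SimpleNamespace(
    listing=SimpleNamespace(product=None, service=SimpleNamespace(service_name="Repair"))
)

products, services = split_cart([widget, repair])
```

In get_cart_products this would mean passing split_cart(obj.cart.all())[0] to the inner serializer. Note that the inner serializer's __init__ above calls instance.select_related(...), which only works on a queryset, so that line would have to go if a plain list is passed in.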

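I solved the problem by annotating the required information in get_queryset and reading it from there. That way, the prefetched information is stored on each instance.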