Improving the performance of a MySQL left join subquery

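I have the following MySQL query, which calculates the total number of orders for each month within a given date range, e.g. one year. The query works correctly, but it is slow (around 250 ms).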

Any ideas on how to rewrite it to be more efficient?

WITH RECURSIVE `dates` AS (
    (
        SELECT '2019-11-28' AS item
    )
    UNION ALL
    (
        SELECT
            item + INTERVAL 1 DAY
        FROM
            `dates`
        WHERE
            item + INTERVAL 1 DAY <= '2020-11-27'
    )
)
SELECT
    DATE_FORMAT(`item`, '%b %y') AS `date`,
    COUNT(`orders`.`id`) AS `total`
FROM
    `dates`
    LEFT JOIN (
        SELECT
            `orders`.`id`,
            `orders`.`created_at`
        FROM
            `orders`
            INNER JOIN `locations` ON `orders`.`location_id` = `locations`.`id`
        WHERE
            `orders`.`shop_id` = 10379184
            AND `locations`.`country_id` = 128
            AND `orders`.`created_at` >= '2019-11-28 12:01:42'
            AND `orders`.`created_at` <= '2020-11-27 12:01:42'
    ) AS `orders` ON DATE(`orders`.`created_at`) = `dates`.`item`
GROUP BY
    `date`

Update: some people have suggested using two left joins, but if I do that, the country_id filter is not applied:

WITH RECURSIVE `dates` AS (
    (
        SELECT
            '2019-11-28' AS item
    )
    UNION ALL
    (
        SELECT
            item + INTERVAL 1 DAY
        FROM
            `dates`
        WHERE
            item + INTERVAL 1 DAY <= '2020-11-27'
    )
)
SELECT
    DATE_FORMAT(`item`, '%b %y') AS `date`,
    COUNT(`orders`.`id`) AS `total`
FROM
    `dates`
    LEFT JOIN `orders` USE INDEX (`orders_created_at_index`) ON DATE(`created_at`) = `dates`.`item`
    AND `orders`.`shop_id` = 10379184
    AND `orders`.`created_at` >= '2019-11-28 12:22:43'
    AND `orders`.`created_at` <= '2020-11-27 12:22:43'
    LEFT JOIN `locations` ON `orders`.`location_id` = `locations`.`id`
    AND `locations`.`country_id` = 128
GROUP BY
    `date`

Thanks!

I would suggest a correlated subquery:

-- Relies on the recursive `dates` CTE from the question being prepended.
SELECT DATE_FORMAT(d.item, '%b %y') AS `date`,
       (SELECT COUNT(*)
        FROM orders o JOIN
             locations l
             ON o.location_id = l.id
        WHERE shop_id = 10379184 AND
              country_id = 128 AND
              o.created_at >= d.item AND
              o.created_at < d.item + interval 1 day
       ) as total
FROM dates d;

This avoids the outer aggregation, which is often a performance improvement.
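As written, the query above depends on the `dates` CTE from the question and returns one row per day rather than one per month. The following is only a sketch of how the same correlated-subquery idea might be applied per month, so that no outer aggregation is needed. The CTE name `months` and its column `month_start` are made up for illustration, and the literal dates are copied from the question.

```sql
-- Sketch only: one row per calendar month instead of one row per day.
-- `months` and `month_start` are hypothetical names, not from the question.
WITH RECURSIVE months AS (
    SELECT CAST('2019-11-01' AS DATE) AS month_start
    UNION ALL
    SELECT month_start + INTERVAL 1 MONTH
    FROM months
    WHERE month_start + INTERVAL 1 MONTH <= '2020-11-27'
)
SELECT DATE_FORMAT(m.month_start, '%b %y') AS `date`,
       (SELECT COUNT(*)
        FROM orders o
        JOIN locations l
          ON o.location_id = l.id
        WHERE o.shop_id = 10379184
          AND l.country_id = 128
          -- Half-open range for the current month keeps created_at sargable.
          AND o.created_at >= m.month_start
          AND o.created_at <  m.month_start + INTERVAL 1 MONTH
          -- Overall bounds of the reporting window, as in the question.
          AND o.created_at >= '2019-11-28 12:01:42'
          AND o.created_at <= '2020-11-27 12:01:42'
       ) AS total
FROM months m
ORDER BY m.month_start;
```

Because the subquery compares `created_at` against a range instead of wrapping it in `DATE()`, an index that includes `created_at` has a chance of being used. Whether it actually is depends on the data and should be checked with `EXPLAIN`.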

In addition, indexes would probably help the query, but it is not clear which tables columns such as country_id and shop_id come from.
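As a rough illustration of the kind of indexes meant here, the statements below assume that `shop_id`, `created_at` and `location_id` are columns of `orders` and that `country_id` is a column of `locations`, as the qualified names in the question suggest. The index names are hypothetical.

```sql
-- Assumption: shop_id, created_at, location_id belong to orders,
-- and country_id belongs to locations. Index names are made up.

-- Equality column first, then the range column; location_id is appended
-- so the join key can be read from the index itself.
CREATE INDEX orders_shop_created_location_idx
    ON orders (shop_id, created_at, location_id);

-- Lets the country filter on locations be resolved from an index.
CREATE INDEX locations_country_idx
    ON locations (country_id);
```

Running `EXPLAIN` on the query before and after adding them would show whether the optimizer actually picks them up.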

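After many revisions, I came up with the following query, which runs in under 40 ms and is good enough for my needs. I still don't think it is ideal, and any improvements are welcome...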

SELECT
    `date`,
    COUNT(`order`)
FROM
    (
        WITH RECURSIVE `dates` AS (
            (
                SELECT
                    '2019-11-28' AS item
            )
            UNION ALL
            (
                SELECT
                    item + INTERVAL 1 DAY
                FROM
                    `dates`
                WHERE
                    item + INTERVAL 1 DAY <= '2020-11-27'
            )
        )
        SELECT
            DATE_FORMAT(`item`, '%b %y') AS `DATE`,
            `orders`.`id` AS `order`,
            `locations`.`id` AS `location`
        FROM
            `dates`
        LEFT JOIN 
            `orders` 
        ON 
            DATE(`created_at`) = `dates`.`item`
        AND 
            `orders`.`shop_id` = 10379184
        AND 
            `orders`.`created_at` >= '2019-11-28 12:22:43'
        AND 
            `orders`.`created_at` <= '2020-11-27 12:22:43'
        LEFT JOIN 
            `locations` 
        ON 
            `orders`.`location_id` = `locations`.`id`
        AND 
            `locations`.`country_id` = 209
    ) AS items
WHERE
    (
        `order` IS NULL
        AND `location` IS NULL
    )
    OR (
        `order` IS NOT NULL
        AND `location` IS NOT NULL
    )
GROUP BY
    `date`
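One possible simplification, offered as an untested sketch rather than a measured improvement: MySQL allows a parenthesized (nested) join on the right-hand side of a LEFT JOIN. Joining `orders` to `locations` with an INNER JOIN inside the parentheses means an order only survives if its location matches the country filter, so the wrapper subquery and the NULL checks should no longer be needed. The values are copied from the query above, and the CTE seed is cast to DATE so the range comparison is done on dates.

```sql
-- Sketch: nested join instead of the "both NULL or both NOT NULL" filter.
WITH RECURSIVE dates AS (
    SELECT CAST('2019-11-28' AS DATE) AS item
    UNION ALL
    SELECT item + INTERVAL 1 DAY
    FROM dates
    WHERE item + INTERVAL 1 DAY <= '2020-11-27'
)
SELECT
    DATE_FORMAT(dates.item, '%b %y') AS `date`,
    COUNT(orders.id) AS total
FROM dates
LEFT JOIN (
    -- Inner join first: orders without a matching location are dropped here.
    orders
    INNER JOIN locations
        ON locations.id = orders.location_id
       AND locations.country_id = 209
)
    -- Range comparison instead of DATE(created_at) = item.
    ON orders.created_at >= dates.item
   AND orders.created_at <  dates.item + INTERVAL 1 DAY
   AND orders.shop_id = 10379184
   AND orders.created_at >= '2019-11-28 12:22:43'
   AND orders.created_at <= '2020-11-27 12:22:43'
GROUP BY `date`;
```

Every date row is kept by the outer LEFT JOIN, so a month with no matching orders should still appear with a total of 0.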