Understanding the logical query processing when a BETWEEN condition is used in a self-join of a SQL query
I have the following orders table in a PostgreSQL database:
Order_date | Revenue
--------------------
2020-10-01 | 10
2020-10-02 | 5
2020-10-03 | 10
2020-10-04 | 5
2020-10-05 | 10
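For each order_date, I need to return the rolling sum of revenue over the last 2 days, including the revenue of that order date itself. I am using the following query: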
SELECT o1.order_date,
Sum(o2.revenue) as Revenue_sum
FROM orders o1
JOIN orders o2
ON o1.order_date BETWEEN o2.order_date AND o2.order_date + 2
GROUP BY o1.order_date
ORDER BY o1.order_date
It returns the following result:
Order_date | Revenue_sum
------------------------
2020-10-01 | 10
2020-10-02 | 15
2020-10-03 | 25
2020-10-04 | 20
2020-10-05 | 25
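According to the logical order of query processing, the following steps are carried out:
- 'JOIN' first forms a Cartesian product by performing a cross join, so every row of o1 is paired with every row of o2.
- The qualifying condition in the 'ON' clause then selects only the rows that satisfy it.
- From the selected rows, the revenue is summed per group according to the GROUP BY clause (o1.order_date).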
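The three logical steps above can be sketched as separate CTEs, which makes each intermediate result inspectable. This is only an illustration of the logical order, not of the plan PostgreSQL actually executes; the CTE and column names (step1_cross, step2_filtered, o1_date, o2_date, o2_revenue) are made up for the example, and the sample data is inlined so the query runs on its own.

```sql
-- Sample data inlined as a CTE so the query is self-contained (PostgreSQL).
WITH orders(order_date, revenue) AS (
    VALUES (DATE '2020-10-01', 10),
           (DATE '2020-10-02', 5),
           (DATE '2020-10-03', 10),
           (DATE '2020-10-04', 5),
           (DATE '2020-10-05', 10)
),
-- Step 1: Cartesian product, 5 x 5 = 25 rows.
step1_cross AS (
    SELECT o1.order_date AS o1_date,
           o2.order_date AS o2_date,
           o2.revenue    AS o2_revenue
    FROM orders o1
    CROSS JOIN orders o2
),
-- Step 2: keep only the rows for which the ON predicate is true.
-- For an inner join, filtering in WHERE is equivalent to filtering in ON.
step2_filtered AS (
    SELECT *
    FROM step1_cross
    WHERE o1_date BETWEEN o2_date AND o2_date + 2
)
-- Step 3: group the surviving rows by o1's date and sum o2's revenue.
SELECT o1_date         AS order_date,
       SUM(o2_revenue) AS revenue_sum
FROM step2_filtered
GROUP BY o1_date
ORDER BY o1_date;
```

Replacing the final SELECT with `SELECT * FROM step1_cross` or `SELECT * FROM step2_filtered` shows the rows present after each step.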
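Based on the execution steps above, I am trying to visualize how the query is processed.
Step 1 is a cross join, as shown below.
Step 2 qualifies rows based on the condition in 'ON'. I am unable to visualize how rows are selected from step 1 based on the condition specified in the 'JOIN'.
Step 3 then groups the rows and sums the revenue.
1. Cartesian product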
o1.order_date | o2.order_date | o2.revenue
-------------------------------------------
2020-10-01 | 2020-10-01 | 10
2020-10-01 | 2020-10-02 | 5
2020-10-01 | 2020-10-03 | 10
2020-10-01 | 2020-10-04 | 5
2020-10-01 | 2020-10-05 | 10
2020-10-02 | 2020-10-01 | 10
2020-10-02 | 2020-10-02 | 5
2020-10-02 | 2020-10-03 | 10
2020-10-02 | 2020-10-04 | 5
2020-10-02 | 2020-10-05 | 10
2020-10-03 | 2020-10-01 | 10
2020-10-03 | 2020-10-02 | 5
2020-10-03 | 2020-10-03 | 10
2020-10-03 | 2020-10-04 | 5
2020-10-03 | 2020-10-05 | 10
2020-10-04 | 2020-10-01 | 10
2020-10-04 | 2020-10-02 | 5
2020-10-04 | 2020-10-03 | 10
2020-10-04 | 2020-10-04 | 5
2020-10-04 | 2020-10-05 | 10
2020-10-05 | 2020-10-01 | 10
2020-10-05 | 2020-10-02 | 5
2020-10-05 | 2020-10-03 | 10
2020-10-05 | 2020-10-04 | 5
2020-10-05 | 2020-10-05 | 10
2. Qualification based on the 'ON' condition. Which rows are selected from step 1 above?
Assuming your dates are always sequential, one day at a time, you can use:
SELECT
Order_date,
SUM(Revenue) OVER (ORDER BY Order_date
ROWS BETWEEN 2 PRECEDING AND
CURRENT ROW) AS Revenue_sum
FROM orders
ORDER BY
Order_date;
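If the dates are not guaranteed to be consecutive, a ROWS frame counts rows rather than days. A possible variant, assuming PostgreSQL 11 or later (where RANGE frames accept an offset) and one row per date, is a RANGE frame with an interval offset. This is a sketch, not something the original answer proposed; the sample data is inlined so it runs on its own.

```sql
-- Sample data inlined as a CTE so the query is self-contained (PostgreSQL 11+).
WITH orders(order_date, revenue) AS (
    VALUES (DATE '2020-10-01', 10),
           (DATE '2020-10-02', 5),
           (DATE '2020-10-03', 10),
           (DATE '2020-10-04', 5),
           (DATE '2020-10-05', 10)
)
SELECT order_date,
       -- The frame covers every row whose order_date lies within
       -- [current order_date - 2 days, current order_date],
       -- regardless of how many rows that is.
       SUM(revenue) OVER (ORDER BY order_date
                          RANGE BETWEEN INTERVAL '2 days' PRECEDING
                                    AND CURRENT ROW) AS revenue_sum
FROM orders
ORDER BY order_date;
```

On the sample data, where every date is present exactly once, this should give the same sums as the ROWS version; the two differ only once a date is missing or repeated.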
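After the Cartesian product is formed, each value of o1.order_date is compared with a range of dates derived from o2.order_date according to the condition you supplied (o1.order_date BETWEEN o2.order_date AND o2.order_date + 2). Keep in mind that this date range is redefined for every value of o2.order_date.
Example:
When o1.order_date = '2020-10-01':
- It checks whether o1.order_date falls within the o2.order_date range '2020-10-01' to '2020-10-03'. The condition evaluates to true, and that row is selected from the Cartesian product.
- For the next row, the o2.order_date range becomes '2020-10-02' to '2020-10-04'. Now o1.order_date = '2020-10-01' is outside the range, so the condition evaluates to false. The same holds for every later o2.order_date, so for o1.order_date = '2020-10-01' only 1 row (the one from the previous step) is selected from the Cartesian product.
These steps are repeated until every row of the Cartesian product has been evaluated; only the rows that satisfy the date-range condition are passed on to the GROUP BY clause to aggregate the revenue.
Following the steps above, these rows are selected and passed to the GROUP BY clause: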
o1.order_date | o2.order_date | o2.revenue
-------------------------------------------
2020-10-01 | 2020-10-01 | 10
2020-10-02 | 2020-10-01 | 10
2020-10-02 | 2020-10-02 | 5
2020-10-03 | 2020-10-01 | 10
2020-10-03 | 2020-10-02 | 5
2020-10-03 | 2020-10-03 | 10
...
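It can help to read the same predicate from o1's side: o2.order_date <= o1.order_date <= o2.order_date + 2 is algebraically the same as o1.order_date - 2 <= o2.order_date <= o1.order_date, i.e. "o2 is the current date or one of the two days before it". The sketch below uses that rewritten form and lists, per o1.order_date, which o2 dates contribute to the sum; the column alias contributing_dates is made up for the example.

```sql
-- Sample data inlined as a CTE so the query is self-contained (PostgreSQL).
WITH orders(order_date, revenue) AS (
    VALUES (DATE '2020-10-01', 10),
           (DATE '2020-10-02', 5),
           (DATE '2020-10-03', 10),
           (DATE '2020-10-04', 5),
           (DATE '2020-10-05', 10)
)
SELECT o1.order_date,
       -- Which o2 rows survived the ON condition for this o1 date.
       array_agg(o2.order_date ORDER BY o2.order_date) AS contributing_dates,
       SUM(o2.revenue) AS revenue_sum
FROM orders o1
JOIN orders o2
  -- Equivalent to: o1.order_date BETWEEN o2.order_date AND o2.order_date + 2
  ON o2.order_date BETWEEN o1.order_date - 2 AND o1.order_date
GROUP BY o1.order_date
ORDER BY o1.order_date;
```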
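For demonstration purposes only: this does not claim to be the physical process Postgres follows, but it should help with visualization. Start with the Cartesian product, extended by 2 columns: the predicate term o2.order_date + 2, and a truth column evaluating your ON predicate (o1.order_date BETWEEN o2.order_date AND o2.order_date + 2). You then select only the rows whose result is true.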
+---------------+---------------+-----------------+-----------------------------+
| o1.order_date | o2.order_date | o2.order_date+2 | od1 >= od2 and od1 <= od2+2 |
+---------------+---------------+-----------------+-----------------------------+
| 2020-10-01 | 2020-10-01 | 2020-10-03 | true |
| 2020-10-01 | 2020-10-02 | 2020-10-04 | false |
| 2020-10-01 | 2020-10-03 | 2020-10-05 | false |
| 2020-10-01 | 2020-10-04 | 2020-10-06 | false |
| 2020-10-01 | 2020-10-05 | 2020-10-07 | false |
| 2020-10-02 | 2020-10-01 | 2020-10-03 | true |
| 2020-10-02 | 2020-10-02 | 2020-10-04 | true |
| 2020-10-02 | 2020-10-03 | 2020-10-05 | false |
| 2020-10-02 | 2020-10-04 | 2020-10-06 | false |
| 2020-10-02 | 2020-10-05 | 2020-10-07 | false |
| 2020-10-03 | 2020-10-01 | 2020-10-03 | true |
| 2020-10-03 | 2020-10-02 | 2020-10-04 | true |
| 2020-10-03 | 2020-10-03 | 2020-10-05 | true |
| 2020-10-03 | 2020-10-04 | 2020-10-06 | false |
| 2020-10-03 | 2020-10-05 | 2020-10-07 | false |
| 2020-10-04 | 2020-10-01 | 2020-10-03 | false |
| 2020-10-04 | 2020-10-02 | 2020-10-04 | true |
| 2020-10-04 | 2020-10-03 | 2020-10-05 | true |
| 2020-10-04 | 2020-10-04 | 2020-10-06 | true |
| 2020-10-04 | 2020-10-05 | 2020-10-07 | false |
| 2020-10-05 | 2020-10-01 | 2020-10-03 | false |
| 2020-10-05 | 2020-10-02 | 2020-10-04 | false |
| 2020-10-05 | 2020-10-03 | 2020-10-05 | true |
| 2020-10-05 | 2020-10-04 | 2020-10-06 | true |
| 2020-10-05 | 2020-10-05 | 2020-10-07 | true |
+---------------+---------------+-----------------+-----------------------------+
Which leaves:
+---------------+---------------+-----------------+-----------------------------+
| o1.order_date | o2.order_date | o2.order_date+2 | od1 >= od2 and od1 <= od2+2 |
+---------------+---------------+-----------------+-----------------------------+
| 2020-10-01 | 2020-10-01 | 2020-10-03 | true |
| 2020-10-02 | 2020-10-01 | 2020-10-03 | true |
| 2020-10-02 | 2020-10-02 | 2020-10-04 | true |
| 2020-10-03 | 2020-10-01 | 2020-10-03 | true |
| 2020-10-03 | 2020-10-02 | 2020-10-04 | true |
| 2020-10-03 | 2020-10-03 | 2020-10-05 | true |
| 2020-10-04 | 2020-10-02 | 2020-10-04 | true |
| 2020-10-04 | 2020-10-03 | 2020-10-05 | true |
| 2020-10-04 | 2020-10-04 | 2020-10-06 | true |
| 2020-10-05 | 2020-10-03 | 2020-10-05 | true |
| 2020-10-05 | 2020-10-04 | 2020-10-06 | true |
| 2020-10-05 | 2020-10-05 | 2020-10-07 | true |
+---------------+---------------+-----------------+-----------------------------+
Finally, collect the rows for each o1.order_date and sum their revenue values.
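The truth table above can be reproduced with a query along these lines. It is a sketch for inspection only; the column aliases are made up for the example, and the sample data is inlined so the query runs on its own.

```sql
-- Sample data inlined as a CTE so the query is self-contained (PostgreSQL).
WITH orders(order_date, revenue) AS (
    VALUES (DATE '2020-10-01', 10),
           (DATE '2020-10-02', 5),
           (DATE '2020-10-03', 10),
           (DATE '2020-10-04', 5),
           (DATE '2020-10-05', 10)
)
SELECT o1.order_date     AS o1_order_date,
       o2.order_date     AS o2_order_date,
       o2.order_date + 2 AS o2_order_date_plus_2,
       -- BETWEEN written out as its two comparisons, as in the table header.
       (o1.order_date >= o2.order_date
        AND o1.order_date <= o2.order_date + 2) AS on_predicate
FROM orders o1
CROSS JOIN orders o2
ORDER BY o1.order_date, o2.order_date;
```

Wrapping this in a subquery and adding `WHERE on_predicate` would leave only the rows of the second table.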