Understand the logical query processing when a 'Between' condition is used in a self join of a SQL query

I have the following orders table in a PostgreSQL database:

Order_date | Revenue
--------------------
2020-10-01 | 10
2020-10-02 | 5
2020-10-03 | 10
2020-10-04 | 5
2020-10-05 | 10
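For anyone who wants to reproduce the results, here is a minimal setup sketch. The column types (date and integer) and the primary key are assumptions; the question only shows the column names and values.

```sql
-- Assumed schema: the question does not state the column types.
CREATE TABLE orders (
    order_date date PRIMARY KEY,   -- assumption: one row per day
    revenue    integer NOT NULL    -- assumption: whole-number revenue
);

-- Sample rows exactly as listed in the table above.
INSERT INTO orders (order_date, revenue) VALUES
    ('2020-10-01', 10),
    ('2020-10-02', 5),
    ('2020-10-03', 10),
    ('2020-10-04', 5),
    ('2020-10-05', 10);
```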

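For each order_date, I need to return the cumulative revenue over the previous 2 days plus the revenue of that order date itself. I am using the following query: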

SELECT o1.order_date,  
       Sum(o2.revenue) as Revenue_sum 
FROM   orders o1  
       JOIN orders o2   
         ON o1.order_date BETWEEN o2.order_date AND o2.order_date + 2  
GROUP  BY o1.order_date 
ORDER  BY o1.order_date  

It returns the following result:

Order_date | Revenue_sum
------------------------
2020-10-01 | 10
2020-10-02 | 15
2020-10-03 | 25
2020-10-04 | 20
2020-10-05 | 25

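According to the logical order of query processing, the following steps are executed:

  1. 'JOIN' first forms a Cartesian product by performing a cross join, so every row of o1 is joined to every row of o2.
  2. The qualifying condition in the 'ON' clause then selects only the rows that satisfy it.
  3. From the selected rows, revenue is summed for each group defined by the GROUP BY clause (o1.order_date).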

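Based on the execution steps above, I am trying to visualize how the query is processed.
Step 1 is a cross join, as shown below.
Step 2 qualifies rows based on the condition in 'ON'. I am unable to visualize which rows from step 1 are selected by the condition specified in the 'JOIN', and why.
Step 3 then groups the rows and sums the revenue.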

1. Cartesian product

    o1.order_date | o2.order_date | o2.revenue
   -------------------------------------------
    2020-10-01    | 2020-10-01    | 10  
    2020-10-01    | 2020-10-02    | 5
    2020-10-01    | 2020-10-03    | 10
    2020-10-01    | 2020-10-04    | 5
    2020-10-01    | 2020-10-05    | 10
    2020-10-02    | 2020-10-01    | 10  
    2020-10-02    | 2020-10-02    | 5
    2020-10-02    | 2020-10-03    | 10
    2020-10-02    | 2020-10-04    | 5
    2020-10-02    | 2020-10-05    | 10
    2020-10-03    | 2020-10-01    | 10  
    2020-10-03    | 2020-10-02    | 5
    2020-10-03    | 2020-10-03    | 10
    2020-10-03    | 2020-10-04    | 5
    2020-10-03    | 2020-10-05    | 10
    2020-10-04    | 2020-10-01    | 10  
    2020-10-04    | 2020-10-02    | 5
    2020-10-04    | 2020-10-03    | 10
    2020-10-04    | 2020-10-04    | 5
    2020-10-04    | 2020-10-05    | 10
    2020-10-05    | 2020-10-01    | 10  
    2020-10-05    | 2020-10-02    | 5
    2020-10-05    | 2020-10-03    | 10
    2020-10-05    | 2020-10-04    | 5
    2020-10-05    | 2020-10-05    | 10

2. Qualification based on the 'ON' condition. Which rows from step 1 above are selected?

Assuming your dates are always contiguous, one row per day, you can use:

SELECT
    Order_date,
    SUM(Revenue) OVER (ORDER BY Order_date
                       ROWS BETWEEN 2 PRECEDING AND
                            CURRENT ROW) AS Revenue_sum
FROM orders
ORDER BY
    Order_date;
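If the dates can have gaps, a frame that counts rows no longer matches a 2-day window. As a hedged alternative sketch, PostgreSQL 11 and later accept a RANGE frame with an interval offset on a date ordering column. This assumes one row per order_date; with several rows per date, each row would receive the sum over all rows in its window rather than one grouped row per date.

```sql
-- Sketch: window defined by date distance instead of row count.
-- Requires PostgreSQL 11+ (RANGE frames with an offset).
SELECT order_date,
       SUM(revenue) OVER (ORDER BY order_date
                          RANGE BETWEEN INTERVAL '2 days' PRECEDING
                                    AND CURRENT ROW) AS revenue_sum
FROM   orders
ORDER  BY order_date;
```

On the sample data, which has no gaps, this should give the same sums as the ROWS version.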


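After the Cartesian product is formed, each value of o1.order_date is compared against a range derived from o2.order_date, as defined by your condition (o1.order_date BETWEEN o2.order_date AND o2.order_date + 2). Keep in mind that this date range is recomputed for every value of o2.order_date.

Example: when o1.order_date = '2020-10-01':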

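  • First, it checks whether o1.order_date falls within the o2.order_date range '2020-10-01' to '2020-10-03'. The condition evaluates to true, so this row is selected from the Cartesian product.
  • Next, the o2.order_date range becomes '2020-10-02' to '2020-10-04'. Now o1.order_date = '2020-10-01' is outside the range, so the condition evaluates to false. The later ranges start even later, so they fail as well. Hence, for o1.order_date = '2020-10-01', only 1 row (the one from the first check) is selected from the Cartesian product.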
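The two checks above can be tried in isolation with literal dates. This is just a sketch of the predicate arithmetic; the column aliases are made up for illustration, and in PostgreSQL adding an integer to a date adds that many days.

```sql
-- o1.order_date = 2020-10-01 tested against the first two o2.order_date values.
SELECT (DATE '2020-10-01'
            BETWEEN DATE '2020-10-01' AND (DATE '2020-10-01' + 2)) AS paired_with_oct_01,  -- true
       (DATE '2020-10-01'
            BETWEEN DATE '2020-10-02' AND (DATE '2020-10-02' + 2)) AS paired_with_oct_02;  -- false
```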

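Repeat these checks until every row of the Cartesian product has been evaluated. Only the rows that satisfy the date-range condition are passed on to the GROUP BY clause for aggregation.

Following these steps, the rows below are selected and passed to the GROUP BY clause: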

o1.order_date | o2.order_date | o2.revenue
-------------------------------------------
2020-10-01    | 2020-10-01    | 10  
2020-10-02    | 2020-10-01    | 10  
2020-10-02    | 2020-10-02    | 5
2020-10-03    | 2020-10-01    | 10  
2020-10-03    | 2020-10-02    | 5
2020-10-03    | 2020-10-03    | 10
...
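To see the full list of qualifying rows rather than the abbreviated one above, a sketch is to run the original join without the GROUP BY and the aggregate. The column aliases are illustrative only.

```sql
-- The original self join, minus the aggregation: one output row per
-- (o1, o2) pair that satisfies the ON condition.
SELECT o1.order_date AS o1_order_date,
       o2.order_date AS o2_order_date,
       o2.revenue
FROM   orders o1
       JOIN orders o2
         ON o1.order_date BETWEEN o2.order_date AND o2.order_date + 2
ORDER  BY o1.order_date, o2.order_date;
```

On the five sample rows this should return 12 pairs: one for 2020-10-01, two for 2020-10-02, and three for each later date.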

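This is for demonstration purposes only. It does not claim to be the physical process Postgres follows, but it should help with visualization. Start with the Cartesian product and extend it with 2 columns: the expression o2.order_date + 2, and a truth-table column that evaluates your ON predicate (o1.order_date BETWEEN o2.order_date AND o2.order_date + 2). You then select only the rows where the result is true.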

+---------------+---------------+-----------------+-----------------------------+
| o1.order_date | o2.order_date | o2.order_date+2 | od1 >= od2 and od1 <= od2+2 |
+---------------+---------------+-----------------+-----------------------------+
| 2020-10-01    | 2020-10-01    | 2020-10-03      | true                        |
| 2020-10-01    | 2020-10-02    | 2020-10-04      | false                       |
| 2020-10-01    | 2020-10-03    | 2020-10-05      | false                       |
| 2020-10-01    | 2020-10-04    | 2020-10-06      | false                       |
| 2020-10-01    | 2020-10-05    | 2020-10-07      | false                       |
| 2020-10-02    | 2020-10-01    | 2020-10-03      | true                        |
| 2020-10-02    | 2020-10-02    | 2020-10-04      | true                        |
| 2020-10-02    | 2020-10-03    | 2020-10-05      | false                       |
| 2020-10-02    | 2020-10-04    | 2020-10-06      | false                       |
| 2020-10-02    | 2020-10-05    | 2020-10-07      | false                       |
| 2020-10-03    | 2020-10-01    | 2020-10-03      | true                        |
| 2020-10-03    | 2020-10-02    | 2020-10-04      | true                        |
| 2020-10-03    | 2020-10-03    | 2020-10-05      | true                        |
| 2020-10-03    | 2020-10-04    | 2020-10-06      | false                       |
| 2020-10-03    | 2020-10-05    | 2020-10-07      | false                       |
| 2020-10-04    | 2020-10-01    | 2020-10-03      | false                       |
| 2020-10-04    | 2020-10-02    | 2020-10-04      | true                        |
| 2020-10-04    | 2020-10-03    | 2020-10-05      | true                        |
| 2020-10-04    | 2020-10-04    | 2020-10-06      | true                        |
| 2020-10-04    | 2020-10-05    | 2020-10-07      | false                       |
| 2020-10-05    | 2020-10-01    | 2020-10-03      | false                       |
| 2020-10-05    | 2020-10-02    | 2020-10-04      | false                       |
| 2020-10-05    | 2020-10-03    | 2020-10-05      | true                        |
| 2020-10-05    | 2020-10-04    | 2020-10-06      | true                        |
| 2020-10-05    | 2020-10-05    | 2020-10-07      | true                        |
+---------------+---------------+-----------------+-----------------------------+
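A table like the one above can be produced with a query along these lines. It is a sketch for visualization only; the aliases are invented here, and BETWEEN is simply shorthand for the two comparisons named in the last column header.

```sql
-- Cartesian product of orders with itself, extended with the upper bound
-- of the range and the boolean result of the ON predicate.
SELECT o1.order_date       AS o1_order_date,
       o2.order_date       AS o2_order_date,
       o2.order_date + 2   AS o2_order_date_plus_2,
       (o1.order_date BETWEEN o2.order_date AND o2.order_date + 2) AS on_predicate
FROM   orders o1
       CROSS JOIN orders o2
ORDER  BY o1.order_date, o2.order_date;
```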

Keeping only the rows where the predicate is true gives:

+---------------+---------------+-----------------+-----------------------------+
| o1.order_date | o2.order_date | o2.order_date+2 | od1 >= od2 and od1 <= od2+2 |
+---------------+---------------+-----------------+-----------------------------+
| 2020-10-01    | 2020-10-01    | 2020-10-03      | true                        |
| 2020-10-02    | 2020-10-01    | 2020-10-03      | true                        |
| 2020-10-02    | 2020-10-02    | 2020-10-04      | true                        |
| 2020-10-03    | 2020-10-01    | 2020-10-03      | true                        |
| 2020-10-03    | 2020-10-02    | 2020-10-04      | true                        |
| 2020-10-03    | 2020-10-03    | 2020-10-05      | true                        |
| 2020-10-04    | 2020-10-02    | 2020-10-04      | true                        |
| 2020-10-04    | 2020-10-03    | 2020-10-05      | true                        |
| 2020-10-04    | 2020-10-04    | 2020-10-06      | true                        |
| 2020-10-05    | 2020-10-03    | 2020-10-05      | true                        |
| 2020-10-05    | 2020-10-04    | 2020-10-06      | true                        |
| 2020-10-05    | 2020-10-05    | 2020-10-07      | true                        |
+---------------+---------------+-----------------+-----------------------------+

Finally, group the rows by o1.order_date and sum the revenue values for each date.
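The three logical steps can also be written out one after another as CTEs. This is only a sketch of the logical order, not of how the planner actually executes the query; the CTE names and aliases are made up for illustration. For an inner join, filtering the cross join in WHERE is equivalent to putting the condition in ON.

```sql
WITH pairs AS (
    -- Step 1: Cartesian product of orders with itself.
    SELECT o1.order_date AS o1_order_date,
           o2.order_date AS o2_order_date,
           o2.revenue
    FROM   orders o1
           CROSS JOIN orders o2
),
qualified AS (
    -- Step 2: keep only the pairs that satisfy the original ON condition.
    SELECT *
    FROM   pairs
    WHERE  o1_order_date BETWEEN o2_order_date AND o2_order_date + 2
)
-- Step 3: group by the o1 date and sum the o2 revenue.
SELECT o1_order_date AS order_date,
       SUM(revenue)  AS revenue_sum
FROM   qualified
GROUP  BY o1_order_date
ORDER  BY o1_order_date;
```

On the sample data this should reproduce the result shown in the question (10, 15, 25, 20, 25).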