Difference between "ROWS BETWEEN" and "RANGE BETWEEN" in (Presto) window function "OVER" clause
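This question is primarily about an older version of PrestoSQL; the issue has been resolved in the (now renamed) Trino project as of version 346. However, Amazon's Athena project is based on Presto versions 0.217 (Athena engine 2) and 0.172 (Athena engine 1), which do have the issues described below. This question was written specifically around Athena Engine 1 / PrestoSQL version 0.172.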
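Questions (tl;dr)
- What is the difference between ROWS BETWEEN and RANGE BETWEEN in Presto window functions?
- Are these just synonyms for each other, or are there core conceptual differences?
- If they are just synonyms, why does ROWS BETWEEN allow more options than RANGE BETWEEN?
- Is there a query scenario where you can use exactly the same parameters with ROWS BETWEEN and RANGE BETWEEN and get different results?
- If using just unbounded/current row, is there a scenario where you would use RANGE instead of ROWS (or vice versa)?
- Since ROWS has more options, why isn't it mentioned at all in the documentation? o_O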
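Comments
The Presto documentation is fairly quiet even about RANGE, and doesn't mention ROWS at all. I haven't found many discussions or examples of window functions in Presto. I'm starting to dig through the Presto code base to try to figure this out. Hopefully someone can save me from that, and we can improve the documentation together.
The Presto code has a parser and test cases for the ROWS variant, but there's no mention of ROWS in the documentation.
The test cases I found with both ROWS and RANGE don't test for any differences between the two syntaxes.
They almost look like synonyms, but they do behave differently in my testing, and they have different allowed parameters and validation rules.
The following examples can be run with the starburstdata/presto Docker image running Presto 0.213-e-0.1. Typically I run Presto 0.172 through Amazon Athena, and I have almost always ended up using ROWS.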
RANGE
RANGE seems to be limited to UNBOUNDED and CURRENT ROW. The following returns an error:
range between 1 preceding and 1 following
use tpch.tiny;
select custkey, orderdate,
array_agg(orderdate) over (
partition by custkey
order by orderdate asc
range between 1 preceding and 1 following
) previous_orders
from orders where custkey in (419, 320) and orderdate < date('1996-01-01')
order by custkey, orderdate asc;
Error:
Window frame RANGE PRECEDING is only supported with UNBOUNDED
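The following range syntaxes do work fine (with different results, as expected). All of the following examples are based on the query above, with only the range changed: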
range between unbounded preceding and current row
custkey | orderdate | previous_orders
---------+------------+--------------------------------------------------------------------------
320 | 1992-07-10 | [1992-07-10]
320 | 1992-07-30 | [1992-07-10, 1992-07-30]
320 | 1994-07-08 | [1992-07-10, 1992-07-30, 1994-07-08]
320 | 1994-08-04 | [1992-07-10, 1992-07-30, 1994-07-08, 1994-08-04]
320 | 1994-09-18 | [1992-07-10, 1992-07-30, 1994-07-08, 1994-08-04, 1994-09-18]
320 | 1994-10-12 | [1992-07-10, 1992-07-30, 1994-07-08, 1994-08-04, 1994-09-18, 1994-10-12]
419 | 1992-03-16 | [1992-03-16]
419 | 1993-12-29 | [1992-03-16, 1993-12-29]
419 | 1995-01-30 | [1992-03-16, 1993-12-29, 1995-01-30]
range between current row and unbounded following
custkey | orderdate | previous_orders
---------+------------+--------------------------------------------------------------------------
320 | 1992-07-10 | [1992-07-10, 1992-07-30, 1994-07-08, 1994-08-04, 1994-09-18, 1994-10-12]
320 | 1992-07-30 | [1992-07-30, 1994-07-08, 1994-08-04, 1994-09-18, 1994-10-12]
320 | 1994-07-08 | [1994-07-08, 1994-08-04, 1994-09-18, 1994-10-12]
320 | 1994-08-04 | [1994-08-04, 1994-09-18, 1994-10-12]
320 | 1994-09-18 | [1994-09-18, 1994-10-12]
320 | 1994-10-12 | [1994-10-12]
419 | 1992-03-16 | [1992-03-16, 1993-12-29, 1995-01-30]
419 | 1993-12-29 | [1993-12-29, 1995-01-30]
419 | 1995-01-30 | [1995-01-30]
range between unbounded preceding and unbounded following
custkey | orderdate | previous_orders
---------+------------+--------------------------------------------------------------------------
320 | 1992-07-10 | [1992-07-10, 1992-07-30, 1994-07-08, 1994-08-04, 1994-09-18, 1994-10-12]
320 | 1992-07-30 | [1992-07-10, 1992-07-30, 1994-07-08, 1994-08-04, 1994-09-18, 1994-10-12]
320 | 1994-07-08 | [1992-07-10, 1992-07-30, 1994-07-08, 1994-08-04, 1994-09-18, 1994-10-12]
320 | 1994-08-04 | [1992-07-10, 1992-07-30, 1994-07-08, 1994-08-04, 1994-09-18, 1994-10-12]
320 | 1994-09-18 | [1992-07-10, 1992-07-30, 1994-07-08, 1994-08-04, 1994-09-18, 1994-10-12]
320 | 1994-10-12 | [1992-07-10, 1992-07-30, 1994-07-08, 1994-08-04, 1994-09-18, 1994-10-12]
419 | 1992-03-16 | [1992-03-16, 1993-12-29, 1995-01-30]
419 | 1993-12-29 | [1992-03-16, 1993-12-29, 1995-01-30]
419 | 1995-01-30 | [1992-03-16, 1993-12-29, 1995-01-30]
ROWS
The three working examples for RANGE above all work for ROWS and produce identical output.
rows between unbounded preceding and current row
rows between current row and unbounded following
rows between unbounded preceding and unbounded following
(output omitted - identical to the above)
However, ROWS allows far more control, since you can also use the syntax above that fails with range:
rows between 1 preceding and 1 following
custkey | orderdate | previous_orders
---------+------------+--------------------------------------
320 | 1992-07-10 | [1992-07-10, 1992-07-30]
320 | 1992-07-30 | [1992-07-10, 1992-07-30, 1994-07-08]
320 | 1994-07-08 | [1992-07-30, 1994-07-08, 1994-08-04]
320 | 1994-08-04 | [1994-07-08, 1994-08-04, 1994-09-18]
320 | 1994-09-18 | [1994-08-04, 1994-09-18, 1994-10-12]
320 | 1994-10-12 | [1994-09-18, 1994-10-12]
419 | 1992-03-16 | [1992-03-16, 1993-12-29]
419 | 1993-12-29 | [1992-03-16, 1993-12-29, 1995-01-30]
419 | 1995-01-30 | [1993-12-29, 1995-01-30]
rows between current row and 1 following
custkey | orderdate | previous_orders
---------+------------+--------------------------
320 | 1992-07-10 | [1992-07-10, 1992-07-30]
320 | 1992-07-30 | [1992-07-30, 1994-07-08]
320 | 1994-07-08 | [1994-07-08, 1994-08-04]
320 | 1994-08-04 | [1994-08-04, 1994-09-18]
320 | 1994-09-18 | [1994-09-18, 1994-10-12]
320 | 1994-10-12 | [1994-10-12]
419 | 1992-03-16 | [1992-03-16, 1993-12-29]
419 | 1993-12-29 | [1993-12-29, 1995-01-30]
419 | 1995-01-30 | [1995-01-30]
rows between 5 preceding and 2 preceding
custkey | orderdate | previous_orders
---------+------------+--------------------------------------------------
320 | 1992-07-10 | NULL
320 | 1992-07-30 | NULL
320 | 1994-07-08 | [1992-07-10]
320 | 1994-08-04 | [1992-07-10, 1992-07-30]
320 | 1994-09-18 | [1992-07-10, 1992-07-30, 1994-07-08]
320 | 1994-10-12 | [1992-07-10, 1992-07-30, 1994-07-08, 1994-08-04]
419 | 1992-03-16 | NULL
419 | 1993-12-29 | NULL
419 | 1995-01-30 | [1992-03-16]
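The ROWS tables above can be reproduced with a small positional model. The following Python sketch is an illustration only, not Presto's implementation; the helper name rows_frame and the signed-offset convention (negative for PRECEDING, positive for FOLLOWING) are assumptions made for this example. It returns None for an empty frame, which is how the NULL entries in the last table are modelled here.

```python
def rows_frame(values, start, end):
    """Return, for each row, the values inside a ROWS frame.

    start/end are signed row offsets relative to the current row
    (hypothetical convention: -1 = 1 PRECEDING, 0 = CURRENT ROW, 1 = 1 FOLLOWING).
    An empty frame yields None, mirroring the NULL in the tables above.
    """
    result = []
    for i in range(len(values)):
        lo = max(i + start, 0)                # clamp to the partition start
        hi = min(i + end, len(values) - 1)    # clamp to the partition end
        frame = values[lo:hi + 1] if lo <= hi else []
        result.append(frame or None)
    return result


# orderdate values for custkey 419 from the examples above
orders_419 = ["1992-03-16", "1993-12-29", "1995-01-30"]

one_each_side = rows_frame(orders_419, -1, 1)   # rows between 1 preceding and 1 following
five_to_two = rows_frame(orders_419, -5, -2)    # rows between 5 preceding and 2 preceding

for row in zip(orders_419, one_each_side, five_to_two):
    print(row)
```

The frame is chosen purely by position, so the first two rows of the 5 preceding / 2 preceding case have nothing to aggregate.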
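ROWS is literally the number of rows before and after the current row that you want to aggregate. So ORDER BY day ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING will end up with up to 3 rows: the current row, 1 row before, and 1 row after, regardless of the value of orderdate.
RANGE looks at the values of orderdate and decides which rows should be aggregated and which should not. So ORDER BY day RANGE BETWEEN 1 PRECEDING AND 1 FOLLOWING would theoretically take all rows with values of orderdate-1, orderdate, and orderdate+1, which can be more than 3 rows (see more explanations here).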
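To make the distinction concrete, here is a minimal Python sketch of the two frame modes. It is a simplified model of the semantics, not Presto code (and Presto 0.172 rejects the RANGE form with offsets, as shown in the question); the sample dates and the function names are made up for illustration.

```python
from datetime import date, timedelta

# Hypothetical, already sorted orderdate values in one partition.
# Note the duplicate date and the large gap before the last one.
days = [date(2020, 1, 1), date(2020, 1, 2), date(2020, 1, 2),
        date(2020, 1, 3), date(2020, 1, 10)]


def rows_1_preceding_1_following(values, i):
    """ROWS: pick neighbours by position, ignoring the values themselves."""
    return values[max(i - 1, 0):i + 2]


def range_1_preceding_1_following(values, i):
    """RANGE: pick every row whose value lies within 1 day of the current value."""
    one_day = timedelta(days=1)
    lo, hi = values[i] - one_day, values[i] + one_day
    return [v for v in values if lo <= v <= hi]


for i, d in enumerate(days):
    print(d, len(rows_1_preceding_1_following(days, i)),
          len(range_1_preceding_1_following(days, i)))
```

In this model the rows dated 2020-01-02 get a RANGE frame of four rows (more than the three positional neighbours), while the isolated last date gets a frame containing only itself.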
In Presto, ROWS is fully implemented, but RANGE is for some reason only partially implemented: you can only use it with CURRENT ROW and UNBOUNDED.
NOTE: Recent versions of Trino (formerly known as Presto SQL) have full support for RANGE and GROUPS framing. See this blog post for an explanation of how they work.
In Presto, the best way to see the difference between the two is to make sure there are identical values in the ORDER BY clause:
WITH
tt1 (custkey, orderdate, product) AS
( SELECT * FROM ( VALUES ('a','1992-07-10', 3), ('a','1993-08-10', 4), ('a','1994-07-13', 5), ('a','1995-09-13', 5), ('a','1995-09-13', 9), ('a','1997-01-13', 4),
('b','1992-07-10', 6), ('b','1992-07-10', 4), ('b','1994-07-13', 5), ('b','1994-07-13', 9), ('b','1998-11-11', 9) ) )
SELECT *,
array_agg(product) OVER (partition by custkey) c,
array_agg(product) OVER (partition by custkey order by orderdate) c_order,
array_agg(product) OVER (partition by custkey order by orderdate RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) range_ubub,
array_agg(product) OVER (partition by custkey order by orderdate ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) rows_ubub,
array_agg(product) OVER (partition by custkey order by orderdate RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) range_ubc,
array_agg(product) OVER (partition by custkey order by orderdate ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) rows_ubc,
array_agg(product) OVER (partition by custkey order by orderdate RANGE BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING) range_cub,
array_agg(product) OVER (partition by custkey order by orderdate ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING) rows_cub,
-- array_agg(product) OVER (partition by custkey order by orderdate RANGE BETWEEN 2 PRECEDING AND 2 FOLLOWING) range22,
-- SYNTAX_ERROR: line 19:65: Window frame RANGE PRECEDING is only supported with UNBOUNDED
array_agg(product) OVER (partition by custkey order by orderdate ROWS BETWEEN 2 PRECEDING AND 2 FOLLOWING) rows22
from tt1
order by custkey, orderdate, product
You can run this, look at the full results, and learn from them.
Here I'll show only some interesting columns:
custkey orderdate product range_ubc rows_ubc
a 10/07/1992 3 [3] [3]
a 10/08/1993 4 [3, 4] [3, 4]
a 13/07/1994 5 [3, 4, 5] [3, 4, 5]
a 13/09/1995 5 [3, 4, 5, 5, 9] [3, 4, 5, 5]
a 13/09/1995 9 [3, 4, 5, 5, 9] [3, 4, 5, 5, 9]
a 13/01/1997 4 [3, 4, 5, 5, 9, 4] [3, 4, 5, 5, 9, 4]
b 10/07/1992 4 [6, 4] [6, 4]
b 10/07/1992 6 [6, 4] [6]
b 13/07/1994 5 [6, 4, 5, 9] [6, 4, 5]
b 13/07/1994 9 [6, 4, 5, 9] [6, 4, 5, 9]
b 11/11/1998 9 [6, 4, 5, 9, 9] [6, 4, 5, 9, 9]
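If you look at the 5th line, orderdate:13/09/1995, product:5 (note that 13/09/1995 appears twice for custkey:a), you can see that ROWS indeed took all rows from the top down to the current row. But if you look at RANGE, you can see that it also includes the value from the row after, because that row has exactly the same orderdate and is therefore considered part of the same window.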
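The peer behaviour described above can be modelled in a few lines. The Python sketch below only illustrates the frame semantics, not how Presto evaluates them; the function names are hypothetical, and the rows are the custkey 'a' values from the query, listed in the order they appear in the VALUES clause (assumed here to be the order in which ties are processed).

```python
# (orderdate, product) rows for custkey 'a', sorted by orderdate
rows_a = [("1992-07-10", 3), ("1993-08-10", 4), ("1994-07-13", 5),
          ("1995-09-13", 5), ("1995-09-13", 9), ("1997-01-13", 4)]


def rows_unbounded_to_current(rows, i):
    """ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW: stop at position i."""
    return [product for _, product in rows[:i + 1]]


def range_unbounded_to_current(rows, i):
    """RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW: also include peers,
    i.e. every row whose ORDER BY value equals the current row's value."""
    current_key = rows[i][0]
    # ISO-formatted date strings compare correctly as plain strings
    return [product for key, product in rows if key <= current_key]


for i, (orderdate, product) in enumerate(rows_a):
    print(orderdate, product,
          range_unbounded_to_current(rows_a, i),
          rows_unbounded_to_current(rows_a, i))
```

The two functions disagree only on the first of the two 1995-09-13 rows, which is where the range_ubc and rows_ubc columns differ for custkey a.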