Unable to reset partition groups (Window functions and PostgreSQL specifically)
I have a simple dataset like this:
SELECT UNNEST(ARRAY['A', 'A', 'A', 'B', 'B', 'A', 'C', 'B']) AS customer_name, generate_series(8, 1, -1) AS order_time;
+---------------+------------+
| customer_name | order_time |
+---------------+------------+
| "A"           | 8          |
| "A"           | 7          |
| "A"           | 6          |
| "B"           | 5          |
| "B"           | 4          |
| "A"           | 3          |
| "C"           | 2          |
| "B"           | 1          |
+---------------+------------+
I want to get a single row:
+---------------+------------+
| customer_name | order_time |
+---------------+------------+
| "A"           | 6          |
+---------------+------------+
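That is, I want the first order_time of the most recent run of consecutive rows with the same customer_name. With the following SQL I only get 3 as the order_time for customer A. I can't seem to "reset" the partition.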
SELECT customer_name, LAST_VALUE(order_time) OVER W
FROM
(
SELECT UNNEST(ARRAY['A', 'A', 'A', 'B', 'B', 'A', 'C', 'B']) AS customer_name, generate_series(8, 1, -1) AS order_time
) X
WINDOW W AS (PARTITION BY customer_name ORDER BY order_time DESC ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING)
ORDER BY order_time DESC
LIMIT 1;
I'm using PostgreSQL 11.5.
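The value 3 appears because PARTITION BY customer_name puts every row for a customer into one partition, whether or not those rows are adjacent in time. A small diagnostic sketch, reusing the sample data from the question, makes this visible. The aliases rows_in_partition and last_in_partition are made up for illustration only.

```sql
-- Diagnostic only: show how the original window groups the rows.
SELECT customer_name,
       order_time,
       -- number of rows that share this row's partition
       COUNT(*) OVER (PARTITION BY customer_name) AS rows_in_partition,
       -- same expression as in the question's query
       LAST_VALUE(order_time) OVER W AS last_in_partition
FROM
(
    SELECT UNNEST(ARRAY['A', 'A', 'A', 'B', 'B', 'A', 'C', 'B']) AS customer_name,
           generate_series(8, 1, -1) AS order_time
) X
WINDOW W AS (PARTITION BY customer_name ORDER BY order_time DESC
             ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING)
ORDER BY order_time DESC;
```

All four A rows land in the same partition, so the row at order_time 8 also reports 3. The window has no notion of the B rows that interrupt the run of A rows.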
You can use the difference between
ROW_NUMBER() OVER (ORDER BY order_time DESC)
and
ROW_NUMBER() OVER (PARTITION BY customer_name ORDER BY order_time DESC)
to build a grouping key for this gaps-and-islands problem:
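SELECT XX.customer_name, LAST_VALUE(XX.order_time) OVER W AS order_time FROM
(
    -- rn stays constant within each run of consecutive rows for one customer
    SELECT X.*, ROW_NUMBER() OVER (ORDER BY order_time DESC)-
                ROW_NUMBER() OVER (PARTITION BY customer_name ORDER BY order_time DESC)
                AS rn
    FROM
    (
        SELECT UNNEST(ARRAY['A', 'A', 'A', 'B', 'B', 'A', 'C', 'B']) AS customer_name,
               generate_series(8, 1, -1) AS order_time
    ) X
) XX
-- rn alone can repeat across different customers, so partition by both columns
WINDOW W AS (PARTITION BY customer_name, rn ORDER BY order_time DESC
             ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING)
-- without an ORDER BY, LIMIT 1 is free to return any row
ORDER BY XX.order_time DESC
LIMIT 1;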
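To see why the difference of the two row numbers works as a grouping key, it can help to look at the intermediate values. The following is a minimal sketch on the same sample data. The aliases rn_all and rn_customer are illustrative names, not part of the query above.

```sql
-- Show both row numbers and their difference for every row.
SELECT customer_name,
       order_time,
       ROW_NUMBER() OVER (ORDER BY order_time DESC) AS rn_all,
       ROW_NUMBER() OVER (PARTITION BY customer_name ORDER BY order_time DESC) AS rn_customer,
       ROW_NUMBER() OVER (ORDER BY order_time DESC)
         - ROW_NUMBER() OVER (PARTITION BY customer_name ORDER BY order_time DESC) AS rn
FROM
(
    SELECT UNNEST(ARRAY['A', 'A', 'A', 'B', 'B', 'A', 'C', 'B']) AS customer_name,
           generate_series(8, 1, -1) AS order_time
) X
ORDER BY order_time DESC;

-- customer_name | order_time | rn_all | rn_customer | rn
-- A             | 8          | 1      | 1           | 0
-- A             | 7          | 2      | 2           | 0
-- A             | 6          | 3      | 3           | 0
-- B             | 5          | 4      | 1           | 3
-- B             | 4          | 5      | 2           | 3
-- A             | 3          | 6      | 4           | 2
-- C             | 2          | 7      | 1           | 6
-- B             | 1          | 8      | 3           | 5
```

Within a run of consecutive rows for one customer both counters advance together, so rn stays the same. As soon as another customer interrupts the run, rn_all moves ahead of rn_customer and rn changes. In this sample every island happens to get a distinct rn, but in general the difference is only guaranteed to be unique per customer (for the sequence A, B, A, B the middle two rows both get 1), so customer_name and rn together identify an island.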