HAWQ PostgreSQL - Increment row based on previous row
I need to create table2 from table1. This is the table I'm trying to update:
TABLE1:
ID Rank Event
123456 1 178
123456 2
123456 3
123456 4 155
123456 5
123456 6 192
123456 7
356589 1 165
356589 2
356589 3
356589 4 166
565984 1 1025
565984 2
987456 1 85
987456 2
987456 3
987456 4 22
987456 5
987456 6
I'm trying to fill the 'Event' column based on the previous value (like Ctrl+D in Excel):
TABLE2:
ID Rank Event
123456 1 178
123456 2 178
123456 3 178
123456 4 155
123456 5 155
123456 6 192
123456 7 192
356589 1 165
356589 2 165
356589 3 165
356589 4 166
565984 1 1025
565984 2 1025
987456 1 85
987456 2 85
987456 3 85
987456 4 22
987456 5 22
987456 6 22
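The problem is that the events do not follow a sequence, and the number of rows per ID (the count of ID, Rank) is not constant either.
I can't use a variable-based function because the table has millions of records, and I can't use 'update' because it's HAWQ.
Any suggestions?
Much appreciated!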
You can try using max with a window function:
CREATE TABLE T(ID int, Rank int, Event varchar(50));
INSERT INTO T VALUES (123456, 1,'178');
INSERT INTO T VALUES (123456, 2,'');
INSERT INTO T VALUES (123456, 3,'');
INSERT INTO T VALUES (123456, 4,'');
INSERT INTO T VALUES (123456, 5,'');
INSERT INTO T VALUES (123456, 6,'');
INSERT INTO T VALUES (123456, 7,'');
INSERT INTO T VALUES (356589, 1,'165');
INSERT INTO T VALUES (356589, 2,'');
INSERT INTO T VALUES (356589, 3,'');
INSERT INTO T VALUES (356589, 4,'');
INSERT INTO T VALUES (565984, 1,'1025');
INSERT INTO T VALUES (565984, 2,'');
INSERT INTO T VALUES (987456, 1,'85');
INSERT INTO T VALUES (987456, 2,'');
INSERT INTO T VALUES (987456, 3,'');
INSERT INTO T VALUES (987456, 4,'');
INSERT INTO T VALUES (987456, 5,'');
INSERT INTO T VALUES (987456, 6,'');
Query 1:
SELECT t.id,t.rank,max(Event) over (partition by ID order by Rank)
FROM T
| id | rank | max |
|--------|------|------|
| 123456 | 1 | 178 |
| 123456 | 2 | 178 |
| 123456 | 3 | 178 |
| 123456 | 4 | 178 |
| 123456 | 5 | 178 |
| 123456 | 6 | 178 |
| 123456 | 7 | 178 |
| 356589 | 1 | 165 |
| 356589 | 2 | 165 |
| 356589 | 3 | 165 |
| 356589 | 4 | 165 |
| 565984 | 1 | 1025 |
| 565984 | 2 | 1025 |
| 987456 | 1 | 85 |
| 987456 | 2 | 85 |
| 987456 | 3 | 85 |
| 987456 | 4 | 85 |
| 987456 | 5 | 85 |
| 987456 | 6 | 85 |
You can use FIRST_VALUE:
SELECT ID, RANK,
FIRST_VALUE(Event) OVER(PARTITION BY ID ORDER BY Rank) AS Event
FROM tab;
Edit:
Apologies! Each ID has multiple Event codes.
You can handle that with an additional grouping:
WITH cte AS (
SELECT ID, RANK, EVENT,
SUM(CASE WHEN event IS NULL THEN 0 ELSE 1 END)
OVER(PARTITION BY ID ORDER BY RANK) AS grp
FROM t
)
SELECT ID, RANK,
FIRST_VALUE(Event) OVER(PARTITION BY ID, grp ORDER BY Rank) AS Event
FROM cte;
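Since the question rules out UPDATE on HAWQ, one way to apply the grouping idea is to write the filled rows into a new table. The following is only a sketch under assumptions: `table1` and `table2` are the names from the question, `Event` is assumed to be a text column in which a blank may be stored as '' rather than NULL (as in the sample `T` above), and `cleaned` and `grouped` are made-up aliases. If `Event` is numeric and the blanks are real NULLs, the NULLIF step can be dropped.

```sql
-- Sketch only: table1/table2 come from the question; cleaned and grouped are illustrative aliases.
CREATE TABLE table2 AS
SELECT id,
       rank,
       -- each (id, grp) bucket holds at most one non-NULL event, so MAX picks it
       MAX(event) OVER (PARTITION BY id, grp) AS event
FROM (
    SELECT id, rank, event,
           -- running count of non-NULL events: increases by 1 at every filled row
           COUNT(event) OVER (PARTITION BY id ORDER BY rank) AS grp
    FROM (
        -- treat empty strings as missing so they do not start a new bucket
        SELECT id, rank, NULLIF(event, '') AS event
        FROM table1
    ) cleaned
) grouped;
```

Rows that come before the first filled event of an ID land in bucket 0 and stay NULL, because there is no earlier value to copy down.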
You can also do this without a subquery, using array operations:
select v.*,
(array_remove(array_agg(event) over (partition by id order by rank), NULL))[count(event) over (partition by id order by rank)]
from (values (1, 'a', 15),
(2, 'a', null),
(3, 'a', null),
(4, 'a', 20),
(5, 'a', null),
(1, 'b', 4),
(2, 'b', null)
) v(rank, id, event)
order by id, rank
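To see why the array index picks the right value, it can help to look at the two window expressions separately. This is a sketch on made-up data; `seen_so_far`, `non_null_count` and the window name `w` are illustrative. Note also that `array_remove` was added in PostgreSQL 9.3, so it is worth checking whether your HAWQ version provides it before relying on the array approach.

```sql
-- Illustrative data only; seen_so_far, non_null_count and w are made-up names.
SELECT v.id, v.rank, v.event,
       array_agg(v.event) OVER w AS seen_so_far,    -- every event up to this row, NULLs included
       count(v.event)     OVER w AS non_null_count  -- how many of those are non-NULL
FROM (VALUES (1, 'a', 15),
             (2, 'a', NULL),
             (3, 'a', NULL),
             (4, 'a', 20),
             (5, 'a', NULL),
             (1, 'b', 4),
             (2, 'b', NULL)
     ) v(rank, id, event)
WINDOW w AS (PARTITION BY v.id ORDER BY v.rank)
ORDER BY v.id, v.rank;
```

For id 'a' at rank 4, the running array is {15,NULL,NULL,20} and the non-NULL count is 2. Once the NULLs are stripped, the array is {15,20}, and element 2 is 20, the most recent filled event.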