How to limit items inside a group in PostgreSQL
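I need to select data grouped by external_id and resolution and ordered by timestamp, but limited to the first two ids in each group. I don't know how to do that last part.
I tried something with a simple query: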
SELECT external_id, resolution, string_agg(id::text, ',') FROM some_table GROUP BY external_id, resolution ORDER BY timestamp LIMIT 2
But that is not enough: the LIMIT applies to the whole query, not to each group.
Source
id | external_id | resolution | timestamp |
---|---|---|---|
1 | 1 | 1D | 1645941482 |
2 | 1 | 1D | 1645941481 |
3 | 1 | 1D | 1645941484 |
4 | 2 | 1D | 1645941483 |
5 | 2 | 1D | 1645941463 |
6 | 3 | 1D | 1645941183 |
7 | 3 | 1D | 1645941483 |
8 | 3 | 1D | 1646941483 |
8 | 3 | 1D | 1645741488 |
10 | 3 | 1D | 1645941490 |
11 | 1 | 3D | 1645941494 |
12 | 1 | 3D | 1645941491 |
13 | 2 | 3D | 1645941496 |
14 | 2 | 3D | 1645941490 |
15 | 2 | 3D | 1645941493 |
16 | 2 | 3D | 1645941491 |
17 | 3 | 3D | 1645941492 |
Expected result
external_id | resolution | ids |
---|---|---|
1 | 1D | 1,2 |
1 | 3D | 11,12 |
2 | 1D | 4,5 |
2 | 3D | 13,14 |
3 | 1D | 6,7 |
3 | 3D | 17 |
You can use row_number() in a CTE to select the first N rows of each group, and then apply string_agg() to that result set.
Schema and insert statements:
create table source(id int, external_id int, resolution varchar(10), timestamp2 timestamp);
insert into source values(1, 1 ,'1D', to_timestamp(1645941482));
insert into source values(2, 1 ,'1D', to_timestamp(1645941481));
insert into source values(3, 1 ,'1D', to_timestamp(1645941484));
insert into source values(4, 2 ,'1D', to_timestamp(1645941483));
insert into source values(5, 2 ,'1D', to_timestamp(1645941463));
insert into source values(6, 3 ,'1D', to_timestamp(1645941183));
insert into source values(7, 3 ,'1D', to_timestamp(1645941483));
insert into source values(8, 3 ,'1D', to_timestamp(1646941483));
insert into source values(8, 3 ,'1D', to_timestamp(1645741488));
insert into source values(10, 3 ,'1D', to_timestamp(1645941490));
insert into source values(11, 1 ,'3D', to_timestamp(1645941494));
insert into source values(12, 1 ,'3D', to_timestamp(1645941491));
insert into source values(13, 2 ,'3D', to_timestamp(1645941496));
insert into source values(14, 2 ,'3D', to_timestamp(1645941490));
insert into source values(15, 2 ,'3D', to_timestamp(1645941493));
insert into source values(16, 2 ,'3D', to_timestamp(1645941491));
insert into source values(17, 3 ,'3D', to_timestamp(1645941492));
Query:
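-- Number the rows within each (external_id, resolution) group by timestamp
with cte as
(
    select id, external_id, resolution,
           row_number() over (partition by external_id, resolution order by timestamp2) as rn
    from source
)
-- Keep the first two rows of each group and concatenate their ids
select external_id, resolution, string_agg(id::text, ',') as ids
from cte
where rn <= 2
group by external_id, resolution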
Output:
external_id | resolution | ids |
---|---|---|
1 | 1D | 2,1 |
1 | 3D | 12,11 |
2 | 1D | 5,4 |
2 | 3D | 14,16 |
3 | 1D | 8,6 |
3 | 3D | 17 |
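One caveat about the query above: string_agg() without its own ORDER BY does not guarantee the order in which the ids are concatenated, and without a final ORDER BY the groups can come back in any order. A minimal sketch of a variant that pins both down, assuming the same source table and timestamp2 column as in the schema above. The id tie-breaker for equal timestamps and the final ORDER BY are my own additions for illustration, not part of the original answer:

```sql
-- Assumes the `source` table created by the schema statements above.
WITH cte AS (
    SELECT id, external_id, resolution,
           row_number() OVER (
               PARTITION BY external_id, resolution
               ORDER BY timestamp2, id      -- id as tie-breaker (an assumption)
           ) AS rn
    FROM source
)
SELECT external_id, resolution,
       -- ORDER BY inside the aggregate fixes the order of the concatenated ids
       string_agg(id::text, ',' ORDER BY rn) AS ids
FROM cte
WHERE rn <= 2                               -- change 2 to N for the first N rows per group
GROUP BY external_id, resolution
ORDER BY external_id, resolution;           -- deterministic order of the groups
```

With the sample data this should return the same rows as the output table above. If you want the latest rows per group instead of the earliest, ordering by timestamp2 DESC in the window should do it.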