Splitting SQL result into groups with max size = n
I have a table:
 id | volume_id | ... |
----+-----------+-----+
  1 |         1 | ... |
  2 |         2 | ... |
  3 |         1 | ... |
  4 |         3 | ... |
  5 |         2 | ... |
...
I can run a simple grouping query:
select volume_id, count(*), min(id) as min_id, max(id) as max_id
from my_table
group by volume_id;
This produces the result:
volume_id | count | min_id | max_id
-----------+-------+--------+--------
1 | 67330 | ... | ...
2 | 67330 | ... | ...
3 | 67330 | ... | ...
4 | 67330 | ... | ...
But I want to split the result into groups of at most 40K rows. So the result should look like this:
volume_id | count | min_id | max_id
-----------+-------+--------+--------
1 | 40000 | ... | ... <- first group of IDs for volume 1
1 | 27330 | ... | ... <- second group of IDs for volume 1
2 | 40000 | ... | ...
2 | 27330 | ... | ...
3 | 40000 | ... | ...
4 | 27330 | ... | ...
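The IDs should be split so that the max_id of the first group is less than the min_id of the second group, and so on.
If anyone knows how to write such a query (or a plsql function, if there is no other way), I would be very grateful.
I am using PostgreSQL 9.5.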
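You can use rank() (or row_number() if there are no duplicates) to enumerate the rows within each volume_id. Then it is simple arithmetic in the group by, which puts every 40000 consecutive rows into one bucket: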
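select volume_id, count(*), min(id) as min_id, max(id) as max_id
from (select t.*,
             -- number the rows of each volume in id order: 1, 2, 3, ...
             rank() over (partition by volume_id order by id) as seqnum
      from my_table t
     ) t
-- rows 1..40000 fall into bucket 0, rows 40001..80000 into bucket 1, and so on
group by volume_id, floor((seqnum - 1) / 40000)
order by volume_id, min(id);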
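To see the bucketing arithmetic at work on a small data set, here is a minimal sketch that runs the same query shape from Python. It rests on several assumptions that are not part of the question: an in-memory SQLite database stands in for PostgreSQL (SQLite 3.25 or later is needed for window functions), the group size is 3 instead of 40000, the sample rows are invented, and plain integer division replaces floor() because integer division already truncates.

```python
import sqlite3

CHUNK = 3  # hypothetical small group size standing in for 40000

conn = sqlite3.connect(":memory:")  # SQLite used only as a stand-in for PostgreSQL
conn.execute("create table my_table (id integer primary key, volume_id integer)")

# Invented sample data: ids 1..14, even ids go to volume 1, odd ids to volume 2.
conn.executemany(
    "insert into my_table (id, volume_id) values (?, ?)",
    [(i, 1 + (i % 2)) for i in range(1, 15)],
)

query = """
select volume_id, count(*) as cnt, min(id) as min_id, max(id) as max_id
from (select t.*,
             rank() over (partition by volume_id order by id) as seqnum
      from my_table t
     ) t
group by volume_id, (seqnum - 1) / ?   -- integer division gives the bucket number
order by volume_id, min_id
"""
rows = conn.execute(query, (CHUNK,)).fetchall()

for row in rows:
    print(row)

# Within one volume, each chunk must end before the next one starts.
for prev, cur in zip(rows, rows[1:]):
    if prev[0] == cur[0]:
        assert prev[3] < cur[2]
```

With this sample data each volume has 7 rows, so each is split into chunks of 3, 3 and 1 rows, and the id ranges of consecutive chunks do not overlap. Note that with duplicate sort keys rank() gives tied rows the same number, so a bucket could end up with more than the chosen size; row_number() avoids that at the cost of possibly splitting ties across buckets.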