TSQL: Query to group files by packet of a maximum size

Question

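I'm stuck on a problem that may not look very hard at first glance, but it is not as easy as it should be. I have a list of files with their sizes in megabytes. The list comes from a database table.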

Example:

SELECT file_name, file_size
FROM myfiles

Result:

file_name | file_size    
----------------------
file 1    | 14
file 2    | 5
file 3    | 20
file 4    | 6

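I want to write a SQL query that groups the files into packets to be sent to an API. Files should be grouped into packets of at most 20 MB. The best solution would be to return a "send index" in a third column, like this: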

file_name | file_size | packet_index   
-------------------------------------
file 1    | 14        | 1 
file 2    | 5         | 1
file 3    | 20        | 2
file 4    | 6         | 3

The first send will contain files 1 and 2, the second will contain file 3, and the last will contain file 4. How can I determine this information in SQL?

Answer 1

Unfortunately, this type of problem requires a recursive CTE. You are assigning groups sequentially, so:

with tt as (
      -- Number the rows so the recursion can walk them in a fixed order
      select t.*, row_number() over (order by file_name) as seqnum
      from myfiles t
     ),
     cte as (
      -- Anchor: the first file opens packet 1
      select file_name, file_size, file_size as total, seqnum, 1 as grp
      from tt
      where tt.seqnum = 1
      union all
      -- Recursive step: add the next file to the current packet,
      -- or start a new packet if the running total would exceed 20
      select tt.file_name, tt.file_size,
             (case when tt.file_size + cte.total > 20
                   then tt.file_size
                   else tt.file_size + cte.total
              end),
             tt.seqnum,
             (case when tt.file_size + cte.total > 20
                   then cte.grp + 1
                   else cte.grp
              end)
      from cte join
           tt
           on tt.seqnum = cte.seqnum + 1
    )
select *
from cte;

Here is a db<>fiddle.

If you have more than 100 rows, add OPTION (MAXRECURSION 0) to the query, because SQL Server limits recursion to 100 levels by default.
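
To make the sequential assignment easier to follow, the same logic can be sketched outside SQL. The Python below is only an illustration, not part of the answer's query: the function name assign_packets and the max_size parameter are made up for this example. It keeps a running total and a packet counter the way the total and grp columns do in the recursive CTE, and it assumes the rows arrive already sorted in the intended order.

```python
from typing import Iterable, List, Tuple


def assign_packets(
    files: Iterable[Tuple[str, int]], max_size: int = 20
) -> List[Tuple[str, int, int]]:
    """Return (file_name, file_size, packet_index) rows in input order.

    Hypothetical helper mirroring the recursive CTE: `total` plays the role
    of cte.total and `packet` the role of cte.grp.
    """
    rows = []
    total = 0   # running size of the current packet
    packet = 1  # index of the current packet
    for i, (name, size) in enumerate(files):
        # The first row corresponds to the anchor member of the CTE;
        # every later row corresponds to one recursive step.
        if i > 0 and total + size > max_size:
            packet += 1   # the file does not fit: open a new packet
            total = size
        else:
            total += size  # the file fits: extend the current packet
        rows.append((name, size, packet))
    return rows


# Sample data from the question
files = [("file 1", 14), ("file 2", 5), ("file 3", 20), ("file 4", 6)]
for row in assign_packets(files):
    print(row)
```

For the sample data this yields packet indexes 1, 1, 2, 3, matching the expected output in the question. Note that, as in the query, a single file larger than the limit still gets a packet of its own rather than being rejected.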
