Aggregate all combinations of rows taken k at a time

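I am trying to compute an aggregate function over a field for subsets of the rows in a table. The problem is that I want the mean of every combination of rows taken k at a time; that is, across all the rows, I want the mean of every combination of (say) 10 rows. So: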

 id | count
----|------
  1 |  5
  2 |  3
  3 |  6
...
 30 | 16

should give me

the mean of ids 1..10; ids 1, 3..11; ids 1, 4..12; and so on. I know this will produce a lot of rows.

There are SO answers for finding combinations from arrays. I could do this programmatically by taking the 30 ids 10 at a time and then SELECTing them. Is there a way to do this with PARTITION BY, TABLESAMPLE, or another function (something like Python's itertools.combinations())? (TABLESAMPLE by itself does not guarantee which subset of rows I am selecting.)
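
For reference, the programmatic approach mentioned above could look roughly like the following sketch. It assumes the rows have already been fetched as (id, count) pairs; the first three pairs come from the table above, the remaining two are made up for illustration, and `combination_means` is a hypothetical helper name.

```python
from itertools import combinations

# Assumed input: (id, count) pairs already fetched from the table.
# Only the first three pairs appear in the question; the rest are illustrative.
rows = [(1, 5), (2, 3), (3, 6), (4, 9), (5, 2)]


def combination_means(rows, k):
    """Yield (ids, mean of counts) for every combination of k rows."""
    for combo in combinations(sorted(rows), k):
        ids = tuple(row_id for row_id, _ in combo)
        total = sum(count for _, count in combo)
        yield ids, total / k


for ids, avg in combination_means(rows, 2):
    print(ids, avg)
```

This keeps only one combination in memory at a time, but every row still has to leave the database first, which is what an in-database solution would avoid.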

The method described in the cited answer is static. A more convenient solution may be to use recursion.

Example data:

drop table if exists my_table;
create table my_table(id int primary key, number int);
insert into my_table values
(1, 5), 
(2, 3), 
(3, 6), 
(4, 9), 
(5, 2);

The query that finds the 2-element subsets of a 5-element set (k-combinations with k = 2):

with recursive recur as (
    select 
        id, 
        array[id] as combination, 
        array[number] as numbers, 
        number as sum
    from my_table
union all
    select 
        t.id, 
        combination || t.id, 
        numbers || t.number, 
        sum + t.number
    from my_table t
    join recur r on r.id < t.id 
    and cardinality(combination) < 2            -- param k
)
select combination, numbers, sum/2.0 as average -- param k
from recur
where cardinality(combination) = 2;             -- param k

 combination | numbers |      average       
-------------+---------+--------------------
 {1,2}       | {5,3}   | 4.0000000000000000
 {1,3}       | {5,6}   | 5.5000000000000000
 {1,4}       | {5,9}   | 7.0000000000000000
 {1,5}       | {5,2}   | 3.5000000000000000
 {2,3}       | {3,6}   | 4.5000000000000000
 {2,4}       | {3,9}   | 6.0000000000000000
 {2,5}       | {3,2}   | 2.5000000000000000
 {3,4}       | {6,9}   | 7.5000000000000000
 {3,5}       | {6,2}   | 4.0000000000000000
 {4,5}       | {9,2}   | 5.5000000000000000
(10 rows)   

The same query with k = 3 gives:

 combination | numbers |      average       
-------------+---------+--------------------
 {1,2,3}     | {5,3,6} | 4.6666666666666667
 {1,2,4}     | {5,3,9} | 5.6666666666666667
 {1,2,5}     | {5,3,2} | 3.3333333333333333
 {1,3,4}     | {5,6,9} | 6.6666666666666667
 {1,3,5}     | {5,6,2} | 4.3333333333333333
 {1,4,5}     | {5,9,2} | 5.3333333333333333
 {2,3,4}     | {3,6,9} | 6.0000000000000000
 {2,3,5}     | {3,6,2} | 3.6666666666666667
 {2,4,5}     | {3,9,2} | 4.6666666666666667
 {3,4,5}     | {6,9,2} | 5.6666666666666667
(10 rows)
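
To make the growth step of the query easier to follow, here is a rough Python sketch of the same idea: start from single rows, then repeatedly extend each partial combination with rows that have a larger id. The name `k_combinations` and the in-memory `my_table` list are assumptions for illustration. Unlike the CTE, which accumulates every level and filters at the end, this sketch keeps only the current level.

```python
# Same sample data as the my_table rows above, as (id, number) pairs.
my_table = [(1, 5), (2, 3), (3, 6), (4, 9), (5, 2)]


def k_combinations(rows, k):
    """Return (ids, average) for every k-row combination, in id order."""
    rows = sorted(rows)
    # Anchor step: every row on its own, as (last_id, ids, running_sum).
    level = [(row_id, (row_id,), number) for row_id, number in rows]
    # Growth step, applied k - 1 times: append only rows with a larger id,
    # so each combination is built exactly once, in ascending id order.
    for _ in range(k - 1):
        level = [
            (row_id, ids + (row_id,), total + number)
            for last_id, ids, total in level
            for row_id, number in rows
            if last_id < row_id
        ]
    return [(ids, total / k) for _, ids, total in level]


for ids, average in k_combinations(my_table, 3):
    print(ids, round(average, 4))
```

The `last_id < row_id` condition plays the role of `r.id < t.id` in the join: it is what prevents both duplicates and permutations of the same set.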

Of course, you can remove numbers from the query if you do not need them.
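
Dropping numbers makes each row narrower, but it does not change how many rows the recursion builds: the CTE produces every combination of size 1 through k before the final filter keeps those of size k. A quick way to estimate that before running the query on a larger table is sketched below; `rows_materialized` is a hypothetical helper, not part of the query.

```python
from math import comb


def rows_materialized(n, k):
    """Rows built by the recursion: all combinations of size 1 through k."""
    return sum(comb(n, size) for size in range(1, k + 1))


print(rows_materialized(5, 2))    # 5 + 10 = 15 for the sample data
print(comb(30, 10))               # final result rows for 30 ids, k = 10: 30045015
print(rows_materialized(30, 10))  # rows built along the way for the same case
```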