Grouping array elements in the correct order on PostgreSQL

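Is it possible to group consecutive array elements in PostgreSQL?

For example, I have two related arrays like these (I say related because the first array holds actions and the second array holds the times at which those actions occurred):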

col0 = 'any_value'
col1 = array1['a','b','b','c','c','a','a','a','c']
col2 = array2[1,2,3,4,5,6,7,8,9]

I would like to output the following result:

col0 = 'any_value'
array_result1['a','b','c','a','c']    
array_result2[1,2,4,6,9]

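One way to unnest the array is WITH ORDINALITY. Here is an example query, but it returns a DISTINCT selection of the array elements, so it removes every repeated element rather than only the consecutive repeats: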

select col0, 
       array_agg(x order by rn) as unique_array1
        from (
              select 
              distinct on (col0, a.x) col0, 
                           a.x, 
                           a.rn
              from table_a, 
                   unnest(array1) with ordinality as a (x,rn)
              order by 1,2,3  
             ) unnested_ordered
group by col0;

So the result is:

col0 = 'any_value'
array_result1['a','b','c']    

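But as you can see, it is missing many elements.

Edit:

To describe my problem in more detail: in the end I want to know when each action in array_result1 first occurred. So for the example result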

array_result1['a','b','c','a','c']    
*array_result2[1,2,4,6,9]

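*I am assuming array positions start at 1 rather than 0. I also fixed the last element, which should be 9, not 7.

this would tell me when the first action 'a' occurred and when the second action 'a' occurred, so I can work out how long it takes for 'a' to return to the path I am building. Here the first occurrence of action 'a' is at 1 and the second is at 6.

So action 'a' appears twice in the path (array) and takes 5 time units to reappear. That is why I need the second array, which holds the time at which each action occurred (the first time within each run of that action).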

You could use LATERAL and calculate the groups with ROW_NUMBER:

DROP TABLE IF EXISTS table_a;
CREATE TABLE table_a(col0 VARCHAR(10), col1 text[],col2 int[]);

INSERT INTO table_a(col0, col1, col2)
VALUES ('any_value',array['a','b','b','c','c','a','a','a','c'],
        array[1,2,3,4,5,6,7,8,9]);

Main query:

SELECT col0,
       col1,
       unique_col1
FROM table_a,
LATERAL (SELECT ARRAY_AGG(x ORDER BY grp) AS unique_col1
         FROM ( SELECT DISTINCT x,
                 rn - ROW_NUMBER() OVER(PARTITION BY x ORDER BY rn) AS grp
               FROM unnest(col1) WITH ORDINALITY AS a(x,rn)
         ) AS sub      
) AS lat1;

Output (for the sample row): unique_col1 = {a,b,c,a,c}
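
To see why the subtraction works, it can help to look at the intermediate rows. The following is only an illustrative sketch: it uses the sample array as a literal instead of table_a, and the column alias pos_in_value is a made-up name that does not appear in the queries above.

```sql
-- Show the ordinality (rn), the per-value row number, and their difference (grp)
-- for every element of the sample array.
-- pos_in_value is a hypothetical alias used only for this illustration.
SELECT x,
       rn,
       ROW_NUMBER() OVER (PARTITION BY x ORDER BY rn)      AS pos_in_value,
       rn - ROW_NUMBER() OVER (PARTITION BY x ORDER BY rn) AS grp
FROM unnest(ARRAY['a','b','b','c','c','a','a','a','c'])
     WITH ORDINALITY AS a(x, rn)
ORDER BY rn;
```

Within one value of x, rn and the row number both grow by 1 along a consecutive run, so their difference stays constant for that run and jumps when the run is interrupted. For the sample, 'a' gets grp 0 at position 1 and grp 4 at positions 6 to 8. Note that grp is only meaningful together with x, because two different values can end up with the same grp.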

Edit:

Calculating the second array:

SELECT col0,
       col1,
       unique_col1,
       col2,
       unique_col2
FROM table_a,
LATERAL (SELECT ARRAY_AGG(x ORDER BY grp) AS unique_col1
         FROM ( SELECT DISTINCT x,
                 rn - ROW_NUMBER() OVER(PARTITION BY x ORDER BY rn) AS grp
               FROM unnest(col1) WITH ORDINALITY AS a(x,rn)
         ) AS sub      
) AS lat1,
LATERAL (
   SELECT array_agg(x ORDER BY rn) AS unique_col2
   FROM unnest(col2) WITH ORDINALITY AS b(x,rn)
   WHERE rn IN (
         SELECT SUM(c) OVER(ORDER BY grp) - (c-1) AS result
         FROM (SELECT grp,  COUNT(*) AS c
               FROM ( SELECT x,
                             rn - ROW_NUMBER() OVER(PARTITION BY x ORDER BY rn)  AS grp
                      FROM unnest(col1) WITH ORDINALITY AS a(x,rn)
               ) AS sub     
          GROUP BY grp) AS s
    )      
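) AS lat2;

Notes:

It builds the second array from the values of col2, not from their positions, so when you have:

col2 = array[9,8,7,6,5,4,3,2,1]

you will get:

[9,8,6,4,1]

If you only want the positions, you can use: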

...
LATERAL (
   SELECT array_agg(result ORDER BY result) AS unique_col2
   FROM (
         SELECT SUM(c) OVER(ORDER BY grp) - (c-1) AS result
         FROM (SELECT grp,  COUNT(*) AS c
               FROM ( SELECT x,
                             rn - ROW_NUMBER() OVER(PARTITION BY x ORDER BY rn) AS grp
                      FROM unnest(col1) WITH ORDINALITY AS a(x,rn)
               ) AS sub     
          GROUP BY grp) AS s
    ) AS s1    
) AS lat2

The result will be:

[1,2,4,6,9]

Edit 2

The version above has a small bug: ARRAY_AGG should be ordered by rn, not by grp:

DROP TABLE IF EXISTS table_a;
CREATE TABLE table_a(col0 VARCHAR(10), col1 text[],col2 int[]);

INSERT INTO table_a(col0, col1, col2)
VALUES ('any_value',array['a','b','b','c','c','a','a','a','c'],
        array[1,2,3,4,5,6,7,8,9]);

INSERT INTO table_a(col0, col1, col2)
VALUES ('any_value2',array['a','b','a','a','c','a'],array[1,2,3,4,5,6]);        


SELECT *
FROM table_a,
LATERAL (SELECT ARRAY_AGG(x ORDER BY rn) AS unique_col1
         FROM
           (SELECT x, grp, MIN(rn) AS rn
            FROM (SELECT  x,
                       rn - ROW_NUMBER() OVER(PARTITION BY x ORDER BY rn) AS grp,
                       rn
                  FROM unnest(col1) WITH ORDINALITY AS a(x,rn)
           ) AS sub
         GROUP BY x, grp) AS s      
        ) AS lat1;
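
The corrected query only rebuilds the first array. A possible way to get the second array in the same pass, offered as a sketch rather than as part of the original answer, is to use the first position of each run (MIN(rn)) as a subscript into col2. It assumes that col1 and col2 have the same length and use the default lower bound of 1; the aliases t and lat are illustrative.

```sql
-- One row per run of equal values: (x, grp) identifies the run, MIN(rn) is its
-- first position. That position orders the output and indexes into col2.
SELECT t.col0,
       lat.unique_col1,
       lat.unique_col2
FROM table_a AS t,
LATERAL (SELECT ARRAY_AGG(s.x ORDER BY s.rn)                 AS unique_col1,
                ARRAY_AGG(t.col2[s.rn::int] ORDER BY s.rn)   AS unique_col2
         FROM (SELECT x, grp, MIN(rn) AS rn
               FROM (SELECT x,
                            rn - ROW_NUMBER() OVER (PARTITION BY x ORDER BY rn) AS grp,
                            rn
                     FROM unnest(t.col1) WITH ORDINALITY AS a(x, rn)
               ) AS sub
               GROUP BY x, grp) AS s
        ) AS lat;
```

Grouping by both x and grp matters here: in the second sample row, 'a' and 'b' share the same grp value, so grouping by grp alone would merge two different runs. To return positions instead of the values stored in col2, aggregate s.rn in place of t.col2[s.rn::int].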