Ordering within ARRAY_AGG after GROUP BY in BigQuery

I have a BigQuery table:

create or replace table `project.table.mock` as (
  select 1 as col0, 'a' as col1, 'x' as col2
    union all
  select 2 as col0, 'a' as col1, 'y' as col2
    union all
  select 4 as col0, 'b' as col1, 'z' as col2
    union all
  select 8 as col0, 'b' as col1, 'X' as col2
    union all
  select 7 as col0, 'b' as col1, 'Y' as col2
)

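The table looks like this:

+------+------+------+
| col0 | col1 | col2 |
+------+------+------+
| 1    | a    | x    |
| 2    | a    | y    |
| 4    | b    | z    |
| 8    | b    | X    |
| 7    | b    | Y    |
+------+------+------+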

I want to group by col1 and array_agg the values from col2. I want the elements in each array to be sorted by col0.

Here is what I have so far:

select array_agg(col2) as col1arrays from `project.table.mock` group by col1;

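This gives me one array per group, but the elements are not in the order I want.

The desired output in the second row is [z, Y, X] (because the row where z appears in col2 has 4 in col0, the row where Y appears has 7, and the row where X appears has 8, and 4 < 7 < 8).

How can I order the elements within array_agg in BigQuery?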

You can add an ORDER BY clause inside the ARRAY_AGG() function.

SELECT ARRAY_AGG(col2 ORDER BY col0 ASC) AS col1arrays
  FROM `project.table.mock`
 GROUP BY col1;

A self-contained version using the sample data:

WITH mock as (
  select 1 as col0, 'a' as col1, 'x' as col2
    union all
  select 2 as col0, 'a' as col1, 'y' as col2
    union all
  select 4 as col0, 'b' as col1, 'z' as col2
    union all
  select 8 as col0, 'b' as col1, 'X' as col2
    union all
  select 7 as col0, 'b' as col1, 'Y' as col2
)
select array_agg(col2 ORDER BY col0) as col1arrays from mock group by col1;

output:
+------------+
| col1arrays |
+------------+
| [x,y]      |
| [z,Y,X]    |
+------------+
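
As a further sketch on the same sample data, the ORDER BY inside ARRAY_AGG() can also sort descending, and it can be combined with LIMIT to keep only the first few elements. The column aliases col2_desc and first_two are made up for illustration. The outer ORDER BY col1 is only there to make the row order deterministic, since GROUP BY alone does not guarantee it.

```sql
-- Sample data inlined so the query runs without creating a table.
WITH mock AS (
  SELECT 1 AS col0, 'a' AS col1, 'x' AS col2
    UNION ALL
  SELECT 2 AS col0, 'a' AS col1, 'y' AS col2
    UNION ALL
  SELECT 4 AS col0, 'b' AS col1, 'z' AS col2
    UNION ALL
  SELECT 8 AS col0, 'b' AS col1, 'X' AS col2
    UNION ALL
  SELECT 7 AS col0, 'b' AS col1, 'Y' AS col2
)
SELECT
  col1,
  -- Elements sorted by col0, largest first.
  ARRAY_AGG(col2 ORDER BY col0 DESC) AS col2_desc,
  -- Only the two elements with the smallest col0 in each group.
  ARRAY_AGG(col2 ORDER BY col0 LIMIT 2) AS first_two
FROM mock
GROUP BY col1
ORDER BY col1;  -- row order, independent of the ordering inside the arrays
```

For group b this should give [X,Y,z] for col2_desc and [z,Y] for first_two, because the col0 values are 8, 7 and 4 respectively.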