ARRAY_AGG(STRUCT(x,y,z)) 等效于 Bigquery 遗留 SQL

ARRAY_AGG(STRUCT(x,y,z)) equivalent in Bigquery legacy SQL

我有一个标准的SQL查询结构如下

SELECT a, ARRAY_AGG(STRUCT(x,y,z))
FROM t
GROUP BY a

如何在遗留 SQL 中编写相同的查询?

无法使用 Legacy SQL 嵌套非叶字段。唯一的解决方法是将 x、y、z 打包成一个字符串(例如构造 JSON),然后在其上使用 NEST,并且每当需要单个字段时,使用一些字符串解析函数或 Javascript UDF。不用说,使用 Standard SQL.

就简单多了

同时,如果您仍然需要在 BigQuery Legacy 中使用它 SQL - 请参阅下面的简单示例。

BigQuery 标准 SQL 版本

#standardSQL
WITH t AS (
  SELECT 1 AS a, 11 AS x, 12 AS y, 13 AS z UNION ALL
  SELECT 2 AS a, 21 AS x, 22 AS y, 23 AS z UNION ALL
  SELECT 3 AS a, 31 AS x, 32 AS y, 33 AS z
)
SELECT 
  a, ARRAY_AGG(STRUCT(x, y, z)) AS aa 
FROM t
GROUP BY a  

BigQuery Legacy SQL 版本(确保设置目标 table 并关闭展平结果 - 否则 UI 将展平输出)

#legacySQL
SELECT a, aa.*
FROM JS( 
  ( // input table 
  SELECT 
    a, GROUP_CONCAT(CONCAT(STRING(x), ';', STRING(y), ';', STRING(z))) AS aa 
  FROM 
  (SELECT 1 AS a, 11 AS x, 12 AS y, 13 AS z),
  (SELECT 2 AS a, 21 AS x, 22 AS y, 23 AS z),
  (SELECT 3 AS a, 31 AS x, 32 AS y, 33 AS z)
  GROUP BY a
  ), 
  a, aa, // input columns 
  "[ // output schema 
  {name: 'a', type:'integer'},
  {name: 'aa', type:'record', mode:'repeated', 
  fields: [
    {name: 'x', type: 'integer'},
    {name: 'y', type: 'integer'},
    {name: 'z', type: 'integer'}
    ]}
   ]", 
  "function(row, emit) { // function 
    var aa = []; 
    aa1 = row.aa.split(',');
    for (var i = 0; i < aa1.length; i++) { 
      aa2 = aa1[i].split(';');
      aa.push({x:parseInt(aa2[0]), y:parseInt(aa2[1]), z:parseInt(aa2[2])}); 
    }; 
    emit({
      a: row.a, 
      aa: aa
      }); 
  }"
)