Repeated record to one string, BigQuery

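I have a table of query jobs with a repeated record column, referenced_tables. The record has 3 columns and several rows, and I want to convert the whole record into a single string, using "." as the column separator and "," as the row separator. The columns in the record are the project, dataset and table ids (all strings), and each row is a different table. I am trying something like this: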

select array_to_string(
    [
        array_to_string(
            [
                referenced_tables[ordinal(1)].project_id, 
                referenced_tables[ordinal(1)].dataset_id, 
                referenced_tables[ordinal(1)].table_id
            ], "."
        ),
        array_to_string(
            [
                referenced_tables[ordinal(2)].project_id, 
                referenced_tables[ordinal(2)].dataset_id, 
                referenced_tables[ordinal(2)].table_id
            ], "."
        )
    ], ", "
)
FROM my_table

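When I run this for a specific job, the result looks like this: project1.dataset1.table1, project2.dataset2.table2. So it works, but I have to repeat array_to_string once per row, and the number of rows changes from one job to another. referenced_tables can also be NULL, so if I run this over the whole table, it fails with an error.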

Is there a way to convert the record to a string for the whole table, given these conditions?

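I have made some assumptions about your data structure. Since you say these are query jobs, I assume there is some kind of query_id that you can group by.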

Try the following:

select
  string_agg(concat(t.project_id,".",t.dataset_id,".",t.table_id),", ") 
from sample_data
, unnest(referenced_tables) t
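
Note that a comma join with unnest drops rows whose array is NULL or empty, and that without a GROUP BY the aggregate collapses every job into one string. If you want one string per job and want jobs with no referenced tables to still appear, a variation along these lines may work. It is only a sketch: the query_id column, the shortened sample values and the referenced_tables_str alias are assumptions for illustration, not something taken from your actual table.

```sql
-- Hypothetical sample data: job 2 has a NULL referenced_tables array.
with sample_data as (
    select 1 as query_id
        , [STRUCT('project1' as project_id, 'dataset1' as dataset_id, 'table1' as table_id)
           ,STRUCT('project2' as project_id, 'dataset2' as dataset_id, 'table2' as table_id)
           ] as referenced_tables
    UNION ALL
    select 2
        , cast(null as array<struct<project_id string, dataset_id string, table_id string>>)
)
select
  query_id
  -- one "project.dataset.table" entry per array element, joined with ", "
  , string_agg(concat(t.project_id, ".", t.dataset_id, ".", t.table_id), ", ") as referenced_tables_str
from sample_data
-- left join keeps jobs whose array is NULL or empty
left join unnest(referenced_tables) as t
group by query_id
```

For a job with no referenced tables, concat receives NULL inputs and string_agg has nothing to aggregate, so the string column should come back as NULL rather than raising an error. The order of entries inside the string is not guaranteed unless you add an ORDER BY inside string_agg.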

Using the following sample data:

with sample_data as (
    select 1 as query_id
        , [STRUCT('my_project' as project_id, 'my_dataset' as dataset_id, 'table' as table_id)
           ,STRUCT('my_project1' as project_id, 'my_datase1' as dataset_id, 'table1' as table_id)
           ,STRUCT('my_projec2' as project_id, 'my_dataset2' as dataset_id, 'table2' as table_id)
           ] as referenced_tables
    UNION ALL 
    select 2
        , [STRUCT('my_project3' as project_id, 'my_dataset3' as dataset_id, 'table3' as table_id)
           ,STRUCT('my_project4' as project_id, 'my_datase4' as dataset_id, 'table4' as table_id)
           ,STRUCT('my_projec5' as project_id, 'my_dataset5' as dataset_id, 'table5' as table_id)
           ] as referenced_tables
)

This produces: