BigQuery (SQL): convert multiple columns to rows / an array
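I have a data source with multiple similar columns, as shown below, where each question is its own column holding the corresponding response: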
Original
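I want to convert it into an array with two paired columns, so that it ends up looking like this: just two columns, Question and Response, with each legacy column getting its own key (1, 2, 3, etc.):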
Desired
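Bear with me, I'm sure this is simple. I think it needs array_agg, or possibly even an unpivot, but I have searched other posts and can't find a similar solution for deriving the value of the question column from multiple column names in a "flat" source, that is, assigning a value in a new field based on the original column name.
I have this, but I need to get the Question/Response pairing: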
select ID, array_agg(response ignore nulls) Questionnaire
from datasourcename,
unnest([Q1Response, Q2Response, ]) response
group by ID
Any help is much appreciated (first post!)
Try this:
with mytable as (
  select 1 as id, 'a' as q1response, 'c' as q2response, 'a' as q3response, 'd' as q4response union all
  select 2, 'b', 'a', 'a', 'd' union all
  select 3, 'a', 'b', 'b', 'a'
)
select
  id,
  [ struct('1' as question, q1response as response),
    struct('2' as question, q2response as response),
    struct('3' as question, q3response as response),
    struct('4' as question, q4response as response)
  ] as q
from mytable
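Since the question mentions unpivoting, here is a minimal sketch of the same result built with BigQuery's UNPIVOT operator and ARRAY_AGG. It reuses the sample `mytable` from the query above; the names `source_column` and `questionnaire` are illustrative choices and not from the original post. UNPIVOT excludes NULL values by default, which should match the `ignore nulls` behaviour in the question's attempt.

```sql
with mytable as (
  select 1 as id, 'a' as q1response, 'c' as q2response, 'a' as q3response, 'd' as q4response union all
  select 2, 'b', 'a', 'a', 'd' union all
  select 3, 'a', 'b', 'b', 'a'
)
select
  id,
  array_agg(
    -- source_column holds the original column name, e.g. 'q1response';
    -- the digits in that name become the question key.
    struct(regexp_extract(source_column, r'\d+') as question, response)
    -- Order numerically so that a key like 10 would sort after 2.
    order by cast(regexp_extract(source_column, r'\d+') as int64)
  ) as questionnaire
from mytable
-- One output row per (id, column) pair; rows with a NULL response are dropped by default.
unpivot (response for source_column in (q1response, q2response, q3response, q4response))
group by id
```

Like the query above, this still lists the columns explicitly, so it has to be edited when questions are added.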
Consider the solution below. It works for any number of questions/columns in the table, without any changes to the code:
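select id,
  array(
    -- Each kv is a [column_name, value] pair; the digits in the column name become the key.
    select as struct regexp_extract(kv[offset(0)], r'\d+') as Question,
      kv[offset(1)] as Response
    -- Serialize the row to JSON and extract every "name":"value" pair that follows a comma.
    -- This skips the first field (id in the sample data) and any value that is not a quoted string.
    from unnest(regexp_extract_all(to_json_string(t), r',("[^"]+":"[^"]*")')) kvs,
      unnest([struct(split(trim(kvs, '"'), '":"') as kv)])
  ) Questionnaire
from `project.dataset.table` t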
If applied to the sample data in your question, the output is