How do I flatten a nested JSON key/value pair into a single array of values?
In Snowflake, my data is structured as follows:
ORGANIZATION TABLE
------------------
Org:variant
------------------
{
relationships: [{
{ name: 'mother', value: a },
{ name: 'siblings', value: [ 'c', 'd' ] }
}]
}
PEOPLE TABLE
-------------------
Person:variant
-------------------
{
id: a
name: Mary
}
-------------------
{
id: b
name: Joe
}
-------------------
{
id: c
name: John
}
The result I want is:
ORGANIZATION | PEOPLE
---------------------------------------------------|----------------------------
{ |[
relationships: [{ | {
{ name: 'mother', value: a }, | id: a,
{ name: 'siblings', value: [ 'c', 'd' ] } | name: Mary
}] | },
} | {
| id: b,
| name: Joe
| },
| {
| id: c,
| name: John
| }
|]
I'm sure ARRAY_AGG is involved somehow, but I can't figure out how to aggregate the results into a single array of values.
My current query:
SELECT Org, ARRAY_AGG(Person) as People
FROM Organizations
INNER JOIN People ON People.id IN Org.relationships...?? (I'm lost here)
GROUP BY Org
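The following query shows how to use FLATTEN and ARRAY_AGG to get the desired output.
- FLATTEN un-nests each array so that you can join on the values inside it.
- ARRAY_AGG aggregates the matching people, grouped by organization.
- The CASE expression accounts for the fact that a relationship's value is not always an array (here, 'mother' holds a single id while 'siblings' holds an array of ids).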
-- Sample data: one organization with a scalar and an array relationship
CREATE OR REPLACE TABLE organizations (org variant) AS
SELECT parse_json('{relationships: [{ name: "mother", value: "a" }, { name: "siblings", value: [ "b", "c" ] } ] } ');

-- Sample data: one person per row; $1 is the first column of the VALUES clause
CREATE OR REPLACE TABLE people (person variant) AS
SELECT parse_json($1)
FROM
VALUES ('{id:"a", name: "Mary"}'),
       ('{id:"b", name: "Joe"}'),
       ('{id:"c", name: "John"}');
WITH org_people AS
  (SELECT o.org,
          relationship.value AS relationship,
          -- Use the flattened element when the value is an array,
          -- otherwise use the scalar value itself
          CASE is_array(relationship.value:value)
              WHEN TRUE THEN person_in_relationship.value
              ELSE relationship.value:value
          END AS person_in_relationship
   FROM organizations o,
        LATERAL FLATTEN(INPUT => o.org:relationships) relationship,
        -- OUTER => TRUE keeps a row even when the value is not an array
        LATERAL FLATTEN(INPUT => relationship.value:value, OUTER => TRUE) person_in_relationship
  )
SELECT op.org,
       ARRAY_AGG(p.person) AS people
FROM org_people op
JOIN people p ON p.person:id = op.person_in_relationship
GROUP BY op.org;
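To see what the two FLATTEN steps produce before the join and aggregation, it can help to run the inner part on its own. The following is only a sketch against the sample organizations table above; the column aliases relationship_name and person_id are made up for illustration and are not part of the original query.

```sql
-- Inspect the flattened rows before joining to people.
-- Assumes the sample organizations table created above.
SELECT relationship.value:name::string AS relationship_name,  -- hypothetical alias
       CASE is_array(relationship.value:value)
           WHEN TRUE THEN person_in_relationship.value
           ELSE relationship.value:value
       END AS person_id                                        -- hypothetical alias
FROM organizations o,
     LATERAL FLATTEN(INPUT => o.org:relationships) relationship,
     LATERAL FLATTEN(INPUT => relationship.value:value, OUTER => TRUE) person_in_relationship;
```

With the sample data this should return one row per referenced person id, which is the shape the join on p.person:id relies on. If a relationship with a scalar value goes missing here, the OUTER => TRUE argument is the first thing to check.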