How to convert a Snowflake table to a different structure

id | name | DESCRIPTION | ACTIVE | UPDATED_JSON |
---|---|---|---|---|
id1 | name-1 | desc-1 | true | {"diffFields": [{"fieldName": "name","valueAfter": "new-segment-name-1","valueBefore": null},{"fieldName": "active","valueAfter": true,"valueBefore": null}],"segmentId": "b204c220-ea8d-4cf4-b579-30eb59a1a2a4"} |
id2 | name-2 | desc-2 | true | {"diffFields": [{"fieldName": "name","valueAfter": "new-segment-name-2","valueBefore": null},{"fieldName": "active","valueAfter": true,"valueBefore": null}],"segmentId": "b204c220-ea8d-4cf4-b579-30eb59a1a2a4"} |
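
I have a table in Snowflake with the structure above. `UPDATED_JSON` is a VARIANT column. I want to change this table into a structure like the one below.

`UPDATED_JSON` contains `fieldName` entries. When a `fieldName` has the value `name`, I need to update the `name` column with that entry's `valueAfter`. The entries in `diffFields` are not in any guaranteed order. If there is no `name` entry in `UPDATED_JSON`, I want to keep the current value of the `name` column.

In the example below, `name-1` is changed to `new-segment-name-1` because `UPDATED_JSON` has a `fieldName` with the value `name` and a `valueAfter` with the value `new-segment-name-1`.
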
id | name | DESCRIPTION | ACTIVE |
---|---|---|---|
id1 | new-segment-name-1 | desc-1 | true |
id2 | new-segment-name-2 | desc-2 | true |

I am trying to do this with dbt.

With your data as a CTE:

```sql
-- Sample data; parse_json turns the JSON text into a VARIANT value
WITH data(id, name, DESCRIPTION, ACTIVE, UPDATED_JSON) as (
    select column1, column2, column3, column4, parse_json(column5) from values
    ('id1', 'name-1', 'desc-1', true, '{"diffFields": [{"fieldName": "name","valueAfter": "new-segment-name-1","valueBefore": null},{"fieldName": "active","valueAfter": true,"valueBefore": null}],"segmentId": "b204c220-ea8d-4cf4-b579-30eb59a1a2a4"}'),
    ('id2', 'name-2', 'desc-2', true, '{"diffFields": [{"fieldName": "name","valueAfter": "new-segment-name-2","valueBefore": null},{"fieldName": "active","valueAfter": true,"valueBefore": null}],"segmentId": "b204c220-ea8d-4cf4-b579-30eb59a1a2a4"}')
)
select id
    -- take valueAfter from the diffFields entry whose fieldName is 'name'
    ,max(iff(f.value:fieldName::text = 'name', f.value:valueAfter::text, null)) as name
    ,DESCRIPTION
    ,active
-- flatten yields one row per diffFields entry, so array order does not matter
from data, table(flatten(input => UPDATED_JSON:diffFields)) f
group by 1, 3, 4;
```

gives:

ID | NAME | DESCRIPTION | ACTIVE |
---|---|---|---|
id2 | new-segment-name-2 | desc-2 | TRUE |
id1 | new-segment-name-1 | desc-1 | TRUE |
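
One case the sample data does not exercise is the requirement to keep the current `name` when `UPDATED_JSON` has no `name` entry. With the query as written, `max(...)` has nothing to pick in that case and returns NULL, and a row whose `diffFields` is missing or empty produces no flattened rows at all. The following is a minimal sketch of one way to cover both cases, assuming a `coalesce` fallback to the original column and `outer => true` on `flatten`. Rows `id3` and `id4` are hypothetical and exist only to show the fallback.

```sql
-- Sketch only: rows id3 and id4 are hypothetical, added to exercise the fallback.
with data(id, name, description, active, updated_json) as (
    select column1, column2, column3, column4, parse_json(column5) from values
    ('id1', 'name-1', 'desc-1', true, '{"diffFields": [{"fieldName": "name","valueAfter": "new-segment-name-1","valueBefore": null}],"segmentId": "b204c220-ea8d-4cf4-b579-30eb59a1a2a4"}'),
    -- hypothetical: diffFields has no entry for "name"
    ('id3', 'name-3', 'desc-3', true, '{"diffFields": [{"fieldName": "active","valueAfter": true,"valueBefore": null}],"segmentId": "b204c220-ea8d-4cf4-b579-30eb59a1a2a4"}'),
    -- hypothetical: diffFields is missing altogether
    ('id4', 'name-4', 'desc-4', true, '{"segmentId": "b204c220-ea8d-4cf4-b579-30eb59a1a2a4"}')
)
select d.id
    -- fall back to the existing name when no 'name' entry was found
    ,coalesce(
        max(iff(f.value:fieldName::text = 'name', f.value:valueAfter::text, null)),
        d.name
    ) as name
    ,d.description
    ,d.active
from data as d,
    -- outer => true keeps rows whose diffFields is missing or empty
    table(flatten(input => d.updated_json:diffFields, outer => true)) f
-- d.name must be grouped because it is used outside the aggregate
group by d.id, d.name, d.description, d.active;
```

With this data, `id1` should come back as `new-segment-name-1`, while `id3` and `id4` should keep `name-3` and `name-4`. In a dbt model, the `data` CTE would presumably be replaced by a `ref` or `source` call to whichever model or table holds the real rows; that name depends on your project and is not shown here.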