Extract all "name" values from json array in Redshift

I have a json field whose value is:

[{"elementId": "1", "name": "foo", "value": "A"}, {"elementId": "2", "name": "bar", "value": "B"}, {"elementId": "3", "name": "foobar", "value": "C"}, {"elementId": "4", "name": "barfoo", "value": "D"}]

So my dataset looks like this:

user_id | form_data
---------------------------------------------
101 | [{"elementId": "1", "name": "foo", "value": "A"}, {"elementId": "2", "name": "bar", "value": "B"}, {"elementId": "3", "name": "foobar", "value": "C"}, {"elementId": "4", "name": "barfoo", "value": "D"}]
102 | [{"elementId": "1", "name": "crash", "value": "A"}, {"elementId": "2", "name": "bang", "value": "B"}, {"elementId": "3", "name": "wallop", "value": "C"}]

I want to extract the "name" values from each list, so that the output has a column containing a comma-separated list:

user_id | names
----------------------------
101 | foo,bar,foobar,barfoo
102 | crash,bang,wallop

The lists can have different lengths, so my current approach (below) does not work for longer forms:

SELECT a || ',' || b || ',' || c || ',' || d
FROM (select f.form_data
      ,JSON_EXTRACT_PATH_TEXT(json_extract_array_element_text (f.form_data,0),'name') a
      ,JSON_EXTRACT_PATH_TEXT(json_extract_array_element_text (f.form_data,1),'name') b
      ,JSON_EXTRACT_PATH_TEXT(json_extract_array_element_text (f.form_data,2),'name') c
      ,JSON_EXTRACT_PATH_TEXT(json_extract_array_element_text (f.form_data,3),'name') d
      FROM forms f)

Any help would be much appreciated!

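I can extract all the name values as separate columns using the approach above, but the problem is that I have to reference every index position where "name" might appear, which does not scale.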

What I want to achieve is to extract all the "name" components into a single cell without referencing positions in the array.

This looks like a use case for a recursive CTE. Take a look at this example, as it seems to be almost what you want: https://docs.aws.amazon.com/redshift/latest/dg/r_WITH_clause.html

Based on this update, I think you just want to expand the json array and then aggregate the names into a list. Does this meet your needs?

with num1024 as (
SELECT 
    (p0.n + p1.n*2 + p2.n * POWER(2,2) + p3.n * POWER(2,3) + p4.n * POWER(2,4) + p5.n * POWER(2,5) 
        + p6.n * POWER(2,6) + p7.n * POWER(2,7) + p8.n * POWER(2,8) + p9.n * POWER(2,9))::int as n
  FROM 
    (SELECT 0 as n UNION SELECT 1) p0,
    (SELECT 0 as n UNION SELECT 1) p1,
    (SELECT 0 as n UNION SELECT 1) p2,
    (SELECT 0 as n UNION SELECT 1) p3,
    (SELECT 0 as n UNION SELECT 1) p4,
    (SELECT 0 as n UNION SELECT 1) p5,
    (SELECT 0 as n UNION SELECT 1) p6,
    (SELECT 0 as n UNION SELECT 1) p7,
    (SELECT 0 as n UNION SELECT 1) p8,
    (SELECT 0 as n UNION SELECT 1) p9
  Order by 1
)
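select user_id, listagg(a, ',') within group (order by depth) as names
from (
        -- pair every row with the indexes 0..1023; indexes past the end of the array give no name and are filtered out
        select f.user_id
              ,JSON_EXTRACT_PATH_TEXT(json_extract_array_element_text (f.form_data,n.n),'name') a
              ,n.n as depth
        FROM forms f, num1024 n
        where a <> '' )
group by user_id
order by user_id
;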

This supports arrays of up to 1024 elements (indexes 0 through 1023).
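
The recursive CTE suggested in the earlier comment could replace the fixed number table, so the index range follows each array's actual length. The following is an untested sketch, not a verified solution: it assumes the forms(user_id, form_data) table from the question, that form_data always holds a valid JSON array, and that the cluster supports WITH RECURSIVE. The names idx, n and len are made up for illustration.

```sql
-- Sketch only: idx, n and len are illustrative names, not from the original answer.
with recursive idx(user_id, form_data, n, len) as (
    -- seed: one row per form, starting at array index 0
    select user_id, form_data, 0, json_array_length(form_data)
    from forms
    union all
    -- step: move to the next index while it is still inside the array
    select user_id, form_data, n + 1, len
    from idx
    where n + 1 < len
)
select user_id,
       listagg(json_extract_path_text(
                   json_extract_array_element_text(form_data, n), 'name'), ',')
           within group (order by n) as names
from idx
where n < len   -- drops the seed row of an empty array
group by user_id
order by user_id;
```

With this shape there is no hard-coded 1024 limit, but very long arrays may run into the recursion limits of the cluster, so the number-table version above may still be the safer choice.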