unnest() not exploding array, returns error Column alias list has 1 entries but 't' has 2 columns available
I have some JSON data that includes a property 'characters', which looks like this:
select json_data['characters'] from latest_snapshot_events
Returns: [{"CHAR_STARS":1,"CHAR_A1_LVL":1,"ITEM_POWER":60,"CHAR_A3_LVL":1,"CHAR_TIER":1,"ITEM":10,"shards":0,"CHAR_TPIECES":0,"CHAR_A5_LVL":0,"CHAR_A2_LVL":1,"CHAR_A4_LVL":1,"ITEM_CATEGORY":"Character","ITEM_LEVEL":3},{"CHAR_STARS":1,"CHAR_A1_LVL":1,"ITEM_POWER":50,"CHAR_A3_LVL":1,"CHAR_TIER":1,"ITEM":39,"shards":0,"CHAR_TPIECES":0,"CHAR_A5_LVL":0,"CHAR_A2_LVL":1,"CHAR_A4_LVL":1,"ITEM_CATEGORY":"Character","ITEM_LEVEL":2},{"CHAR_STARS":1,"CHAR_A1_LVL":1,"ITEM_POWER":80,"CHAR_A3_LVL":1,"CHAR_TIER":1,"ITEM":6801450488388220,"shards":0,"CHAR_TPIECES":0,"CHAR_A5_LVL":1,"CHAR_A2_LVL":1,"CHAR_A4_LVL":1,"ITEM_CATEGORY":"Character","ITEM_LEVEL":4},{"CHAR_STARS":1,"CHAR_A1_LVL":1,"ITEM_POWER":85,"CHAR_A3_LVL":1,"CHAR_TIER":1,"ITEM":8355588830097610,"shards":0,"CHAR_TPIECES":5,"CHAR_A5_LVL":0,"CHAR_A2_LVL":1,"CHAR_A4_LVL":1,"ITEM_CATEGORY":"Character","ITEM_LEVEL":4}]
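This is returned in a single row. I want one row for each item in the array.
I found some SO posts and other blogs suggesting that I use unnest(). I have tried it several times but cannot get a result back. For example, here is the documentation from Presto. The bottom of that page covers unnest as a replacement for Hive's lateral view explode: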
SELECT student, score
FROM tests
CROSS JOIN UNNEST(scores) AS t (score);
So I tried to apply this to my table:
characters as (
select
jdata.characters
from latest_snapshot_events
cross join unnest(json_data) as t(jdata)
)
select * from characters;
Here json_data is the field in latest_snapshot_events that contains the property 'characters', which is an array as shown above.
This returns an error:
[Simba]AthenaJDBC An error has been thrown from the AWS Athena client. SYNTAX_ERROR: line 69:12: Column alias list has 1 entries but 't' has 2 columns available
How can I unnest/explode latest_snapshot_events.json_data['characters'] into multiple rows?
Since characters is a JSON array in textual representation, you have to:
- Parse the JSON text with json_parse to produce a value of type JSON.
- Convert the JSON value into a SQL array using CAST.
- Explode the array using UNNEST.
For example:
WITH data(characters) AS (
VALUES '[{"CHAR_STARS":1,"CHAR_A1_LVL":1,"ITEM_POWER":60,"CHAR_A3_LVL":1,"CHAR_TIER":1,"ITEM":10,"shards":0,"CHAR_TPIECES":0,"CHAR_A5_LVL":0,"CHAR_A2_LVL":1,"CHAR_A4_LVL":1,"ITEM_CATEGORY":"Character","ITEM_LEVEL":3},{"CHAR_STARS":1,"CHAR_A1_LVL":1,"ITEM_POWER":50,"CHAR_A3_LVL":1,"CHAR_TIER":1,"ITEM":39,"shards":0,"CHAR_TPIECES":0,"CHAR_A5_LVL":0,"CHAR_A2_LVL":1,"CHAR_A4_LVL":1,"ITEM_CATEGORY":"Character","ITEM_LEVEL":2},{"CHAR_STARS":1,"CHAR_A1_LVL":1,"ITEM_POWER":80,"CHAR_A3_LVL":1,"CHAR_TIER":1,"ITEM":6801450488388220,"shards":0,"CHAR_TPIECES":0,"CHAR_A5_LVL":1,"CHAR_A2_LVL":1,"CHAR_A4_LVL":1,"ITEM_CATEGORY":"Character","ITEM_LEVEL":4},{"CHAR_STARS":1,"CHAR_A1_LVL":1,"ITEM_POWER":85,"CHAR_A3_LVL":1,"CHAR_TIER":1,"ITEM":8355588830097610,"shards":0,"CHAR_TPIECES":5,"CHAR_A5_LVL":0,"CHAR_A2_LVL":1,"CHAR_A4_LVL":1,"ITEM_CATEGORY":"Character","ITEM_LEVEL":4}]'
)
SELECT entry
FROM data, UNNEST(CAST(json_parse(characters) AS array(json))) t(entry)
produces:
entry
-----------------------------------------------------------------------
{"CHAR_STARS":1,"CHAR_A1_LVL":1,"ITEM_POWER":60,"CHAR_A3_LVL":1,...
{"CHAR_STARS":1,"CHAR_A1_LVL":1,"ITEM_POWER":50,"CHAR_A3_LVL":1,...
{"CHAR_STARS":1,"CHAR_A1_LVL":1,"ITEM_POWER":80,"CHAR_A3_LVL":1,...
{"CHAR_STARS":1,"CHAR_A1_LVL":1,"ITEM_POWER":85,"CHAR_A3_LVL":1,...
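In the example above, I convert the JSON value into an array(json), but you can convert it further to something more specific if the values inside each array entry have a regular schema. For example, for your data, it is possible to cast it to array(map(varchar, json)), since every element in the array is a JSON object.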
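The cast to array(map(varchar, json)) mentioned above could look roughly like the sketch below. It uses a shortened version of the sample data (two entries, two keys each), and the output column names item and item_level are made up for illustration. It assumes every entry contains both keys; a map subscript on a missing key raises an error, so element_at(entry, 'ITEM') may be the safer choice when keys are optional.

```sql
-- Shortened sample data: two entries with two keys each, not the full payload.
WITH data(characters) AS (
    VALUES '[{"ITEM":10,"ITEM_LEVEL":3},{"ITEM":39,"ITEM_LEVEL":2}]'
)
SELECT
    -- Each entry is a map(varchar, json); look up a key, then cast the JSON value.
    CAST(entry['ITEM'] AS bigint)       AS item,
    CAST(entry['ITEM_LEVEL'] AS bigint) AS item_level
FROM data
CROSS JOIN UNNEST(
    CAST(json_parse(characters) AS array(map(varchar, json)))
) AS t(entry)
```

This should give one row per array entry, with the selected keys as ordinary typed columns instead of a single JSON blob.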
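json_parse works if your initial data is a JSON string. However, for array(row) types (i.e. an array of objects/dictionaries), casting to array(json) converts each row into an array, which strips all of the keys from the object and prevents you from using dot notation or the json_extract functions.
To unnest array(row) data, the syntax is much simpler: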
CROSS JOIN UNNEST(my_array) AS my_row
I ran into this error while trying to unpivot data.
This might help someone:
SELECT a_col, b_col
FROM
(
SELECT MAP(
ARRAY['a', 'b', 'c', 'd'],
ARRAY[1, 2, 3, 4]
) my_col
) CROSS JOIN UNNEST(my_col) as t(a_col, b_col)
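t() lets you define multiple columns as output. Unnesting a map produces two columns (key and value), so the alias list needs two names.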
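The same behaviour probably explains the error in the question. The subscript json_data['characters'] suggests that json_data is a map, so UNNEST(json_data) yields a key column and a value column, and the one-name alias list t(jdata) is rejected. A minimal sketch of unnesting the 'characters' value instead of the whole map follows. It assumes json_data is a map(varchar, varchar) whose values are JSON text, and the inline CTE is only a hypothetical stand-in for the real table with shortened data.

```sql
-- Hypothetical stand-in for the real table.
-- Assumption: json_data is map(varchar, varchar) holding JSON text.
WITH latest_snapshot_events(json_data) AS (
    VALUES MAP(
        ARRAY['characters'],
        ARRAY['[{"ITEM":10,"ITEM_LEVEL":3},{"ITEM":39,"ITEM_LEVEL":2}]']
    )
)
SELECT entry
FROM latest_snapshot_events
-- Unnest the parsed 'characters' array, not the json_data map itself.
CROSS JOIN UNNEST(
    CAST(json_parse(json_data['characters']) AS array(json))
) AS t(entry)
```

If the map values are already of type JSON rather than varchar, the json_parse call would not be needed and the CAST could be applied to json_data['characters'] directly.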