Athena: unnest a JSON array of strings nested within another JSON array of structs
I have the following AWS Athena create table statement:
CREATE EXTERNAL TABLE IF NOT EXISTS s2cs3dataset.s2c_storage (
`MessageHeader` string,
`TimeToProcess` float,
`KeyCreated` string,
`KeyLastTouch` string,
`CreatedDateTime` string,
`TableReference` array<struct<`BusinessObject`: string,
`TransactionType`: string,
`ReferenceKeyId`: float,
`ReferencePrimaryKey`: string,
`IncludedTables`: array<string>>>,
`SAPStoreReference` string
)
ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe'
WITH SERDEPROPERTIES (
'serialization.format' = '1'
)
LOCATION 's3://api-dev-dpstorage-s3/S2C_INPUT/storage/'
TBLPROPERTIES ('has_encrypted_data'='false');
From this table, I want to select the following items with this query:
SELECT MessageHeader,
TimeToProcess,
KeyCreated,
KeyLastTouch,
CreatedDateTime,
tr.BusinessObject,
tr.TransactionType,
tr.ReferencePrimaryKey,
it.IncludedTables,
SAPStoreReference
FROM s2c_storage
cross join UNNEST(s2c_storage.tablereference) as p(tr)
cross join UNNEST(tr.IncludedTables) as p(it)
But I get the following error:
SYNTAX_ERROR: line 9:1: Expression "it" is not of type ROW
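If I remove the bottom cross join and the column that references it, the query works fine, so I am doing something wrong when trying to unnest the array of strings inside the array of structs. Any tips?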
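Per the clarification in the comments, `tr.IncludedTables` is of type `array(varchar)`.
Therefore, in the query `... CROSS JOIN UNNEST(tr.IncludedTables) AS p(it)`, the column `it` is of type `varchar`. In the select clause you can reference this value as `it` (or give it an alias: `it AS IncludedTables`), but you cannot reference it as `it.IncludedTables`. A `varchar` value has no fields, so in particular it has no `IncludedTables` field.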
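As a sketch of the fix described above, the query from the question could be rewritten as follows. The only change that matters is selecting `it` directly instead of `it.IncludedTables`. The output name `IncludedTable` and the second relation alias `q` are illustrative choices that do not appear in the original post; the distinct alias is there for readability only.

```sql
SELECT MessageHeader,
       TimeToProcess,
       KeyCreated,
       KeyLastTouch,
       CreatedDateTime,
       tr.BusinessObject,       -- tr is a ROW (struct), so field access works
       tr.TransactionType,
       tr.ReferencePrimaryKey,
       it AS IncludedTable,     -- it is a plain varchar, so reference it directly
       SAPStoreReference
FROM s2c_storage
CROSS JOIN UNNEST(s2c_storage.tablereference) AS p(tr)
-- second unnest: one output row per string in the inner array
CROSS JOIN UNNEST(tr.IncludedTables) AS q(it)
```

Each output row then carries one element of `IncludedTables` alongside the fields of its parent struct. Note that with `CROSS JOIN UNNEST`, a struct whose `IncludedTables` array is empty or null contributes no rows to the result.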