Amazon Athena - Nested JSON Read

I have the following JSON object structure:

{
  "id": "1",
  "media": {
    "twitter": "",
    "facebook": "{\"id\":\"9999\",\"first_name\":\"abc\",\"last_name\":\"xyz\",\"name\":\"abc xyz\"}"
  }
}

Below is the table definition:

CREATE EXTERNAL TABLE Ext_JSON_data(
id string,
media map<string,struct<id:string,first_name:string,last_name:string,name:string,email:string>>
  )
ROW FORMAT SERDE 
  'org.openx.data.jsonserde.JsonSerDe' 
WITH SERDEPROPERTIES (  
'serialization.format' = '1'
  )
LOCATION
  's3://bucket/folder/'

Can anyone help me read this JSON data in Athena?

I declared media as a string column in the external table and ran the sample query below to get the desired result. It might help someone!

CREATE EXTERNAL TABLE Ext_JSON_data(
id string,
media string
  )
ROW FORMAT SERDE 
  'org.openx.data.jsonserde.JsonSerDe' 
WITH SERDEPROPERTIES (  
'serialization.format' = '1'
  )
LOCATION
  's3://bucket/folder/'  

Below is the sample query. It uses an inline VALUES row in place of the table, so it can be run on its own:

WITH the_table AS (
  -- Inline sample row standing in for the contents of the media column
  SELECT CAST(social AS MAP(VARCHAR, JSON)) AS social_data
  FROM (
    VALUES
    (JSON '{
    "twitter": "",
    "facebook": "{\"id\":\"9999\",\"first_name\":\"abc\",\"last_name\":\"xyz\",\"name\":\"abc xyz\"}"}')
  ) AS t (social)
)
SELECT
    first_level_key
    -- '$' unwraps the JSON string value into plain text
   ,json_extract_scalar(first_level_value, '$') AS raw_value
    -- that text is itself JSON, so it is parsed a second time to reach its fields
   ,json_extract_scalar(json_extract_scalar(first_level_value, '$'), '$.id') AS id
   ,json_extract_scalar(json_extract_scalar(first_level_value, '$'), '$.first_name') AS first_name
FROM the_table
CROSS JOIN UNNEST (social_data) AS t (first_level_key, first_level_value)
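
The sample query runs against an inline VALUES row rather than the table. As a rough sketch of how the same double extraction might look against Ext_JSON_data itself, something like the following could work. It assumes the string-typed media column comes back as the raw JSON text of the nested object, as the answer reports. The output column aliases (twitter, facebook_id and so on) are illustrative names and not part of the original.

```sql
-- Sketch only: assumes media (declared as string) holds JSON text such as
-- {"twitter":"","facebook":"{\"id\":\"9999\", ...}"}
SELECT
    id
   ,json_extract_scalar(media, '$.twitter') AS twitter
    -- The facebook value is a JSON-encoded string, so it is extracted once
    -- as text and then parsed again to reach the fields inside it.
   ,json_extract_scalar(json_extract_scalar(media, '$.facebook'), '$.id') AS facebook_id
   ,json_extract_scalar(json_extract_scalar(media, '$.facebook'), '$.first_name') AS facebook_first_name
   ,json_extract_scalar(json_extract_scalar(media, '$.facebook'), '$.last_name') AS facebook_last_name
   ,json_extract_scalar(json_extract_scalar(media, '$.facebook'), '$.name') AS facebook_name
FROM Ext_JSON_data
```

The inner json_extract_scalar call returns the facebook value as a plain string, and the outer call parses that string as JSON. For rows where the value is an empty string, as with twitter in the sample, the outer call should yield NULL because an empty string is not valid JSON.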