使用 JSON SERDE 读取 json 文件时，我只得到一行

Question

JSON数据：

[{
  "liked": "true",
  "user_id": "101",
  "video_end_type": "3",
  "minutes_played": "3",
  "video_id": "101",
  "geo_cd": "AP",
  "channel_id": "11",
  "creator_id": "101",
  "timestamp": "07/05/2019 01:36:35",
  "disliked": "true"
},
{
"liked": true, 
"user_id": 102, 
"video_end_type": null,
 "minutes_played": 4,
 "video_id": 102, 
"geo_cd": "AP",
 "channel_id": 12,
 "creator_id": 102,
 "timestamp": "15/04/2019 17:04:00", 
"disliked": true
}
]

输出：

hive> select * from stream;
OK
NULL    NULL    NULL    NULL    NULL    NULL    NULL    NULL    NULL    NULL
true    101     3       3       101     AP      11      101     NULL    true
NULL    NULL    NULL    NULL    NULL    NULL    NULL    NULL    NULL    NULL

我得到的输出只是第一行。

代码：

add jar json-serde-1.3.8-jar-with-dependencies.jar;

create external table stream(
    liked string, user_id int, video_end_type int, minutes_played int, video_id int, geo_cd string, channel_id int, creator_id int, time timestamp, disliked 
string)
    ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe'
    LOCATION '/location';

注意：我无法删除括号，因为那样的话数据的 JSON 格式不正确。

Answer 1

类似的东西对你有用：

JSON 数据：

{"liked": "true",  "user_id": "101",  "video_end_type": "3",  "minutes_played": "3",  "video_id": "101",  "geo_cd": "AP",  "channel_id": "11",  "creator_id": "101",  "timestamp": "07/05/2019 01:36:35",  "disliked": "true"}
{"liked": true, "user_id": 102, "video_end_type": null, "minutes_played": 4, "video_id": 102, "geo_cd": "AP", "channel_id": 12, "creator_id": 102, "timestamp": "15/04/2019 17:04:00", "disliked": true}

您的创建查询应如下所示：

create external table stream(
liked string,
user_id string,
video_end_type string,
minutes_played string,
video_id string,
geo_cd string,
channel_id string,
creator_id string,
`timestamp` string,
disliked string
)
ROW FORMAT SERDE 'org.apache.hive.hcatalog.data.JsonSerDe'
STORED AS TEXTFILE
LOCATION '/location/';

注意：如果你的JSON数据是一致的，你可以使用适当的数据类型而不是字符串。

使用 JSON SERDE 读取 json 文件时，我只得到一行

I am only getting one row when using JSON SERDE to read a json file

json

hive

hive-serde