How to parse a file with multiple stacked JSON objects in R?
I have the following "stacked JSON" object in R, example1.json:
{"ID":"12345","Timestamp":"20140101", "Usefulness":"Yes",
"Code":[{"event1":"A","result":"1"},…]}
{"ID":"1A35B","Timestamp":"20140102", "Usefulness":"No",
"Code":[{"event1":"B","result":"1"},…]}
{"ID":"AA356","Timestamp":"20140103", "Usefulness":"No",
"Code":[{"event1":"B","result":"0"},…]}
The objects are not comma-separated. The basic goal is to parse certain fields (or all fields) into an R data.frame or data.table:
   Timestamp Usefulness
0   20140101        Yes
1   20140102         No
2   20140103         No
Normally, I would read JSON into R as follows:
library(jsonlite)
jsonfile = "example1.json"
foobar = fromJSON(jsonfile)
However, this throws a parse error:
Error: lexical error: invalid char in json text.
[{"event1":"A","result":"1"},…]} {"ID":"1A35B","Timestamp"
(right here) ------^
This is similar to the following question, but in R:
EDIT: This file format is called "newline delimited JSON", or NDJSON.
The three dots ... make your JSON invalid, hence your lexical error.
You can use jsonlite::stream_in() to "stream in" the JSON line by line.
library(jsonlite)
jsonlite::stream_in(file("~/Desktop/examples1.json"))
# opening file input connection.
# Imported 3 records. Simplifying...
# closing file input connection.
#      ID Timestamp Usefulness Code
# 1 12345  20140101        Yes A, 1
# 2 1A35B  20140102         No B, 1
# 3 AA356  20140103         No B, 0
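Since the question only asks for the Timestamp and Usefulness columns, here is a minimal sketch of selecting them from the streamed result. To keep it self-contained, it feeds the cleaned records through textConnection() instead of a file on disk; the names ndjson, con, records and wanted are made up for illustration.

```r
library(jsonlite)

# Cleaned sample records, one JSON object per line (NDJSON).
ndjson <- c(
  '{"ID":"12345","Timestamp":"20140101","Usefulness":"Yes","Code":[{"event1":"A","result":"1"}]}',
  '{"ID":"1A35B","Timestamp":"20140102","Usefulness":"No","Code":[{"event1":"B","result":"1"}]}',
  '{"ID":"AA356","Timestamp":"20140103","Usefulness":"No","Code":[{"event1":"B","result":"0"}]}'
)

# stream_in() reads from a connection; textConnection() stands in for file() here.
con <- textConnection(ndjson)
records <- stream_in(con, verbose = FALSE)
close(con)  # the connection was already open, so close it explicitly

# Keep only the two fields the question asks for.
wanted <- records[, c("Timestamp", "Usefulness")]
print(wanted)
```

With a real file, replacing the textConnection() call with file("example1.json") should behave the same way, provided every line is valid JSON.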
Data
I cleaned up your example data to make it valid JSON and saved it to my desktop as ~/Desktop/examples1.json
{"ID":"12345","Timestamp":"20140101", "Usefulness":"Yes","Code":[{"event1":"A","result":"1"}]}
{"ID":"1A35B","Timestamp":"20140102", "Usefulness":"No","Code":[{"event1":"B","result":"1"}]}
{"ID":"AA356","Timestamp":"20140103", "Usefulness":"No","Code":[{"event1":"B","result":"0"}]}
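As an alternative sketch (not part of the approach above), each line can also be parsed on its own with fromJSON(), which shows why the whole file fails while each individual record is fine. This assumes one complete JSON object per line; the names json_lines, parsed and df are hypothetical, and in practice json_lines would come from readLines() on the file.

```r
library(jsonlite)

# One JSON object per line; in practice: json_lines <- readLines("example1.json")
json_lines <- c(
  '{"ID":"12345","Timestamp":"20140101","Usefulness":"Yes","Code":[{"event1":"A","result":"1"}]}',
  '{"ID":"1A35B","Timestamp":"20140102","Usefulness":"No","Code":[{"event1":"B","result":"1"}]}',
  '{"ID":"AA356","Timestamp":"20140103","Usefulness":"No","Code":[{"event1":"B","result":"0"}]}'
)

# Parse every line as a separate JSON document.
parsed <- lapply(json_lines, fromJSON)

# Pull the scalar fields of interest out of each record and stack them.
df <- do.call(rbind, lapply(parsed, function(rec) {
  data.frame(
    Timestamp = rec$Timestamp,
    Usefulness = rec$Usefulness,
    stringsAsFactors = FALSE
  )
}))
print(df)
```

For large files stream_in() is likely the better choice, since it processes the input in pages rather than building one list element per line.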