How do I read multiple JSON structures contained in one file?

Question

I have a .txt file with this structure:

section1#[{"p": "0.999834", "tag": "MA"},{"p": "1", "tag": "MO"},...etc...}]
section1#[{"p": "0.9995", "tag": "NC"},{"p": "1", "tag": "FL"},...etc...}]
...
section2#[{"p": "0.9995", "tag": "NC"},{"p": "1", "tag": "FL"},...etc...}]

I am trying to read it in R with these commands:

library(jsonlite)
data <- fromJSON("myfile.txt")

But I get:

Error in feed_push_parser(readBin(con, raw(), n), reset = TRUE) : 
  lexical error: invalid char in json text.
                                       section2#[{"p": "0.99
                     (right here) ------^

How can I read it section by section?

Answer 1

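Remove the section# prefix from each line. Each line is then a JSON array of objects, so the file as a whole behaves like a two-dimensional array. You can access elements from foo[0][0], the first object on the first line, up to foo[m][n], where m is the number of sections minus 1 and n is the number of objects in each section minus 1. (These indices are zero-based; R itself counts from 1.)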
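
A minimal sketch of that idea in R, assuming the lines have already been read (for example with readLines("myfile.txt")) and that the prefix never contains a # of its own. The variable names lines, json_lines and parsed are illustrative and not from the question; the sample lines are the shortened ones used elsewhere in this thread.

```r
library(jsonlite)

# Stand-in for readLines("myfile.txt"); sample data shortened from the question
lines <- c(
  'section1#[{"p": "0.999834", "tag": "MA"},{"p": "1", "tag": "MO"}]',
  'section1#[{"p": "0.9995", "tag": "NC"},{"p": "1", "tag": "FL"}]',
  'section2#[{"p": "0.9995", "tag": "NC"},{"p": "1", "tag": "FL"}]'
)

# Drop everything up to and including the first '#'
json_lines <- sub("^[^#]*#", "", lines)

# Parse each line on its own; simplifyVector = FALSE keeps plain nested lists
parsed <- lapply(json_lines, fromJSON, simplifyVector = FALSE)

# R indexes from 1, so foo[0][0] above corresponds to parsed[[1]][[1]]
first_tag <- parsed[[1]][[1]]$tag
last_p    <- parsed[[3]][[2]]$p
```

Here parsed is a list with one element per line, and each element is a list of objects with p and tag fields.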

Answer 2

Strip the prefix and bind the flattened JSON arrays into a single data frame:

raw_dat <- readLines(textConnection('section1#[{"p": "0.999834", "tag": "MA"},{"p": "1", "tag": "MO"}]
section1#[{"p": "0.9995", "tag": "NC"},{"p": "1", "tag": "FL"}]
section2#[{"p": "0.9995", "tag": "NC"},{"p": "1", "tag": "FL"}]'))

library(stringi)
library(purrr)
library(jsonlite)

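# Strip the leading "sectionN#" label, then parse each line and row-bind the results
# (map_df() needs dplyr installed; %>% is re-exported by purrr)
stri_replace_first_regex(raw_dat, "^section[[:digit:]]+#", "") %>% 
  map_df(fromJSON)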
##          p tag
## 1 0.999834  MA
## 2        1  MO
## 3   0.9995  NC
## 4        1  FL
## 5   0.9995  NC
## 6        1  FL
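
The output above no longer shows which section each row came from. If the section label matters, one possible variant keeps it as a column. This is a sketch using base R plus jsonlite rather than the packages in the answer; the names section, json, pieces, combined and by_section are illustrative, and converting p to numeric is an assumption about how the values will be used.

```r
library(jsonlite)

raw_dat <- c(
  'section1#[{"p": "0.999834", "tag": "MA"},{"p": "1", "tag": "MO"}]',
  'section1#[{"p": "0.9995", "tag": "NC"},{"p": "1", "tag": "FL"}]',
  'section2#[{"p": "0.9995", "tag": "NC"},{"p": "1", "tag": "FL"}]'
)

# Split each line at the first '#': label before it, JSON after it
section <- sub("#.*$", "", raw_dat)
json    <- sub("^[^#]*#", "", raw_dat)

# Parse each line into a data frame and tag its rows with the section label
pieces <- lapply(seq_along(raw_dat), function(i) {
  data.frame(section = section[i], fromJSON(json[i]), stringsAsFactors = FALSE)
})
combined <- do.call(rbind, pieces)

# p arrives as character because it is quoted in the JSON
combined$p <- as.numeric(combined$p)

# One data frame per section, addressed by name
by_section <- split(combined, combined$section)
```

With the sample data, by_section$section1 holds the four rows from the two section1 lines and by_section$section2 holds the remaining two.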
