Parsing JSONP files using R

JSON newbie here. Can you help me parse a JSON file with R? I tried jsonlite and rjson, but I keep getting errors.

Below is the data retrieved through the API.

data <- httr::GET("http://svcs.ebay.com/services/search/FindingService/v1?OPERATION-NAME=findItemsByKeywords&SERVICE-VERSION=1.0.0&SECURITY-APPNAME=YOUR-APP-123456&GLOBAL-ID=EBAY-US&RESPONSE-DATA-FORMAT=JSON&callback=_cb_findItemsByKeywords&REST-PAYLOAD&keywords=harry%20potter&paginationInput.entriesPerPage=10")

The JSON looks like this:

                        "Harry Potter: Complete 8-Film Collection (DVD, 2011, 8-Disc Set)"
                              "DVDs & Blu-ray Discs"
                        "Franklin Park,IL,USA"
                              "Brand New"

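First, there seems to be something wrong with your JSON file: it should start at the opening bracket "[". With everything before that bracket and after the closing "]" removed, rjson can read it: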

obj <- rjson::fromJSON(file = "v2.json")

obj now holds a list with the contents of v2.json.


# Read the file one line per row, without interpreting quotes or "#" comments
obj <- read.table("v2.json", sep = "\n", stringsAsFactors = FALSE, quote = "", comment.char = "")

# Index of the first line containing "[" (the backslash is doubled to escape it in an R string)
firstline <- grep("\\[", obj[, 1])[1]

# Position of the first "[" in that line
fpos <- which(strsplit(obj[firstline, 1], "")[[1]] == "[")[1]

# Index of the last line containing "]"
lastline <- grep("\\]", obj[, 1])
lastline <- lastline[length(lastline)]

# Position of the last "]" in that line
lpos <- which(strsplit(obj[lastline, 1], "")[[1]] == "]")
lpos <- lpos[length(lpos)]

# Trim both lines so that only the text from the first "[" through the last "]" remains.
# The end is trimmed first so that fpos stays valid when both brackets are on the same line.
obj[lastline, 1] <- stringr::str_sub(obj[lastline, 1], 1, lpos)
obj[firstline, 1] <- stringr::str_sub(obj[firstline, 1], fpos)

obj2 <- data.frame(obj[firstline:lastline, 1])
write.table(obj2, "v3.json", row.names = FALSE, col.names = FALSE, quote = FALSE)

obj3 <- rjson::fromJSON(file = "v3.json")
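
The same trimming can also be done in memory, without writing an intermediate file. The following is only a sketch under some assumptions: the sample string `raw` is made up for illustration (for a real file it could be built with `paste(readLines("v2.json"), collapse = "\n")`), and the payload is assumed to be a JSON array delimited by the first "[" and the last "]".

```r
library(jsonlite)

# Made-up text with junk around a JSON array (not real API output).
raw <- 'callback([{"title":"A"},{"title":"B"}]);'

# Position of the first "[" in the text.
start <- regexpr("[", raw, fixed = TRUE)

# Positions of every "]"; the last one closes the array.
ends <- gregexpr("]", raw, fixed = TRUE)[[1]]

# Keep only the text from the first "[" through the last "]".
json <- substr(raw, start, ends[length(ends)])

items <- fromJSON(json)
```

Here `items` should come back as a data frame with one `title` column, because jsonlite simplifies an array of objects with identical keys.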

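The problem is that your data is not JSON but JavaScript, specifically JSONP. If you only want to parse the JSON data, you have to strip off the padding callback function.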

req <- httr::GET("http://svcs.ebay.com/services/search/FindingService/v1?OPERATION-NAME=findItemsByKeywords&SERVICE-VERSION=1.0.0&SECURITY-APPNAME=YOUR-APP-123456&GLOBAL-ID=EBAY-US&RESPONSE-DATA-FORMAT=JSON&callback=_cb_findItemsByKeywords&REST-PAYLOAD&keywords=harry%20potter&paginationInput.entriesPerPage=10")
# Get the response body as text
txt <- httr::content(req, "text", encoding = "UTF-8")
# Strip the leading "/**/_cb_findItemsByKeywords(" and the trailing ")"
json <- sub("/**/_cb_findItemsByKeywords(", "", txt, fixed = TRUE)
json <- sub("\\)$", "", json)
mydata <- jsonlite::fromJSON(json)
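
The stripping step above depends on the exact callback name. A more general version might look like the sketch below. It assumes the response has the shape `callback(<json>)`, possibly preceded by a comment such as `/**/` and followed by `;`. The helper name `strip_jsonp` and the sample string are made up for illustration and are not real eBay output.

```r
library(jsonlite)

# Hypothetical helper: remove the JSONP wrapper "callback( ... )" around a JSON payload.
strip_jsonp <- function(txt) {
  # Drop everything up to and including the first "(".
  json <- sub("^[^(]*\\(", "", txt)
  # Drop the final ")" plus an optional ";" and any surrounding whitespace.
  sub("\\)[[:space:]]*;?[[:space:]]*$", "", json)
}

# Made-up sample response for illustration only.
txt <- '/**/_cb_findItemsByKeywords({"ack":["Success"],"count":2})'
mydata <- fromJSON(strip_jsonp(txt))
```

Because only the text before the first "(" and the final ")" are removed, parentheses inside the JSON payload itself are left untouched.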

Extra credit: alternatively, you can use an actual JavaScript engine to parse the JavaScript:

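ctx <- V8::v8()
ctx$eval("var out;")
# Define the callback so that calling it stores its argument in `out`
ctx$eval("function _cb_findItemsByKeywords(x){out = x;}")
# Run the JSONP response text (txt from above), which calls the callback
ctx$eval(txt)
mydata <- ctx$get("out")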