How to print the first ten rows of a JSON file in R

I have an R data frame and use the jsonlite package to convert it to JSON format:
jsonData <- toJSON(dataset)

I want to verify that the conversion succeeded, but the dataset is large. How can I print the first 5 lines of the JSON file?

Converting an R object to JSON

Several packages implement a toJSON() function, for example jsonlite, rjson, and RJSONIO. I will use jsonlite below. The solution also works with RJSONIO, but not with rjson.

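I have not found a way to directly print only some of the lines of a JSON string. The reason is that all three packages return a single string (i.e. a character vector of length 1) rather than a character vector in which each line of the JSON string occupies one element: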

length(jsonlite::toJSON(mtcars))
## [1] 1

Indeed, the converted object is one long line of text:

jsonlite::toJSON(mtcars)
## [{"mpg":21,"cyl":6,"disp":160,"hp":110,"drat":3.9,"wt":2.62,"qsec":16.46,"vs":0,"am":1,"gear":4,"carb":4,"_row":"Mazda RX4"},{"mpg":21,"cyl":6,"disp":160,"hp":110,"drat":3.9,"wt":2.875,"qsec":17.02,"vs":0,"am":1,"gear":4,"carb":4,"_row":"Mazda RX4 Wag"},{"mpg":22.8,"cyl":4,"disp":108,"hp":93,"drat":3.85,"wt":2.32,"qsec":18.61,"vs":1,"am":1,"gear":4,"carb":1,"_row":"Datsun 710"},{"mpg":21.4,"cyl":6,"disp":258,"hp":110,"drat":3.08,"wt":3.215,"qsec":19.44,"vs":1,"am":0,"gear":3,"carb":1,"_row":"Hornet 4 Drive"},{"mpg":18.7,"cyl":8,"disp":360,"hp":175,"drat":3.15,"wt":3.44,"qsec":17.02,"vs":0,"am":0,"gear":3,"carb":2,"_row":"Hornet Sportabout"},{"mpg":18.1,"cyl":6,"disp":225,"hp":105,"drat":2.76,"wt":3.46,"qsec":20.22,"vs":1,"am":0,"gear":3,"carb":1,"_row":"Valiant"},{"mpg":14.3,"cyl":8,"disp":360,"hp":245,"drat":3.21,"wt":3.57,"qsec":15.84,"vs":0,"am":0,"gear":3,"carb":4,"_row":"Duster 360"},{"mpg":24.4,"cyl":4,"disp":146.7,"hp":62,"drat":3.69,"wt":3.19,"qsec":20,"vs":1,"am":0,"gear":4,"c... <truncated>

Because there is only one line, printing just the first few lines obviously gets you nowhere.

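However, the toJSON() function in jsonlite (as well as the one in RJSONIO) lets you break the JSON string into several lines (I have truncated the output by hand because it is too long):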

jsonlite::toJSON(mtcars, pretty = TRUE)
## [
##   {
##     "mpg": 21,
##     "cyl": 6,
##     "disp": 160,
##     "hp": 110,
##     "drat": 3.9,
##     "wt": 2.62,
## ...

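The result is still a character vector of length 1, but the lines are now separated by newline characters (\n), which is enough for your purpose: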

length(jsonlite::toJSON(mtcars, pretty = TRUE))
## [1] 1
as.character(jsonlite::toJSON(mtcars, pretty = TRUE))
## [1] "[\n  {\n    \"mpg\": 21,\n    \"cyl\": 6,\n    \"disp\": 160,\n    \"hp\": 110,\n    \"drat\": 3.9,\n    \"wt\": 2.62,\n    \"qsec\": 16.46,\n    \"vs\": 0,\n    \"am\": 1,\n    \"gear\": 4,\n    \"carb\": 4,\n    \"_row\": \"Mazda RX4\"\n  },\n  {\n    \"mpg\": 21,\n    \"cyl\": 6,\n    \"disp\": 160,\n    \"hp\": 110,\n    \"drat\": 3.9,\n    \"wt\": 2.875,\n    \"qsec\": 17.02,\n    \"vs\": 0,\n    \"am\": 1,\n    \"gear\": 4,\n    \"carb\": 4,\n    \"_row\": \"Mazda RX4 Wag\"\n  },\n  {\n    \"mpg\": 22.8,\n    \"cyl\": 4,\n    \"disp\": 108,\n    \"hp\": 93,\n    \"drat\": 3.85,\n    \"wt\": 2.32,\n    \"qsec\": 18.61,\n    \"vs\": 1,\n    \"am\": 1,\n    \"gear\": 4,\n    \"carb\": 1,\n    \"_row\": \"Datsun 710\"\n  },\n  {\n    \"mpg\": 21.4,\n    \"cyl\": 6,\n    \"disp\": 258,\n    \"hp\": 110,\n    \"drat\": 3.08,\n    \"wt\": 3.215,\n    \"qsec\": 19.44,\n    \"vs\": 1,\n    \"am\": 0,\n    \"gear\": 3,\n    \"carb\": 1,\n    \"_row\": \"Hornet 4 Drive\"\n  },\n  {\n  ... <truncated>
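To make the role of the newline characters concrete, here is a minimal sketch in base R. The two strings are hand-written stand-ins for compact and pretty-printed toJSON() output (an assumption made so that the snippet runs without any JSON package); the variable names are only illustrative.

```r
# Hand-written stand-ins (assumptions) for compact and pretty-printed JSON.
compact_json <- "[{\"mpg\":21,\"cyl\":6}]"
pretty_json  <- "[\n  {\n    \"mpg\": 21,\n    \"cyl\": 6\n  }\n]"

# Both are single strings, i.e. character vectors of length 1.
length(compact_json)
length(pretty_json)

# Splitting on "\n" only produces several elements for the pretty version.
compact_lines <- strsplit(compact_json, "\n", fixed = TRUE)[[1]]
pretty_lines  <- strsplit(pretty_json, "\n", fixed = TRUE)[[1]]

length(compact_lines)  # one element: there is no newline to split on
length(pretty_lines)   # one element per line of the pretty-printed text
```

Only the pretty-printed string can be cut down to its "first few lines", which is why the pretty = TRUE argument matters in what follows.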

Printing only a subset of lines from the JSON

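I wrote a small function that takes a JSON object as input and prints a subset of its lines. It only works if pretty = TRUE was used when the JSON object was created. Here it is: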

print_json_lines <- function(json, lines) {

  # break up into lines
  json_lines <- strsplit(json, "\n")[[1]]

  # get desired lines
  json_lines <- json_lines[lines]

  # print
  cat(paste(json_lines, collapse = "\n"))

  # return invisibly
  invisible(json_lines)

}

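It uses strsplit() to split the string into a character vector with one entry per line. The lines can then be selected with ordinary indexing via []. Because simply printing a character vector may put several strings on one line, I combine the subset of lines into a single string again (using paste()), with \n separating the lines. This gives nicely formatted output: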

print_json_lines(jsonlite::toJSON(mtcars, pretty = TRUE), 1:5)
## [
##   {
##     "mpg": 21,
##     "cyl": 6,
##     "disp": 160,
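The difference between print() and the paste()/cat() combination can be seen on a small example. This is a sketch in base R; the vector json_lines is a hand-written stand-in for the first lines of a pretty-printed JSON string, and writeLines() is shown only as a possible alternative, not as part of the function above.

```r
# Hand-written stand-in (assumption) for a few lines of pretty-printed JSON.
json_lines <- c("[", "  {", "    \"mpg\": 21,")

# print() shows a quoted, escaped vector and may put several elements
# on the same output line.
print(json_lines)

# Collapsing with "\n" and using cat() prints the raw text, one line each.
collapsed <- paste(json_lines, collapse = "\n")
cat(collapsed, "\n", sep = "")

# writeLines() is a base R alternative: it writes each element on its own
# line and ends the output with a newline.
writeLines(json_lines)
```

Note that cat() does not add a trailing newline by itself, so the explicit "\n" in the sketch keeps the console prompt from ending up on the last line of output.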

Remarks on the different JSON packages

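As I mentioned at the beginning, this solution works with jsonlite and RJSONIO. The reason is simply that both let you break the JSON string into lines with pretty = TRUE. However, the output looks different when you use RJSONIO, because it follows a different convention in the conversion: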

print_json_lines(RJSONIO::toJSON(mtcars, pretty = TRUE), 1:5)
## {
##     "mpg" : [
##             21,
##             21,
##             22.8,

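The function does not work with rjson because, as far as I know, that package offers no way to split the JSON object into multiple lines.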
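One possible workaround for a compact JSON string, whatever package produced it, is to re-indent it with jsonlite::prettify() before splitting it into lines. The sketch below assumes jsonlite is installed; the compact string is a hand-written stand-in rather than real rjson output, and the indentation width is chosen by prettify(), so the result will not look exactly like the toJSON(..., pretty = TRUE) output shown earlier.

```r
library(jsonlite)

# Hand-written stand-in (assumption) for a compact, single-line JSON string.
compact_json <- "[{\"mpg\":21,\"cyl\":6}]"

# prettify() re-formats an existing JSON string with line breaks and indentation.
pretty_json <- prettify(compact_json)

# The re-formatted string can now be split into lines and subset as before.
json_lines <- strsplit(as.character(pretty_json), "\n", fixed = TRUE)[[1]]
writeLines(head(json_lines, 3))
```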