How to loop over a JSON array in a JSON object

Question

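I have been trying to learn R, and I have a JSON file made up entirely of single-line JSON objects, each of which contains an array of account data. What I want to do is parse each line, take the JSON array from the parsed JSON object, and extract the account type and amount. My problem is that I don't know the best way to extract these two attributes.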

I tried using the dplyr package to extract "accountHistory" from each of my JSON lines, but I get a console error. When I try:

select(JsonAcctData, "accountHistory.type", "accountHistory.amount")

what actually happens is that my code only returns the type and amount of the last account for each row.

Right now my code writes to a csv file, and I can see all the data I need in it, but I just want to remove the ext

library("rjson")
library("dplyr")

parseJsonData <- function (sourceFile, outputFile) 
{
  #Get all total lines in the source file provided
  totalLines <- readLines(sourceFile)

  #Clean up old output file
  if(file.exists(outputFile)){
    file.remove(outputFile)
  }

  #Loop over each line in the sourceFile, 
  #parse the JSON and append to DataFrame
  JsonAcctData <- NULL
  for(i in 1:length(totalLines)){
    jsonValue <- fromJSON(totalLines[[i]])
    frame <- data.frame(jsonValue)
    JsonAcctData <- rbind(JsonAcctData, frame)
  }

  #Try to get filtered data
  filteredColumns <- 
    select(JsonAcctData, "accountHistory.type", "accountHistory.amount")
  print(filteredColumns)

  #Write the DataFrame to the output file in CSV format
  write.csv(JsonAcctData, file = outputFile)

}

Test JSON file data:

{"name":"Test1", "accountHistory":[{"amount":"107.62","date":"2012-02- 
  02T06:00:00.000Z","business":"CompanyA","name":"Home Loan Account 
  6220","type":"payment","account":"11111111"}, 
  {"amount":"650.88","date":"2012-02- 
  02T06:00:00.000Z","business":"CompanyF","name":"Checking Account 
  9001","type":"payment","account":"123123123"}, 
  {"amount":"878.63","date":"2012-02- 
  02T06:00:00.000Z","business":"CompanyG","name":"Money Market Account 
  8743","type":"deposit","account":"123123123"}]}
  {"name":"Test2", "accountHistory":[{"amount":"199.29","date":"2012-02-            
  02T06:00:00.000Z","business":"CompanyB","name":"Savings Account 
  3580","type":"invoice","account":"12312312"}, 
  {"amount":"841.48","date":"2012-02- 
  02T06:00:00.000Z","business":"Company","name":"Home Loan Account 
  5988","type":"payment","account":"123123123"}, 
  {"amount":"116.55","date":"2012-02- 
  02T06:00:00.000Z","business":"Company","name":"Auto Loan Account 
  1794","type":"withdrawal","account":"12312313"}]}

I expect to get a csv that contains only the account type and the amount held in each account.

Answer 1

Here is an approach using regex (in base R):

# read json 
json <- readLines('test.json', warn = FALSE)
# extract with regex
amount <- grep('\"amount\":\"\\d+\\.\\d+\"', json, value = TRUE)
amount <- as.numeric(gsub('.*amount\":\"(\\d+\\.\\d+)\".*', '\\1', amount, perl = TRUE))
type   <- grep('\"type\":\"\\w+\"', json, value = TRUE)
type   <- gsub('.*type\":\"(\\w+)\".*', '\\1', type, perl = TRUE)
# output
data.frame(type, amount)
#         type amount
# 1    payment 107.62
# 2    payment 650.88
# 3    deposit 878.63
# 4    invoice 199.29
# 5    payment 841.48
# 6 withdrawal 116.55
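
The regex above pulls one amount and one type out of each matching line of text, so it depends on how the file is split into lines. As an alternative, the accountHistory array can be walked with a JSON parser. The following is only a minimal sketch, assuming the rjson package from the question is installed and that each line is one complete JSON object. The helper name extractAccounts is hypothetical, and the two input lines are trimmed from the question's test data so that the example is self-contained.

```r
library(rjson)

# Two single-line JSON objects, trimmed from the question's test data
# (assumption: only the "amount" and "type" fields are kept here).
jsonLines <- c(
  '{"name":"Test1","accountHistory":[{"amount":"107.62","type":"payment"},{"amount":"650.88","type":"payment"},{"amount":"878.63","type":"deposit"}]}',
  '{"name":"Test2","accountHistory":[{"amount":"199.29","type":"invoice"},{"amount":"841.48","type":"payment"},{"amount":"116.55","type":"withdrawal"}]}'
)

# Hypothetical helper: parse one JSON line and return a data frame
# with one row per entry of its accountHistory array.
extractAccounts <- function(line) {
  parsed  <- fromJSON(line)
  history <- parsed[["accountHistory"]]
  data.frame(
    type   = vapply(history, function(acct) acct[["type"]], character(1)),
    amount = as.numeric(vapply(history, function(acct) acct[["amount"]], character(1))),
    stringsAsFactors = FALSE
  )
}

# Apply the helper to every line and stack the per-line results.
accounts <- do.call(rbind, lapply(jsonLines, extractAccounts))
print(accounts)
```

With a real file, jsonLines could instead come from readLines(sourceFile), and the result could be written with write.csv(accounts, file = outputFile, row.names = FALSE).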


r

rjson

dplyr