Reading and formatting multilevel, uneven JSON
I have a JSON document that looks like this:
{
  "timestamps": [
    "2020-12-17T20:05:00Z",
    "2020-12-17T20:10:00Z",
    "2020-12-17T20:15:00Z",
    "2020-12-17T20:20:00Z",
    "2020-12-17T20:25:00Z",
    "2020-12-17T20:30:00Z"
  ],
  "properties": [
    {
      "values": [
        -20.58975828559592,
        -19.356728999226693,
        -19.808982964173023,
        -19.673928070777993,
        -19.712275037138411,
        -19.48422739982918
      ],
      "name": "Neg Flow",
      "type": "Double"
    },
    {
      "values": [
        2,
        20,
        19,
        20,
        19,
        16
      ],
      "name": "Event Count",
      "type": "Long"
    }
  ],
  "progress": 100.0
}
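How can I convert it into a data frame like the one below? I can iterate over the individual items, but I would like to know whether there is a cleverer way to do this.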
+----------------------+---------------------+-------------+
|Time Stamps | Neg Flow | Event Count |
+----------------------+---------------------+-------------+
|2020-12-17T20:05:00Z |-20.58975828559592 | 2 |
+----------------------+---------------------+-------------+
|2020-12-17T20:10:00Z |-19.356728999226693 | 20 |
+----------------------+---------------------+-------------+
Here is one approach.
library(jsonlite)  # read JSON
library(dplyr)     # manipulate data frames
library(magrittr)  # for the %<>% assignment pipe

# temp.json is a file containing the JSON you provided
json_data <- read_json("temp.json")

# start with the timestamp column
data <- tibble(`Time Stamps` = unlist(json_data[["timestamps"]]))

# add one column per property
for (property in json_data[["properties"]]) {
  property_name <- property[["name"]]
  # dynamic column naming; for more detail, see the link at the end of this post
  data %<>% mutate({{property_name}} := unlist(property[["values"]]))
}
Output:
# A tibble: 6 x 3
  `Time Stamps`        `Neg Flow` `Event Count`
  <chr>                     <dbl>         <int>
1 2020-12-17T20:05:00Z      -20.6             2
2 2020-12-17T20:10:00Z      -19.4            20
3 2020-12-17T20:15:00Z      -19.8            19
4 2020-12-17T20:20:00Z      -19.7            20
5 2020-12-17T20:25:00Z      -19.7            19
6 2020-12-17T20:30:00Z      -19.5            16
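If you would rather avoid the explicit loop, the same table can probably be built in one pass by collecting the value vectors into a named list first. The following is only a sketch under a few assumptions: the JSON is inlined as a string and shortened to the first two time steps so the example is self-contained, and `json_text`, `props`, `value_cols` and `result` are names made up for illustration.

```r
library(jsonlite)  # fromJSON
library(dplyr)     # as_tibble (re-exported from tibble)

# Shortened copy of the JSON from the question (first two time steps only).
json_text <- '{
  "timestamps": ["2020-12-17T20:05:00Z", "2020-12-17T20:10:00Z"],
  "properties": [
    {"values": [-20.58975828559592, -19.356728999226693],
     "name": "Neg Flow", "type": "Double"},
    {"values": [2, 20], "name": "Event Count", "type": "Long"}
  ],
  "progress": 100.0
}'

# simplifyVector = FALSE keeps plain nested lists, like read_json() does.
json_data <- fromJSON(json_text, simplifyVector = FALSE)
props <- json_data[["properties"]]

# One vector per property, named after that property's "name" field.
value_cols <- lapply(props, function(p) unlist(p[["values"]]))
names(value_cols) <- vapply(props, function(p) p[["name"]], character(1))

# Prepend the timestamps and turn the named list into a tibble.
result <- as_tibble(c(
  list(`Time Stamps` = unlist(json_data[["timestamps"]])),
  value_cols
))
print(result)
```

This relies on every `values` array having the same length as `timestamps`; if one were shorter, `as_tibble()` would raise an error rather than pad the column.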
You can learn more about programming with dplyr here:
https://cran.r-project.org/web/packages/dplyr/vignettes/programming.html