Convert a string entry in a variable into multiple variables in R
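My data frame has a string variable that holds a long string (a JSON response) containing the names of the columns I want, each followed by its value.
My data frame looks like this:
- Each row is a participant
- The Participant column lists each participant
- Responses holds a string entry with the JSON response; I want the name at the start of each entry to become the variable and whatever follows the ":" to become the value.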
Participant | Responses |
---|---|
Emily | {"participantAge":"40","participantEducation":"Bachelors"} |
Doug | {"participantAge":"35","participantEducation":"Bachelors"} |
So, for example, the goal is to have participantAge as one column and participantEducation as another:
Participant | Responses | participantAge | participantEducation |
---|---|---|---|
Emily | {"} | 40 | Bachelors |
Doug | {"} | 35 | Bachelors |
I've done this before in Python by converting the JSON response into a dictionary, but I'm not sure how to do it in R.
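You can do this with dplyr, tidyr, and jsonlite as follows:
library(dplyr)
library(tidyr)     # provides unnest_wider()
library(jsonlite)
# Parse each row's JSON string into a list, then spread its fields into columns
df %>%
  rowwise() %>%
  mutate(Response = list(parse_json(Response))) %>%
  unnest_wider(Response)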
Output:
Participant participantAge participantEducation
<chr> <chr> <chr>
1 Emily 35 Bachelors
2 Doug 40 Bachelors
Input:
df = structure(list(Participant = c("Emily", "Doug"), Response = c("{\"participantAge\":\"35\",\"participantEducation\":\"Bachelors\"}",
"{\"participantAge\":\"40\",\"participantEducation\":\"Bachelors\"}"
)), class = "data.frame", row.names = c(NA, -2L))
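The same row-by-row idea can also be sketched with only jsonlite and base R. This is an illustration rather than the answer's own method: it reuses the example input above (with the column named `Response`) and assumes every response contains the same keys, so the parsed rows can be stacked with `rbind()`. The names `parsed` and `wide` are made up for the example.

```r
library(jsonlite)

# Same example input as above.
df <- data.frame(
  Participant = c("Emily", "Doug"),
  Response = c(
    '{"participantAge":"35","participantEducation":"Bachelors"}',
    '{"participantAge":"40","participantEducation":"Bachelors"}'
  ),
  stringsAsFactors = FALSE
)

# Parse each JSON string into a named list and turn it into a one-row data frame.
parsed <- lapply(df$Response, function(x) {
  as.data.frame(fromJSON(x), stringsAsFactors = FALSE)
})

# Stack the one-row data frames and attach the participant names.
# Assumption: all rows share the same set of keys.
wide <- cbind(df["Participant"], do.call(rbind, parsed))
print(wide)
```

If some responses may be missing keys, `rbind()` will fail on the mismatched columns, and the `unnest_wider()` approach above is likely the more forgiving choice.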
You can try the jsonlite package:
library("jsonlite")
dat_df <- data.frame(Emily='{"participantAge":"40","participantEducation":"Bachelors"}',
Doug='{"participantAge":"35","participantEducation":"Bachelors"}')
# Parse each column's JSON string; the result is a list named by participant
fromJSON_rec <- apply(dat_df, 2, fromJSON)
# Pre-allocate the result: one row per participant, three columns
new_df <- data.frame(matrix(NA, nrow=2, ncol=3))
colnames(new_df) <- c("Participant", "participantAge", "participantEducation")
# Fill one row per participant from the parsed fields
for(i in seq_along(fromJSON_rec)){
  new_df[i,] <- c(names(fromJSON_rec)[i],
                  fromJSON_rec[[i]][["participantAge"]],
                  fromJSON_rec[[i]][["participantEducation"]])
}
> dat_df
Emily Doug
1 {"participantAge":"40","participantEducation":"Bachelors"} {"participantAge":"35","participantEducation":"Bachelors"}
> new_df
Participant participantAge participantEducation
1 Emily 40 Bachelors
2 Doug 35 Bachelors
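For the layout described in the question (one row per participant, with columns `Participant` and `Responses`), a loop-free variant is to join all the strings into a single JSON array and let `fromJSON()` simplify it into a data frame. The sketch below assumes every entry is a complete, valid JSON object with the same keys; the names `json_array`, `parsed` and `result` are illustrative only. Converting the age to an integer is an optional extra step, not something the question asked for.

```r
library(jsonlite)

# The question's example data, one row per participant.
df <- data.frame(
  Participant = c("Emily", "Doug"),
  Responses = c(
    '{"participantAge":"40","participantEducation":"Bachelors"}',
    '{"participantAge":"35","participantEducation":"Bachelors"}'
  ),
  stringsAsFactors = FALSE
)

# Build "[{...},{...}]" so the whole column is one JSON array of objects.
json_array <- paste0("[", paste(df$Responses, collapse = ","), "]")

# fromJSON() simplifies an array of flat objects into a data frame.
parsed <- fromJSON(json_array)

# Optional: the ages arrive as strings, so convert them to integers.
parsed$participantAge <- as.integer(parsed$participantAge)

# Keep the original columns and append the parsed ones.
result <- cbind(df, parsed)
print(result)
```

This keeps the `Responses` column next to the new ones, as in the target table in the question; drop it with `result$Responses <- NULL` if it is no longer needed.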