How to separate a column with an unequal number of values (reverse toString) in dplyr
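I'm working with survey data where multiple responses are stored in a single column. The problem is that there can be 1-5 answers, separated by commas.
How can I turn this: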
df <- data.frame(
splitThis = c("A,B,C","B,C","A,C","A","B","C")
)
> df
splitThis
1 A,B,C
2 B,C
3 A,C
4 A
5 B
6 C
into this:
intoThis <- data.frame(
  A = c(1,0,1,1,0,0),
  B = c(1,1,0,0,1,0),
  C = c(1,1,1,0,0,1)
)
> intoThis
A B C
1 1 1 1
2 0 1 1
3 1 0 1
4 1 0 0
5 0 1 0
6 0 0 1
Any help with the wrangling is appreciated!
We can use mtabulate from qdapTools after splitting on ",":
library(qdapTools)
mtabulate(strsplit(as.character(df$splitThis), ","))
# A B C
#1 1 1 1
#2 0 1 1
#3 1 0 1
#4 1 0 0
#5 0 1 0
#6 0 0 1
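If installing qdapTools is not an option, the same split-then-tabulate idea can be sketched in base R. This is only a minimal sketch, not what mtabulate does internally: the names parts, lvls and result are made up for illustration, and it produces 0/1 presence flags rather than counts, so an answer repeated within one response would still show as 1.

```r
df <- data.frame(
  splitThis = c("A,B,C", "B,C", "A,C", "A", "B", "C")
)

# Split each response on commas; this gives a list of character vectors.
parts <- strsplit(as.character(df$splitThis), ",", fixed = TRUE)

# Every distinct answer becomes one output column.
lvls <- sort(unique(unlist(parts)))

# For each row, flag which answers are present (1) or absent (0),
# then stack the per-row vectors into a matrix with one row per response.
indicators <- do.call(rbind, lapply(parts, function(x) as.integer(lvls %in% x)))
colnames(indicators) <- lvls

result <- as.data.frame(indicators)
result
```

Using do.call(rbind, ...) rather than sapply keeps the result a matrix with one row per response even when only a single distinct answer exists.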
Since the OP also mentioned dplyr/tidyr:
library(dplyr)
library(tidyr)
library(tibble)
rownames_to_column(df, "rn") %>%
separate_rows(splitThis) %>%
table()
Or using the tidyverse packages:
rownames_to_column(df, "rn") %>%
separate_rows(splitThis) %>%
group_by(rn, splitThis) %>%
tally %>%
spread(splitThis, n, fill=0) %>%
ungroup() %>%
select(-rn)
# A tibble: 6 × 3
# A B C
#* <dbl> <dbl> <dbl>
#1 1 1 1
#2 0 1 1
#3 1 0 1
#4 1 0 0
#5 0 1 0
#6 0 0 1
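In current tidyr, spread() is superseded by pivot_wider(), so the pipeline above could be written roughly as follows. This is a sketch that assumes tidyr 1.1.0 or later (for a scalar values_fill) and swaps in an integer row id from row_number() in place of rownames_to_column(). The character row names in the original would be sorted as text, which could reorder the rows once there are ten or more of them.

```r
library(dplyr)
library(tidyr)

df <- data.frame(
  splitThis = c("A,B,C", "B,C", "A,C", "A", "B", "C")
)

result <- df %>%
  mutate(rn = row_number()) %>%            # integer id for each original row
  separate_rows(splitThis, sep = ",") %>%  # one row per individual answer
  count(rn, splitThis) %>%                 # n = times each answer occurs per row
  pivot_wider(
    names_from  = splitThis,
    values_from = n,
    values_fill = 0                        # answers not given in a row become 0
  ) %>%
  select(-rn)

result
```

The columns here hold counts, so they come out as integers rather than the doubles shown in the spread() output.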