How to further melt a dataset?
I am working with the following data:
#Reproducible Example
Bills <- c("93-HCONRES-106", "93-HCONRES-107", "93-HCONRES-108")
Members <- c("00134", "00416;00010;00017;00026", "00416;00503;00513;00568")
data <- data.frame(Bills, Members)
The data looks like this:
#Data Structure
Bills Members
1 93-HCONRES-106 00134
2 93-HCONRES-107 00416;00010;00017;00026
3 93-HCONRES-108 00416;00503;00513;00568
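I would like to expand the dataset so that each bill is paired with each of its members, one row per bill-member pair. The result should look like this: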
Bills Members
93-HCONRES-106 00134
93-HCONRES-107 00416
93-HCONRES-107 00010
93-HCONRES-107 00017
93-HCONRES-107 00026
93-HCONRES-108 00416
93-HCONRES-108 00503
93-HCONRES-108 00513
93-HCONRES-108 00568
I would appreciate it if you could share any code.
Thank you very much for your help.
We can use separate_rows from tidyr:
library(dplyr)
library(tidyr)
data %>%
  separate_rows(Members)
# Bills Members
#1 93-HCONRES-106 00134
#2 93-HCONRES-107 00416
#3 93-HCONRES-107 00010
#4 93-HCONRES-107 00017
#5 93-HCONRES-107 00026
#6 93-HCONRES-108 00416
#7 93-HCONRES-108 00503
#8 93-HCONRES-108 00513
#9 93-HCONRES-108 00568
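As a hedged alternative to the call above: recent tidyr releases (1.3.0 and later, as far as I know) mark separate_rows as superseded in favour of separate_longer_delim, which takes the delimiter explicitly. The sketch below assumes such a version is installed and rebuilds the question's example data so that it runs on its own.

```r
library(tidyr)

# Example data from the question, rebuilt so the snippet is self-contained
Bills <- c("93-HCONRES-106", "93-HCONRES-107", "93-HCONRES-108")
Members <- c("00134", "00416;00010;00017;00026", "00416;00503;00513;00568")
data <- data.frame(Bills, Members, stringsAsFactors = FALSE)

# Split Members on the literal ";" and emit one row per piece
# (assumes tidyr >= 1.3.0, where separate_longer_delim is available)
long_tidyr <- separate_longer_delim(data, Members, delim = ";")
long_tidyr
```

Because the pieces stay character strings, the leading zeros in the member IDs are kept.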
Or extract the elements into a list column and then unnest it:
library(stringr)
data %>%
  mutate(Members = str_extract_all(Members, "[^;]+")) %>%
  unnest(c(Members))
Or with base R:
stack(setNames(strsplit(as.character(data$Members), ";"), data$Bills))
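One thing to be aware of: stack returns its columns as values and ind (with ind as a factor), not as Bills and Members. If the original column names and order matter, a minimal base R sketch is to repeat each bill once per member. The names parts and long below are just illustrative.

```r
# Example data from the question, rebuilt so the snippet is self-contained
Bills <- c("93-HCONRES-106", "93-HCONRES-107", "93-HCONRES-108")
Members <- c("00134", "00416;00010;00017;00026", "00416;00503;00513;00568")
data <- data.frame(Bills, Members, stringsAsFactors = FALSE)

# Split each Members string on the literal ";" (one character vector per bill)
parts <- strsplit(as.character(data$Members), ";", fixed = TRUE)

# Repeat each bill once per member and flatten the member list
long <- data.frame(
  Bills = rep(data$Bills, lengths(parts)),
  Members = unlist(parts),
  stringsAsFactors = FALSE
)
long
```

This keeps both columns as character vectors and needs no packages beyond base R.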
Using data.table:
library(data.table)
dt1 <- data.table(Bills, Members)
# Split Members into four columns, drop the original, then melt to long format
dt2 <- melt(
  dt1[, c("V1", "V2", "V3", "V4") := tstrsplit(Members, ";")][, Members := NULL],
  id.vars = "Bills"
)[!is.na(value)][order(Bills)]
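The data.table call above hard-codes four new columns (V1 to V4), so it assumes no bill has more than four members. A sketch that avoids that assumption is to split within each bill and let data.table stack the pieces. The name long_dt is illustrative.

```r
library(data.table)

# Example data from the question, rebuilt so the snippet is self-contained
Bills <- c("93-HCONRES-106", "93-HCONRES-107", "93-HCONRES-108")
Members <- c("00134", "00416;00010;00017;00026", "00416;00503;00513;00568")
dt <- data.table(Bills, Members)

# For each bill, split its Members string on ";" and return one row per member
long_dt <- dt[, .(Members = unlist(strsplit(Members, ";", fixed = TRUE))), by = Bills]
long_dt
```

Unlike the melt version, this leaves dt unmodified and returns the columns under their original names, Bills and Members.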