通过迭代函数一步导入和合并数据集?
import and merge dataset all in one step via iterative function?
这些是我正在处理的数据集。
由于 P-ones 具有以下结构
# A tibble: 25 x 5
ID GR SES COND `LPP2(1000-1500).POz`
<chr> <chr> <chr> <chr> <dbl>
1 01 RP V NEG-CTR 7.91
2 04 RP V NEG-CTR 1.07
3 06 RP V NEG-CTR 0.742
4 07 RP V NEG-CTR 4.49
5 08 RP V NEG-CTR -2.43
6 09 RP V NEG-CTR 0.649
7 10 RP V NEG-CTR 1.10
8 11 RP V NEG-CTR -2.07
9 12 RP V NEG-CTR 10.9
10 13 RP V NEG-CTR 5.17
虽然 V-ones 有以下这些,但数量更多:
> `RP-V-NEG-CTR`
# A tibble: 25 x 16
ID GR SES COND `P3(400-450).FCz` `P3(400-450).Cz` `P3(400-450).Pz` `LPPearly(500-700)~
<chr> <chr> <chr> <chr> <dbl> <dbl> <dbl> <dbl>
1 01 RP V NEG-CTR -11.6 -5.17 11.9 -11.8
2 04 RP V NEG-CTR -0.314 2.23 15.1 -4.02
3 06 RP V NEG-CTR -0.214 -1.30 3.14 4.47
4 07 RP V NEG-CTR -2.83 2.19 13.7 -0.884
5 08 RP V NEG-CTR 4.24 2.45 7.48 1.15
6 09 RP V NEG-CTR 9.57 13.7 23.4 13.3
7 10 RP V NEG-CTR -6.13 -4.13 6.27 0.0229
8 11 RP V NEG-CTR 0.529 5.43 13.0 1.90
9 12 RP V NEG-CTR -7.74 -2.41 15.8 -8.08
10 13 RP V NEG-CTR 1.27 3.48 9.94 1.75
# ... with 15 more rows, and 8 more variables: LPPearly(500-700).Cz <dbl>,
# LPPearly(500-700).Pz <dbl>, LPP1(500-1000).FCz <dbl>, LPP1(500-1000).Cz <dbl>,
# LPP1(500-1000).Pz <dbl>, LPP2(1000-1500).FCz <dbl>, LPP2(1000-1500).Cz <dbl>,
# LPP2(1000-1500).Pz <dbl>
>
我使用以下过程将它们导入并合并在一起:
files <- list.files(pattern = "\.xls")
for (i in 1:length(files)) {
assign(gsub("\.xls", "", files[i]), readxl::read_xls(files[i]))
}
data <- `RP-V-NEG-CTR` %>%
add_row(`RP-V-NEG-NOC`) %>%
add_row(`RP-V-NEU-NOC`)%>%
arrange(ID)
data1 <- `RP-POz-NEG-CTR` %>%
add_row(`RP-POz-NEG-NOC`) %>%
add_row(`RP-POz-NEU-NOC`)%>%
arrange(ID)
data <- merge(data, data1, by = c('ID', 'GR', 'SES', 'COND'))
由于尝试了以下代码,我获得了一个不匹配的合并数据集:
filenames <- list.files(pattern = '^RP.*\.xls$')
> data <- purrr::map_df(filenames, readxl::read_excel)
像这样:
如果我想以正确的方式导入已经合并的内容,我应该do/adjust做什么?
提前致谢
也许我们需要map2
library(dplyr)
library(purrr)
library(stringr)
filesv <- list.files(pattern = 'RP-V-.*\.xls', full.names = TRUE)
filesp <- list.files(pattern = 'RP-P-.*\.xls', full.names = TRUE)
nm1 <- str_c(str_remove(basename(filesv), "\.xls"),
str_remove(basename(filesp), "\.xls"), sep="_")
out <- map2(filesv, filesp, ~ {
vdat <- readxl::read_excel(.x)
pdat <- readxl::read_excel(.y)
inner_join(vdat, pdat, by = c('ID', 'GR', 'SES', 'COND'))
}) %>%
setNames(nm1) %>%
bind_rows(.id = 'grp')
或者如果我们不需要 'grp' 列
out <- map2_dfr(filesv, filesp, ~ {
vdat <- readxl::read_excel(.x)
pdat <- readxl::read_excel(.y)
inner_join(vdat, pdat, by = c('ID', 'GR', 'SES', 'COND'))
})