Summarizing Multiple Columns for Multiple Data Frames with the Same Columns
I have 7 data frames with exactly the same structure:
# A tibble: 6 x 25
Full.Name `1_2019` `1_2020` `10_2019` `10_2020` `11_2019` `11_2020` `12_2019` `12_2020` `2_2019` `2_2020` `3_2019` `3_2020` `4_2019`
<chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 A. Patri~ 0.0108 0.00909 0.0121 0.0181 0.0112 0.0197 0.0133 0.0164 0.0191 0.0188 0.0207 0.0196 0.0164
2 Aaron P.~ 0 0 0 0 0 0 0 0 0 0 0 0 0.0714
3 Aaron P.~ 0 0 0 0 0 0 0 0 0 0 0 0 0
4 Adam H. ~ 0 0 0 0 0 0.0227 0 0 0 0 0 0.0182 0
5 Adam P. ~ 0 0.0123 0.0159 0.04 0.0153 0 0 0 0.0294 0.0177 0.00820 0 0
6 Adena T.~ 0.0104 0.0148 0.0252 0.0270 0.0185 0.0349 0.0240 0.0370 0.0175 0 0.0134 0.0116 0.0142
# ... with 11 more variables: `4_2020` <dbl>, `5_2019` <dbl>, `5_2020` <dbl>, `6_2019` <dbl>, `6_2020` <dbl>, `7_2019` <dbl>,
# `7_2020` <dbl>, `8_2019` <dbl>, `8_2020` <dbl>, `9_2019` <dbl>, `9_2020` <dbl>
All 7 data frames have the same Full.Name values, and all columns are identical. The only difference is the values in the x_20xx columns.
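I want to combine them into a new data frame by summing the values for each name row and month column. The new data frame should have the same columns, and the Full.Name column must be exactly the same. Each of the other columns must hold the sum across all 7 data frames.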
Any help is appreciated. For experimenting, you can simply copy the provided data frame into 7 data frames. The dput() output is as follows:
structure(list(Full.Name = c("A. Patrick Beharelle", "Aaron P. Graft",
"Aaron P. Jagdfeld", "Adam H. Schechter", "Adam P. Symson"),
`1_2019` = c(0.0107913669064748, 0, 0, 0, 0), `1_2020` = c(0.00909090909090909,
0, 0, 0, 0.0122699386503067), `10_2019` = c(0.0121212121212121,
0, 0, 0, 0.0158730158730159), `10_2020` = c(0.0181268882175227,
0, 0, 0, 0.04), `11_2019` = c(0.0111607142857143, 0, 0, 0,
0.0152671755725191), `11_2020` = c(0.0196779964221825, 0,
0, 0.0227272727272727, 0), `12_2019` = c(0.0133333333333333,
0, 0, 0, 0), `12_2020` = c(0.0163934426229508, 0, 0, 0, 0
), `2_2019` = c(0.0190641247833622, 0, 0, 0, 0.0294117647058824
), `2_2020` = c(0.0187793427230047, 0, 0, 0, 0.0176991150442478
), `3_2019` = c(0.0207006369426752, 0, 0, 0, 0.00819672131147541
), `3_2020` = c(0.0196078431372549, 0, 0, 0.0181818181818182,
0), `4_2019` = c(0.0164473684210526, 0.0714285714285714,
0, 0, 0), `4_2020` = c(0.0172413793103448, 0, 0, 0.0158730158730159,
0.0140845070422535), `5_2019` = c(0.0146252285191956, 0,
0, 0, 0.0222222222222222), `5_2020` = c(0.00623052959501558,
0, 0, 0.008, 0.00806451612903226), `6_2019` = c(0.0256410256410256,
0.0120481927710843, 0, 0.0434782608695652, 0.032258064516129
), `6_2020` = c(0.0300429184549356, 0, 0, 0, 0.0198019801980198
), `7_2019` = c(0.0107816711590297, 0, 0, 0, 0), `7_2020` = c(0.0108108108108108,
0, 0, 0.03125, 0), `8_2019` = c(0.0177514792899408, 0, 0,
0, 0.0306122448979592), `8_2020` = c(0.0149700598802395,
0, 0, 0.0212765957446809, 0.0909090909090909), `9_2019` = c(0.0146699266503667,
0, 0, 0.0555555555555556, 0.00917431192660551), `9_2020` = c(0.00738916256157635,
0.010989010989011, 0.2, 0, 0)), row.names = c(NA, -5L), class = c("tbl_df",
"tbl", "data.frame"))
Thanks in advance for any help. I appreciate it.
This should do it:
library(dplyr)
# Stack the seven data frames, then sum every month column per name
alldat <- bind_rows(dat1, dat2, dat3, dat4,
                    dat5, dat6, dat7)
alldat %>% group_by(Full.Name) %>%
  summarise(across(everything(), sum))
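To try this without the real data, here is a minimal sketch on a toy data frame. The names `dat`, `dats`, and `total` and the values are made up for illustration, and the seven copies follow the suggestion in the question. Because `summarise()` returns the groups in sorted order, the last line assumes you want the rows back in the original `Full.Name` order and restores it with `match()`.

```r
library(dplyr)

# Toy stand-in for one of the seven data frames (hypothetical values)
dat <- tibble(
  Full.Name = c("B. Name", "A. Name"),
  `1_2019`  = c(0.5, 0),
  `1_2020`  = c(0.25, 1)
)

# Seven identical copies, as suggested in the question for experimenting
dats <- rep(list(dat), 7)

# bind_rows() also accepts a list of data frames
total <- bind_rows(dats) %>%
  group_by(Full.Name) %>%
  summarise(across(everything(), sum), .groups = "drop")

# Groups come back sorted by Full.Name; restore the original row order
total <- total[match(dat$Full.Name, total$Full.Name), ]
```

Keeping the data frames in a list avoids typing `dat1` through `dat7` by hand and scales to any number of inputs.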