Error in dplyr group_by %>% summarize_if()
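I'm working with a fairly small dataset and trying to summarize the columns by their mean while grouping by the first column. Right now I have a df (LitterMean) that looks like this: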
date3 TotalBorn LiveBorn StillBorn Mummies
1 7/6 12 12 0 0
2 7/6 20 15 2 3
3 6/29 14 14 0 0
4 7/6 11 10 1 0
5 7/6 16 15 1 0
6 7/6 11 11 0 0
I tried running
LitterMean %>%
group_by(date3) %>%
summarize_if(LitterMean, is.numeric, mean, na.rm=TRUE)
which returns
Error: `.predicate` must have length 1, not 5.
Run `rlang::last_error()` to see where the error occurred.
So I ran rlang::last_error() and got
<error/rlang_error>
`.predicate` must have length 1, not 5.
Backtrace:
1. `%>%`(...)
2. dplyr::summarize_if(., LitterMean, is.numeric, mean, na.rm = TRUE)
3. dplyr:::manip_if(...)
4. dplyr:::tbl_if_syms(.tbl, .predicate, .env, .include_group_vars = .include_group_vars)
8. dplyr:::tbl_if_vars(.tbl, .p, .env, ..., .include_group_vars = .include_group_vars)
9. dplyr:::bad_args(".predicate", "must have length 1, not {length(.p)}.")
10. dplyr:::glubort(fmt_args(args), ..., .envir = .envir)
Run `rlang::last_trace()` to see the full context.
The following shows that I do have NA entries.
sum(is.na(LitterMean))
[1] 5
Does anyone know whether I'm missing something in my code that would prevent the error above?
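You just need to call summarize_if correctly. The pipe already passes the grouped data as the first argument, so the extra LitterMean lands in the .predicate slot (the backtrace shows summarize_if(., LitterMean, is.numeric, ...)), and that is what triggers the length error. Drop it, like this: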
LitterMean %>%
group_by(date3) %>%
summarize_if(is.numeric, mean, na.rm=TRUE)
Expected result:
> LitterMean %>%
+ group_by(date3) %>%
+ summarize_if(is.numeric, mean, na.rm=TRUE)
# A tibble: 2 × 5
date3 TotalBorn LiveBorn StillBorn Mummies
<chr> <dbl> <dbl> <dbl> <dbl>
1 6/29 14 14 0 0
2 7/6 14 12.6 0.8 0.6
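To reproduce this, here is a minimal sketch that rebuilds the six sample rows from the question as a data frame and runs the corrected call. It assumes date3 is a character column, as the <chr> in the output suggests. The NA values mentioned in the question are not among the rows shown, so this sample contains none.

```r
library(dplyr)

# Sample rows copied from the question (assumption: date3 is character).
LitterMean <- data.frame(
  date3     = c("7/6", "7/6", "6/29", "7/6", "7/6", "7/6"),
  TotalBorn = c(12, 20, 14, 11, 16, 11),
  LiveBorn  = c(12, 15, 14, 10, 15, 11),
  StillBorn = c(0, 2, 0, 1, 1, 0),
  Mummies   = c(0, 3, 0, 0, 0, 0)
)

# The pipe supplies the data, so the predicate is the first
# argument written inside summarize_if().
result <- LitterMean %>%
  group_by(date3) %>%
  summarize_if(is.numeric, mean, na.rm = TRUE)

print(result)
```

The grouping column date3 is left out of the summary automatically because it is not numeric.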
You should use across:
"Scoped verbs (_if, _at, _all) have been superseded by the use of across() in an existing verb. See vignette("colwise") for details."
https://dplyr.tidyverse.org/reference/summarise_all.html
library(dplyr)
LitterMean %>%
group_by(date3) %>%
summarise(across(where(is.numeric), mean))
date3 TotalBorn LiveBorn StillBorn Mummies
<chr> <dbl> <dbl> <dbl> <dbl>
1 6/29 14 14 0 0
2 7/6 14 12.6 0.8 0.6
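One caveat: the across() call above uses plain mean, so any NA in a group would make that group's mean NA. Since the question mentions missing values, a possible way to keep the na.rm behaviour is to wrap mean in a lambda. The sketch below uses made-up toy data (toy, grp and x are hypothetical names, not from the question) just to show the effect.

```r
library(dplyr)

# Hypothetical toy data with one missing value in group "a".
toy <- data.frame(
  grp = c("a", "a", "b"),
  x   = c(1, NA, 3)
)

# The formula lambda forwards na.rm = TRUE to mean() for every
# numeric column selected by where(is.numeric).
result <- toy %>%
  group_by(grp) %>%
  summarise(across(where(is.numeric), ~ mean(.x, na.rm = TRUE)))

print(result)
```

Without the lambda, the mean of x for group "a" would come back as NA instead of 1.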