How to find sum of certain rows in R to get a grand total per row?
I have a dataset with each employee's capacity per month, and I want the total across all months for each employee:
library(dplyr)
data <- tibble(employee = c("Justin", "Corey","Sibley", "Justin", "Corey","Sibley"),
education = c("graudate", "student", "student", "graudate", "student", "student"),
fte_max_capacity = c(1, 2, 3, 1, 2, 3),
project = c("big", "medium", "small", "medium", "small", "small"),
aug_2021 = c(1, 1, 1, 1, 1, 1),
sep_2021 = c(1, 1, 1, 1, 1, 1),
oct_2021 = c(1, 1, 1, 1, 1, 1),
nov_2021 = c(1, 1, 1, 1, 1, 1))
I tried adapting some code I found, but it produces this error:
data %>%
dplyr::select(-contains("project")) %>%
dplyr::group_by(employee) %>%
mutate(sum = rowSums(select(., vars(contains("_20")))))
Error: Problem with `mutate()` input `sum`.
x Must subset columns with a valid subscript vector.
x Subscript has the wrong type `quosures`.
ℹ It must be numeric or character.
ℹ Input `sum` is `rowSums(select(., vars(contains("_20"))))`.
ℹ The error occurred in group 1: employee = "Corey".
I also tried a modified version of a solution from this website. That fails too, even though all the relevant columns are numeric:
data %>%
dplyr::select(-contains("project")) %>%
dplyr::group_by(employee) %>%
mutate_at(vars(contains("_20"), rowSums(., na.rm = T)))
Error: 'x' must be numeric
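The data is grouped, so use cur_data() inside select(). Otherwise the grouping variable (employee) comes along with the selection, and rowSums() fails because that column is not numeric. Also drop vars(): select() takes contains("_20") directly, and wrapping it in vars() is what triggers the "wrong type `quosures`" error.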
library(dplyr)
data %>%
dplyr::select(-contains("project")) %>%
dplyr::group_by(employee) %>%
dplyr::mutate(sum = sum(rowSums(select(cur_data(), contains("_20"))))) %>%
ungroup
-output
# A tibble: 6 x 8
  employee education fte_max_capacity aug_2021 sep_2021 oct_2021 nov_2021   sum
  <chr>    <chr>                <dbl>    <dbl>    <dbl>    <dbl>    <dbl> <dbl>
1 Justin   graudate                 1        1        1        1        1     8
2 Corey    student                  2        1        1        1        1     8
3 Sibley   student                  3        1        1        1        1     8
4 Justin   graudate                 1        1        1        1        1     8
5 Corey    student                  2        1        1        1        1     8
6 Sibley   student                  3        1        1        1        1     8
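If it helps to see the per-row total and the per-employee total as separate steps, here is a minimal sketch rather than a definitive solution. It assumes dplyr 1.0.0 or later (for across()) and uses a trimmed copy of the question's data; the column names row_total and employee_total are made up for illustration. Computing the row totals before grouping sidesteps the grouping-variable problem entirely.

```r
library(dplyr)

# Trimmed copy of the question's data: only the columns needed here
data <- tibble(
  employee = c("Justin", "Corey", "Sibley", "Justin", "Corey", "Sibley"),
  project  = c("big", "medium", "small", "medium", "small", "small"),
  aug_2021 = c(1, 1, 1, 1, 1, 1),
  sep_2021 = c(1, 1, 1, 1, 1, 1),
  oct_2021 = c(1, 1, 1, 1, 1, 1),
  nov_2021 = c(1, 1, 1, 1, 1, 1)
)

totals <- data %>%
  # Per-row total: sum the month columns while the data is still ungrouped
  mutate(row_total = rowSums(across(contains("_20")))) %>%
  # Per-employee total: add up the row totals within each employee
  group_by(employee) %>%
  mutate(employee_total = sum(row_total)) %>%
  ungroup()

print(totals)
```

Here employee_total should match the sum column in the output above. Note that cur_data() was deprecated in dplyr 1.1.0 in favour of pick(), so on newer versions rowSums(pick(contains("_20"))) is the likely replacement inside a grouped mutate().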