Adding "count where" in dplyr (R)
I am working with the dplyr library in R. I have the following data:
col1 = as.factor(c("a", "a", "a", "b", "b", "c", "c", "c"))
col2 = c(1,1,0,0,1, 0, 0, 1)
dplyr_data = data.frame(col1, col2)
head(dplyr_data)
col1 col2
1 a 1
2 a 1
3 a 0
4 b 0
5 b 1
6 c 0
7 c 0
8 c 1
I would like to know whether it is possible to write code like this directly:
library(dplyr)
summary_dplyr = data.frame(dplyr_data %>% group_by(col1) %>% dplyr::summarise(mean_count = mean(col1, na.rm = TRUE), special_count = count(1 - nrow(dplyr_data))))
This returns the following error:
Error: Problem with `summarise()` input `special_count`.
x no applicable method for 'group_vars' applied to an object of class "c('double', 'numeric')"
i Input `special_count` is `count(1 - nrow(dplyr_data))`.
i The error occurred in group 1: col1 = "a".
Run `rlang::last_error()` to see where the error occurred.
In addition: Warning messages:
1: In mean.default(col1, na.rm = TRUE) :
argument is not numeric or logical: returning NA
2: In mean.default(col1, na.rm = TRUE) :
argument is not numeric or logical: returning NA
3: In mean.default(col1, na.rm = TRUE) :
argument is not numeric or logical: returning NA
I am trying to get the following output:
col1 mean_count special_count
1 a 0.66 3-1 = 2
2 b 0.50 2-1 = 1
3 c 0.33 3-2 = 1
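Basically, "special_count" = for each unique col1 group (i.e. a, b, c): take the total number of rows and subtract the number of 0s in col2.
Can someone show me how to do this?
Thanks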
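You can't use count() inside summarize(), because count() expects a data frame rather than a vector, but you can count values by calling sum() on a logical (boolean) vector. sum(col2==0) tells you the number of rows with a 0, and n() gives the total number of rows (per group). Note also that the mean should be taken over col2: col1 is a factor, which is why mean(col1) returns NA with a warning.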
dplyr_data %>%
group_by(col1) %>%
summarize(mean_count=mean(col2),
special_count = n() - sum(col2==0))
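As a rough, self-contained sketch of the approach above, the snippet below rebuilds the data from the question and adds two helper columns so the arithmetic is visible. The names n_rows, n_zero and nonzero_count are illustrative additions, not part of the original answer, and the sketch assumes col2 contains no NA values.

```r
library(dplyr)

# Same data as in the question
col1 <- as.factor(c("a", "a", "a", "b", "b", "c", "c", "c"))
col2 <- c(1, 1, 0, 0, 1, 0, 0, 1)
dplyr_data <- data.frame(col1, col2)

summary_dplyr <- dplyr_data %>%
  group_by(col1) %>%
  summarise(
    mean_count    = mean(col2),
    n_rows        = n(),                   # rows in the current group (illustrative helper)
    n_zero        = sum(col2 == 0),        # TRUE counts as 1, FALSE as 0 (illustrative helper)
    special_count = n() - sum(col2 == 0),  # total rows minus zeros, as in the answer
    nonzero_count = sum(col2 != 0)         # same number, counted directly (illustrative helper)
  ) %>%
  as.data.frame()

print(summary_dplyr)
```

Because every row is either zero or non-zero, n() - sum(col2==0) and sum(col2 != 0) should agree when there are no missing values. If col2 could contain NA, both sums would need na.rm = TRUE and the two expressions would no longer be interchangeable.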