R: Equivalent of SUMIF and COUNTIF by a categorical variable across columns in R

Question

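Suppose I have a dataset with 10 columns. Nine of them are numeric and one is categorical, with values such as High, Medium, and Low. I want to summarize all nine numeric columns by the categorical variable in R (similar to SUMIF and COUNTIF in Excel).

How can I do this? I am new to R, so any help would be great. Thanks!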

Answer 1

If your data frame is called df and your categorical variable is called group.var, you can do this:

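library(dplyr)

# For each level of group.var, count the rows (n) and sum every other column.
# Note: summarise_each() and funs() are deprecated in current dplyr releases.
df %>% group_by(group.var) %>%
  summarise_each(funs(n(), sum))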

Example with the built-in iris data frame:

iris %>% group_by(Species) %>%
  summarise_each(funs(n(), sum))

     Species Sepal.Length_n Sepal.Width_n Petal.Length_n Petal.Width_n Sepal.Length_sum Sepal.Width_sum Petal.Length_sum Petal.Width_sum
      (fctr)          (int)         (int)          (int)         (int)            (dbl)           (dbl)            (dbl)           (dbl)
1     setosa             50            50             50            50            250.3           171.4             73.1            12.3
2 versicolor             50            50             50            50            296.8           138.5            213.0            66.3
3  virginica             50            50             50            50            329.4           148.7            277.6           101.3
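
Since summarise_each() and funs() are deprecated, here is a sketch of the same summary written with across(), assuming dplyr 1.0.0 or later is installed. The name res is only illustrative. The count is computed after across() so that where(is.numeric) does not pick up the new n column and sum it as well.

```r
library(dplyr)

res <- iris %>%
  group_by(Species) %>%
  summarise(
    # SUMIF-like: sum each numeric column within each Species level
    across(where(is.numeric), sum, .names = "{.col}_sum"),
    # COUNTIF-like: number of rows in each Species level
    n = n()
  )

res
```

This returns one row per Species with a single n column rather than one count per variable, which is usually enough when the columns have no missing values.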

There are many other options as well (for example, the data.table package, and base R solutions using tapply, aggregate, and so on).
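
As one possible base R version, here is a minimal sketch using aggregate() for the sums and table() for the counts, again on iris. The names sums and counts are illustrative. Note that the formula interface of aggregate() drops rows containing NA by default.

```r
# SUMIF-like: sum every other column of iris within each Species level.
sums <- aggregate(. ~ Species, data = iris, FUN = sum)

# COUNTIF-like: number of rows for each Species level.
counts <- table(iris$Species)

sums
counts
```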

Answer 2

Before moving on to the dizzying array of packages (useful as they may be), it helps to learn the basic R idioms for these kinds of operations.

by(iris, iris$Species, summary)

This splits a data.frame by the grouping factor and applies a function to each subset. If you need to operate on a vector rather than a data.frame, see ?tapply.

tapply(iris$Sepal.Length, iris$Species, summary)
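
To get closer to what SUMIF and COUNTIF return, the same tapply idiom can be used with sum and length instead of summary. The sketch below is one way to do it, and the variable names are only illustrative. The last step repeats the sum over every numeric column.

```r
# SUMIF-like: total Sepal.Length within each Species level.
sum_by_species <- tapply(iris$Sepal.Length, iris$Species, sum)

# COUNTIF-like: number of observations within each Species level.
count_by_species <- tapply(iris$Sepal.Length, iris$Species, length)

# Apply the grouped sum to every numeric column. The result is a matrix
# with one row per Species level and one column per numeric variable.
num_cols <- sapply(iris, is.numeric)
sums_all <- sapply(iris[num_cols], function(col) tapply(col, iris$Species, sum))

sums_all
```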

