How to aggregate by the number of rows

The goal is to aggregate observations in blocks of a fixed number of rows.

To illustrate, here is some sample data:

structure(list(observation = c(1, 0, 0, 0, 0, 0, 1, 0, 0, 1, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 1)), class = "data.frame", row.names = c(NA, 
-20L), variable.labels = structure(character(0), .Names = character(0)), codepage = 65001L)

Visually, the data above looks like this:

╔═════════════╗
║ observation ║
╠═════════════╣
║      1      ║
╠═════════════╣
║      0      ║
╠═════════════╣
║      0      ║
╠═════════════╣
║      0      ║
╠═════════════╣
║      0      ║
╠═════════════╣
║      0      ║
╠═════════════╣
║      1      ║
╠═════════════╣
║      0      ║
╠═════════════╣
║      0      ║
╠═════════════╣
║      1      ║
╠═════════════╣
║      0      ║
╠═════════════╣
║      0      ║
╠═════════════╣
║      0      ║
╠═════════════╣
║      0      ║
╠═════════════╣
║      0      ║
╠═════════════╣
║      0      ║
╠═════════════╣
║      0      ║
╠═════════════╣
║      0      ║
╠═════════════╣
║      0      ║
╠═════════════╣
║      1      ║
╚═════════════╝

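The end goal is to aggregate every specified number of rows (for example, 10 rows, as in the sample output below) into one row holding the count of 1s and the mean of that block. The output would look like: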

╔═══════╦══════╗
║ count ║ mean ║
╠═══════╬══════╣
║   3   ║  0.3 ║
╠═══════╬══════╣
║   1   ║  0.1 ║
╚═══════╩══════╝

You can try the code below:

do.call(
  rbind,
  tapply(
    df$observation,
    ceiling(seq(nrow(df)) / 10),
    function(x) data.frame(count = sum(x), mean = mean(x))
  )
)

This gives:

  count mean
1     3  0.3
2     1  0.1
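
To see why this works, it can help to look at the grouping index on its own. The following is a minimal base R sketch, not part of the original answer: it rebuilds the sample data under the name df (the name the code above assumes) and stores the index in a variable called grp, which is an illustrative name only.

```r
# Rebuild the sample data from the question.
# The name df is an assumption matching the answer above.
df <- data.frame(
  observation = c(1, 0, 0, 0, 0, 0, 1, 0, 0, 1,
                  0, 0, 0, 0, 0, 0, 0, 0, 0, 1)
)

# Dividing the row index by 10 and rounding up maps
# rows 1-10 to block 1 and rows 11-20 to block 2.
grp <- ceiling(seq_len(nrow(df)) / 10)
print(grp)

# Same aggregation as above, computed one column at a time.
res <- data.frame(
  count = as.vector(tapply(df$observation, grp, sum)),
  mean  = as.vector(tapply(df$observation, grp, mean))
)
print(res)
```

Because the observations are coded 0/1, sum() doubles as the count of 1s and mean() as their share within each block.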

A tidyverse solution. Create a grouping variable based on row_number() mod 10:

library(tidyverse)

d %>%
    mutate(rn = cumsum(row_number() %% 10 == 1)) %>%
    group_by(rn) %>%
    summarise(count = sum(observation),
              mean = mean(observation))

     rn count  mean
  <int> <dbl> <dbl>
1     1     3   0.3
2     2     1   0.1

Using data.table:

library(data.table)
setDT(df1)[, .(count = sum(observation), mean = mean(observation)),
      .(grp = as.integer(gl(nrow(df1), 10, nrow(df1))))][, grp := NULL][]

-output

#   count mean
#1:     3  0.3
#2:     1  0.1
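
All three answers hard-code a block size of 10. As a rough sketch of how the block size could be made a parameter, here is a small base R helper. The function name aggregate_by_rows, its arguments, and the extra rows column are hypothetical and do not come from any of the answers above.

```r
# Hypothetical helper: aggregate a numeric vector in consecutive
# blocks of n rows. The last block may be shorter than n.
aggregate_by_rows <- function(x, n) {
  stopifnot(is.numeric(x), n >= 1)
  # Integer division of the zero-based index gives the block number.
  grp <- (seq_along(x) - 1) %/% n + 1
  data.frame(
    count = as.vector(tapply(x, grp, sum)),
    mean  = as.vector(tapply(x, grp, mean)),
    rows  = as.vector(tapply(x, grp, length))  # rows in each block
  )
}

obs <- c(1, 0, 0, 0, 0, 0, 1, 0, 0, 1,
         0, 0, 0, 0, 0, 0, 0, 0, 0, 1)

print(aggregate_by_rows(obs, 10))
print(aggregate_by_rows(obs, 8))  # last block holds only 4 rows
```

When the number of rows is not a multiple of n, the mean of the final block is taken over fewer rows, which is why the sketch also reports the rows per block.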