Finding cumulative sum and then average the values in R

Question

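I want to compute the cumulative sum across the first (n-1) columns (given a matrix with n columns) and then average those values. I created a sample matrix for this task. I have the following matrix: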

ma = matrix(c(1:10), nrow = 2, ncol = 5)
ma
     [,1] [,2] [,3] [,4] [,5]
[1,]    1    3    5    7    9
[2,]    2    4    6    8   10

I want to obtain the following:

ans = matrix(c(1,2,2,3,3,4,4,5), nrow = 2, ncol = 4)
ans
     [,1] [,2] [,3] [,4]
[1,]    1    2    3    4
[2,]    2    3    4    5

Below is my R function.

ColCumSumsAve <- function(y){
  for(i in seq_len(dim(y)[2]-1)) {
    y[,i] <- cumsum(y[,i])/i
  }
}
ColCumSumsAve(ma)

However, when I run the function above, it produces no output. Is there an error in the code?

Thanks.

Answer 1

Here is how I would do it:

> t(apply(ma, 1, function(x) cumsum(x) / 1:length(x)))[,-NCOL(ma)]
     [,1] [,2] [,3] [,4]
[1,]    1    2    3    4
[2,]    2    3    4    5

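This applies the cumsum function row-wise to the matrix ma and then divides by the position index to get the running average (cumsum(x) and 1:length(x) have the same length). The result is then simply transposed with t, and the last column is removed with [,-NCOL(ma)].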
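To make the one-liner easier to follow, the same computation can be split into steps. This is only a sketch; the intermediate names cs, avg and result are introduced here for illustration and are not part of the answer.

```r
ma <- matrix(1:10, nrow = 2, ncol = 5)

# apply() over rows returns each row's result as a column,
# so the cumulative sums come back transposed (5 x 2).
cs <- apply(ma, 1, cumsum)

# Dividing by 1, 2, ..., nrow(cs) recycles down each column, so the
# j-th cumulative sum is divided by j, which gives the running mean.
avg <- cs / seq_len(nrow(cs))

# Transpose back to the original orientation and drop the last column.
result <- t(avg)[, -ncol(ma)]
result
```

Note that the transpose is needed only because apply() with MARGIN = 1 stacks the per-row results as columns.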

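Your function produces no output because it does not return anything. You should end the function with return(y) or y, as Marius suggested. Even then, your function does not seem to give the correct result.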
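A small sketch of why nothing is printed: the last expression evaluated in the function is the for loop, and a for loop evaluates to an invisible NULL. The function name no_return and the variable out are hypothetical, used only for this demonstration.

```r
ma <- matrix(1:10, nrow = 2, ncol = 5)

no_return <- function(y) {
  # The loop modifies the local copy of y, but its own value is an
  # invisible NULL, and that is what the function returns.
  for (i in seq_len(ncol(y) - 1)) {
    y[, i] <- cumsum(y[, i]) / i
  }
}

out <- no_return(ma)
is.null(out)                                     # TRUE
identical(ma, matrix(1:10, nrow = 2, ncol = 5))  # TRUE: the caller's matrix is unchanged
```

Because R passes arguments by value (copy on modify), the assignments inside the loop never reach the caller's ma, so the modified matrix has to be returned explicitly.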

Answer 2

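# Row-wise cumulative sums, with the last column dropped.
# (Use ncol(ma) here: k does not exist yet when the index is evaluated.)
k <- t(apply(ma, 1, cumsum))[, -ncol(ma)]
# Divide column i by i to turn the cumulative sums into running means.
for (i in 1:ncol(k)) {
  k[, i] <- k[, i] / i
}
k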

This should work.
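As a possible alternative to the explicit loop, the column-wise division can be written with sweep() from base R. This is a sketch of a swapped-in technique, not what the answer itself proposes; the names k and k_avg are used here for illustration.

```r
ma <- matrix(1:10, nrow = 2, ncol = 5)

# Row-wise cumulative sums without the last column.
k <- t(apply(ma, 1, cumsum))[, -ncol(ma)]

# sweep() with MARGIN = 2 divides column j by the j-th element of STATS.
k_avg <- sweep(k, 2, seq_len(ncol(k)), "/")
k_avg
```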

Answer 3

There are several errors.

Solution

Here is what I tested and found to work:

colCumSumAve <- function(m) {
  csum <- t(apply(X=m, MARGIN=1, FUN=cumsum))
  res <- t(Reduce(`/`, list(t(csum), 1:ncol(m))))
  res[, 1:(ncol(m)-1)]
}

Test:

> colCumSumAve(ma)
     [,1] [,2] [,3] [,4]
[1,]    1    2    3    4
[2,]    2    3    4    5

This is correct.

Explanation:

colCumSumAve <- function(m) {
  csum <- t(apply(X=m, MARGIN=1, FUN=cumsum)) # calculate row-wise cumsum
  res <- t(Reduce(`/`, list(t(csum), 1:ncol(m))))
  # This is the trickiest part.
  # Because `csum` is a matrix, it is treated like a vector (column by
  # column) when `Reduce`-ing using `/` with the vector `1:ncol(m)`.
  # To get quasi-row-wise treatment, I change the orientation
  # of the matrix with `t()`.
  # However, the output will be in this transposed orientation
  # as a consequence. So I transpose back by applying `t()`
  # on the entire result at the end, to get the original
  # input matrix orientation again.
  # `Reduce` using `/` over the sequential list of `t(csum)` and
  # `1:ncol(m)` has the effect of dividing the `csum` values by their
  # corresponding column position.
  res[, 1:(ncol(m)-1)] # removes last column for the answer.
  # This, of course, could be done right at the beginning,
  # saving the calculation of values in the last column,
  # but that calculation is not the speed-limiting step
  # here (since it is vectorized);
  # rather, `apply` and `Reduce` will be speed-limiting.
}

Well, then I could do:

colCumSumAve <- function(m) {
  csum <- t(apply(X=m[, 1:(ncol(m)-1)], MARGIN=1, FUN=cumsum))
  # csum has one column fewer than m, so divide by 1:ncol(csum), not 1:ncol(m)
  t(Reduce(`/`, list(t(csum), 1:ncol(csum))))
}

Or:

colCumSumAve <- function(m) {
  m <- m[, 1:(ncol(m)-1)] # remove last column
  csum <- t(apply(X=m, MARGIN=1, FUN=cumsum))
  t(Reduce(`/`, list(t(csum), 1:ncol(m))))
}

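This is actually the more optimized solution.

Original function

Your original function only performs assignments inside the for loop and does not return anything. So I first copied your input into res, processed it with your for loop, and then returned res.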

ColCumSumsAve <- function(y){
  res <- y
  for(i in seq_len(dim(y)[2]-1)) {
    res[,i] <- cumsum(y[,i])/i
  }
  res
}

However, this gives:

> ColCumSumsAve(ma)
     [,1] [,2]     [,3] [,4] [,5]
[1,]    1  1.5 1.666667 1.75    9
[2,]    3  3.5 3.666667 3.75   10

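The problem is that the cumsum here is computed down the columns rather than along the rows: y[,i] is a single column, and cumsum in general treats a matrix as a vector (walking through it column by column).

Fixing the original function

After some fiddling, I realized that the correct solution is: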
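A quick sketch of this behavior on the sample matrix; the variable names flat, down_col and along_row are introduced here only for illustration.

```r
ma <- matrix(1:10, nrow = 2, ncol = 5)

# On a whole matrix, cumsum() walks through the values in column-major
# order and returns a plain vector (the dim attribute is dropped).
flat <- cumsum(ma)
flat          # 1 3 6 10 15 21 28 36 45 55

# On a single column, the sum runs down the rows of that column.
down_col <- cumsum(ma[, 2])
down_col      # 3 7

# What the question needs is the sum along a row.
along_row <- cumsum(ma[1, ])
along_row     # 1 4 9 16 25
```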

ColCumSumsAve <- function(y){
  res <- matrix(NA, nrow(y), ncol(y)-1) 
  # create empty matrix with the dimensions of y minus last column
  for (i in 1:(nrow(y))) {           # go through rows
    for (j in 1:(ncol(y)-1)) {       # go through columns
      res[i, j] <- sum(y[i, 1:j])/j  # for each position do this
    }
  }
  res   # return `res`ult by calling it at the end!
}

It passes the test:

> ColCumSumsAve(ma)
     [,1] [,2] [,3] [,4]
[1,]    1    2    3    4
[2,]    2    3    4    5

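Note: dim(y)[2] is ncol(y), and dim(y)[1] is nrow(y). Also, 1: is shorter than seq_len() and, I guess, even slightly faster (although seq_len() is safer when the length can be zero, since 1:0 yields c(1, 0)).

Note: The solution I gave first will be faster, since it uses apply, vectorized cumsum and Reduce. For loops in R are slower.

Late note: I am not sure that the first solution is faster. Since R 3.x, for loops seem to be faster. Reduce will be the speed-limiting function and can sometimes be very slow.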

Answer 4

You just need rowMeans:

nc <- 4
cbind(ma[,1],sapply(2:nc,function(x) rowMeans(ma[,1:x])))
     [,1] [,2] [,3] [,4]
[1,]    1    2    3    4
[2,]    2    3    4    5
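The same idea can be sketched without hard-coding nc and without special-casing the first column. This is a hedged variation, not the answer's exact code; it assumes the matrix has at least two rows, so that sapply() returns a matrix rather than a vector. The name res is introduced here for illustration.

```r
ma <- matrix(1:10, nrow = 2, ncol = 5)

# Number of columns to average over: all but the last one.
nc <- ncol(ma) - 1

# drop = FALSE keeps ma[, 1] as a one-column matrix, so rowMeans()
# also works for j = 1 and no cbind() with the first column is needed.
res <- sapply(seq_len(nc), function(j) rowMeans(ma[, seq_len(j), drop = FALSE]))
res
```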
