Finding the cumulative sum and then averaging the values in R
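I want to compute the cumulative sum of the first (n-1) columns (for a matrix with n columns) and then average those values. I created a sample matrix for this task. I have the following matrix: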
ma = matrix(c(1:10), nrow = 2, ncol = 5)
ma
[,1] [,2] [,3] [,4] [,5]
[1,] 1 3 5 7 9
[2,] 2 4 6 8 10
I want to obtain the following:
ans = matrix(c(1,2,2,3,3,4,4,5), nrow = 2, ncol = 4)
ans
[,1] [,2] [,3] [,4]
[1,] 1 2 3 4
[2,] 2 3 4 5
Here is my R function.
ColCumSumsAve <- function(y){
for(i in seq_len(dim(y)[2]-1)) {
y[,i] <- cumsum(y[,i])/i
}
}
ColCumSumsAve(ma)
However, when I run the function above, it produces no output. Is there an error in the code?
Thanks.
Here is how I did it:
> t(apply(ma, 1, function(x) cumsum(x) / 1:length(x)))[,-NCOL(ma)]
[,1] [,2] [,3] [,4]
[1,] 1 2 3 4
[2,] 2 3 4 5
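This applies the cumsum function row-wise to the matrix ma and then divides by the correct lengths to get the averages (cumsum(x) and 1:length(x) have the same length). The result is then simply transposed with t, and the last column is removed with [,-NCOL(ma)].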
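The one-liner above can be unpacked step by step. The following is only a sketch of the same idea; the intermediate names (x, running_mean, step2, result) are made up for illustration, and seq_along(x) is used in place of 1:length(x).

```r
ma <- matrix(1:10, nrow = 2, ncol = 5)

# Step 1: running mean of a single row, which is a plain vector
x <- ma[1, ]                                # 1 3 5 7 9
running_mean <- cumsum(x) / seq_along(x)    # 1 2 3 4 5

# Step 2: apply() stores the result for each row as a COLUMN,
# so for a 2 x 5 input the result is 5 x 2
step2 <- apply(ma, 1, function(x) cumsum(x) / seq_along(x))

# Step 3: transpose back to 2 x 5 and drop the last column
result <- t(step2)[, -ncol(ma), drop = FALSE]
result
```

The drop = FALSE argument is an addition here: it keeps the result a matrix even when only one column would remain.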
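The reason your function produces no output is that it does not return anything. You should end the function with return(y) or y, as Marius suggested. In any case, your function does not seem to give the correct result either.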
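# row-wise cumulative sums, then drop the last column
# (ncol(ma) is used here because k does not exist yet on the right-hand side)
k <- t(apply(ma, 1, cumsum))[, -ncol(ma)]
for (i in 1:ncol(k)) {
  k[, i] <- k[, i] / i  # divide column i by i to get the running mean
}
k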
This should work.
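The missing return value discussed above can be seen in a small experiment. This is a sketch only; the function names no_return and with_return are hypothetical and do not come from the thread.

```r
# Same body as the question's function: the last expression is the for loop
no_return <- function(y) {
  for (i in seq_len(ncol(y) - 1)) {
    y[, i] <- cumsum(y[, i]) / i
  }
  # a for loop evaluates to NULL (invisibly), so NULL is the function's value
}

# Identical, except that the modified copy of y is the last expression
with_return <- function(y) {
  for (i in seq_len(ncol(y) - 1)) {
    y[, i] <- cumsum(y[, i]) / i
  }
  y
}

ma <- matrix(1:10, nrow = 2, ncol = 5)
is.null(no_return(ma))   # TRUE: nothing useful comes back
dim(with_return(ma))     # 2 5: a matrix comes back
```

Note that with_return only fixes the missing output. Its values are still computed down each column rather than along each row, so it does not yet produce the desired result.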
There are several mistakes.
Solution
Here is a working approach that I have tested:
colCumSumAve <- function(m) {
csum <- t(apply(X=m, MARGIN=1, FUN=cumsum))
res <- t(Reduce(`/`, list(t(csum), 1:ncol(m))))
res[, 1:(ncol(m)-1)]
}
Test:
> colCumSumAve(ma)
[,1] [,2] [,3] [,4]
[1,] 1 2 3 4
[2,] 2 3 4 5
This is correct.
Explanation:
colCumSumAve <- function(m) {
  csum <- t(apply(X=m, MARGIN=1, FUN=cumsum)) # calculate row-wise cumsum
  res <- t(Reduce(`/`, list(t(csum), 1:ncol(m))))
  # This is the trickiest part.
  # Because `csum` is a matrix, it is treated like a vector (column by
  # column) when `Reduce`-ing with `/` against the vector `1:ncol(m)`.
  # To get quasi-row-wise treatment, I change the orientation
  # of the matrix with `t()`.
  # As a consequence, the output is in this transposed orientation,
  # so I apply `t()` to the entire result at the end to get back
  # the original input matrix orientation.
  # `Reduce` with `/` over the list of `t(csum)` and `1:ncol(m)`
  # has the effect of dividing the `csum` values by their
  # corresponding column position.
  res[, 1:(ncol(m)-1)] # removes last column for the answer.
  # This could of course be done right at the beginning,
  # saving the calculation of the values in the last column,
  # but that calculation is not the speed-limiting step
  # (since it is vectorized);
  # rather, `apply` and `Reduce` will be speed-limiting.
}
OK, then I could do:
colCumSumAve <- function(m) {
  csum <- t(apply(X=m[, 1:(ncol(m)-1)], MARGIN=1, FUN=cumsum))
  # divide by 1:ncol(csum), not 1:ncol(m): csum has one column fewer than m
  t(Reduce(`/`, list(t(csum), 1:ncol(csum))))
}
Or:
colCumSumAve <- function(m) {
m <- m[, 1:(ncol(m)-1)] # remove last column
csum <- t(apply(X=m, MARGIN=1, FUN=cumsum))
t(Reduce(`/`, list(t(csum), 1:ncol(m))))
}
This is actually the more optimized solution.
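As an alternative to the Reduce step, the division of each column by its position can be written with sweep(). This is a swapped-in technique, not the code from the answer above; the function name col_cum_mean is hypothetical, and the sketch assumes the input is a matrix with at least two columns.

```r
col_cum_mean <- function(m) {
  # assumption: m is a matrix with at least two columns
  stopifnot(is.matrix(m), ncol(m) >= 2)
  # row-wise cumulative sums, same shape as m
  csum <- t(apply(m, 1, cumsum))
  # sweep() over margin 2 divides column j by the j-th element of the vector
  means <- sweep(csum, 2, seq_len(ncol(m)), "/")
  # drop the last column; drop = FALSE keeps the result a matrix
  means[, -ncol(m), drop = FALSE]
}

ma <- matrix(1:10, nrow = 2, ncol = 5)
col_cum_mean(ma)
```

Because sweep() names the margin explicitly, there is no need to transpose twice just to make vector recycling line up with the columns.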
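The original function
Your original function only performs assignments inside the for loop and does not return anything.
So I first copy your input into res, process it with your for loop, and then return res.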
ColCumSumsAve <- function(y){
res <- y
for(i in seq_len(dim(y)[2]-1)) {
res[,i] <- cumsum(y[,i])/i
}
res
}
However, this gives:
> ColCumSumsAve(ma)
[,1] [,2] [,3] [,4] [,5]
[1,] 1 1.5 1.666667 1.75 9
[2,] 3 3.5 3.666667 3.75 10
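The problem is that cumsum here is computed column-wise rather than row-wise: y[,i] is a single column, and cumsum treats a matrix as a vector (walking through the matrix column by column).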
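The column-wise behaviour described above is easy to check directly on the sample matrix. The variable names in this small sketch (flat, one_col, one_row) are made up for illustration.

```r
ma <- matrix(1:10, nrow = 2, ncol = 5)

# cumsum() on a whole matrix flattens it column by column and returns a vector
flat <- cumsum(ma)            # 1 3 6 10 15 21 28 36 45 55

# cumsum() on one column runs down that column
one_col <- cumsum(ma[, 2])    # 3 7

# cumsum() on one row is what the question actually needs
one_row <- cumsum(ma[1, ])    # 1 4 9 16 25
```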
Fixing the original function
After some fiddling, I realized that the correct solution is:
ColCumSumsAve <- function(y){
res <- matrix(NA, nrow(y), ncol(y)-1)
# create empty matrix with the dimensions of y minus last column
for (i in 1:(nrow(y))) { # go through rows
for (j in 1:(ncol(y)-1)) { # go through columns
res[i, j] <- sum(y[i, 1:j])/j # for each position do this
}
}
res # return `res`ult by calling it at the end!
}
It passes the test:
> ColCumSumsAve(ma)
[,1] [,2] [,3] [,4]
[1,] 1 2 3 4
[2,] 2 3 4 5
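Note: dim(y)[2] is ncol(y), and dim(y)[1] is nrow(y); and instead of seq_len(), 1: is shorter and, I guess, even slightly faster.
Note: The solution I gave first will be faster, since it uses apply, vectorized cumsum, and Reduce. for loops in R are slower.
Late note: I am not so sure that the first solution is faster. Since R-3.x, for loops seem to be faster. Reduce will be the speed-limiting function and can sometimes be very slow.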
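Rather than guessing which variant is faster, the two approaches can be checked for agreement and timed on a larger matrix. The following is a rough sketch only: the names big, apply_version, and loop_version and the matrix size are arbitrary choices for illustration, and the timings depend on the machine and R version, so none are shown.

```r
set.seed(1)
big <- matrix(runif(200 * 50), nrow = 200, ncol = 50)

# apply-based variant: running mean per row, then drop the last column
apply_version <- function(m) {
  t(apply(m, 1, function(x) cumsum(x) / seq_along(x)))[, -ncol(m), drop = FALSE]
}

# nested-loop variant, following the corrected function above
loop_version <- function(y) {
  res <- matrix(NA_real_, nrow(y), ncol(y) - 1)
  for (i in seq_len(nrow(y))) {
    for (j in seq_len(ncol(y) - 1)) {
      res[i, j] <- sum(y[i, 1:j]) / j
    }
  }
  res
}

# both variants should agree up to floating-point tolerance
same <- isTRUE(all.equal(apply_version(big), loop_version(big)))

# rough timings over 20 repetitions each
system.time(for (k in 1:20) apply_version(big))
system.time(for (k in 1:20) loop_version(big))
```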
All you need is rowMeans:
nc <- 4
cbind(ma[,1],sapply(2:nc,function(x) rowMeans(ma[,1:x])))
[,1] [,2] [,3] [,4]
[1,] 1 2 3 4
[2,] 2 3 4 5
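The rowMeans idea can be written without hard-coding nc or special-casing the first column. This is a sketch under the assumption that the matrix has at least two rows and two columns; the names nc and res are only illustrative.

```r
ma <- matrix(1:10, nrow = 2, ncol = 5)

# number of columns to keep: all but the last
nc <- ncol(ma) - 1

# for each j, average the first j columns of every row;
# drop = FALSE keeps a one-column matrix for j = 1 so rowMeans() still works
res <- sapply(seq_len(nc), function(j) rowMeans(ma[, seq_len(j), drop = FALSE]))
res
```

Each call to rowMeans() re-reads the first j columns, so this does more arithmetic than a cumulative sum, but it states the intent (the mean of the first j columns) very directly.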