How to apply a function per row of a column in a data table with other rows as input?

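For each row of the column "Response", I want to check whether the 5 rows below it all have a "Response" value (i.e. no NA). If they do, I want to calculate the mean and standard deviation of those 5 rows. If any of those 5 rows is missing its "Response" value (i.e. is NA), the final output should be "NA", because I want the mean and standard deviation to be calculated for n = 5 points/values.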

An example of Input.data is shown below:

 Response     
        NA               
         1                 
         2                 
         3                
        NA        
         1         
         1         
         2         
         3         
         4         
         5    

This is the code I tried, which does not give the correct result:

Input.data$count.lag <- rollapplyr(Input.data[,c("Response")],list(-(4:0)),length, fill=NA)

Input.data$stdev <- ifelse(Input.data$count.lag <5, "NA", 
                            rollapplyr(Input.data[,c("Response")],list(-(4:0)),sd,fill=NA))
Input.data$mean <- ifelse(Input.data$count.lag <5, "NA", 
                           rollapplyr(Input.data[,c("Response")],list(-(4:0)),mean,fill=NA))

It gives the following, which is not what I want:

 Response count.lag     stdev mean
       NA        NA        NA   NA
        1        NA        NA   NA
        2        NA        NA   NA
        3        NA        NA   NA
       NA         5        NA   NA
        1         5        NA   NA
        1         5        NA   NA
        2         5        NA   NA
        3         5        NA   NA
        4         5  1.303840  2.2
        5         5  1.581139  3.0

The output should look like this:

Response count.lag      stdev  mean
     NA         4        NA    NA
      1         4        NA    NA
      2         4        NA    NA
      3         4        NA    NA
     NA         5   1.303840   2.2
      1         5   1.581139   3.0
      1         5   1.581139   4.0
      2         5   1.581139   5.0
      3         5   1.581139   6.0
      4         5   1.581139   7.0
      5         5   1.581139   8.0

Can someone point out where the error is and/or suggest a workable alternative solution? Thanks!

A possible approach:

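library(data.table)

# For each row n, take the rows after it (at most 5, fewer near the end)
# and return the count of non-NA values, the standard deviation and the mean.
Input[, c("count.lag","stdev","mean") := 
    transpose(lapply(1L:.N, function(n) {
        x <- Response[(n+1L):min(n+5L, .N)]
        c(sum(!is.na(x)), sd(x), mean(x))
    }))]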

Output:

    Response count.lag     stdev mean
 1:       NA         4        NA   NA
 2:        1         4        NA   NA
 3:        2         4        NA   NA
 4:        3         4        NA   NA
 5:       NA         5 1.3038405  2.2
 6:        1         5 1.5811388  3.0
 7:        1         5 1.5811388  4.0
 8:        2         5 1.5811388  5.0
 9:        3         5 1.5811388  6.0
10:        4         5 1.5811388  7.0
11:        5         5 1.5811388  8.0
12:        6         4 1.2909944  8.5
13:        7         3 1.0000000  9.0
14:        8         2 0.7071068  9.5
15:        9         1        NA 10.0
16:       10         1        NA   NA
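
One detail in the output above: row 16 reports a count.lag of 1 even though no rows follow it. This appears to come from how `:` behaves when the start exceeds the end. A minimal base R sketch of that last-row case follows; the names `idx_safe` and `safe_count` are hypothetical and only illustrate one possible guard, not the answer's code.

```r
# Same Response values as the answer's sample data.
Response <- c(NA, 1, 2, 3, NA, 1, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
N <- length(Response)

n <- N                                   # the last row
idx <- (n + 1L):min(n + 5L, N)           # 17:16 counts down, giving c(17, 16)
window <- Response[idx]                  # c(NA, 10): index 17 is out of range
last_count <- sum(!is.na(window))        # 1, not 0

# Hypothetical guard: use an empty window when no rows follow row n.
idx_safe <- if (n < N) (n + 1L):min(n + 5L, N) else integer(0)
safe_count <- sum(!is.na(Response[idx_safe]))   # 0
```

Whether the last row should show 0 or 1 here depends on how the end values are meant to be handled.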

Data:

Input <- fread("Response     
NA               
1                 
2                 
3                
NA        
1         
1         
2         
3         
4         
5
6
7
8
9
10")

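Edit: alternatively, use shift as suggested by MichaelChirico. The end values differ from the approach above; which is right depends on how the OP wants to handle them.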

# requires data.table version >= 1.12.0 for negative shifts (otherwise use type='lead' with positive integers)
Input[, c("count.lag", "stdev", "mean") := 
    .SD[, shift(Response, -1L:-5L)][, 
        .(apply(.SD, 1L, function(x) sum(!is.na(x))), 
            apply(.SD, 1L, sd), 
            apply(.SD, 1L, mean))]
]

Output:

    Response count.lag    stdev mean
 1:       NA         4       NA   NA
 2:        1         4       NA   NA
 3:        2         4       NA   NA
 4:        3         4       NA   NA
 5:       NA         5 1.303840  2.2
 6:        1         5 1.581139  3.0
 7:        1         5 1.581139  4.0
 8:        2         5 1.581139  5.0
 9:        3         5 1.581139  6.0
10:        4         5 1.581139  7.0
11:        5         5 1.581139  8.0
12:        6         4       NA   NA
13:        7         3       NA   NA
14:        8         2       NA   NA
15:        9         1       NA   NA
16:       10         0       NA   NA
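
For comparison, the same strict rule (NA unless all 5 following values are present) can be sketched in base R without any packages. This is only an illustration under stated assumptions: `next_k_stats` is a hypothetical helper name, not part of data.table or of the answers above, and it relies on out-of-range vector indices returning NA.

```r
# Hypothetical helper: statistics over the k values that follow each element.
next_k_stats <- function(x, k = 5L) {
  stats <- vapply(seq_along(x), function(i) {
    w <- x[i + seq_len(k)]            # the k values after position i; NA past the end
    c(sum(!is.na(w)), sd(w), mean(w)) # sd() and mean() return NA if any value is NA
  }, numeric(3))
  out <- as.data.frame(t(stats))      # one row per input element
  names(out) <- c("count.lag", "stdev", "mean")
  out
}

Response <- c(NA, 1, 2, 3, NA, 1, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
res <- cbind(Response, next_k_stats(Response))
print(res)
```

Because every window is padded with NA up to length 5, rows near the end get NA for stdev and mean, which matches the shift output above rather than the first approach.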