R: update values within a grouped df using information from the updated previous value

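I want to conditionally mutate variables (var1, var2) within groups (id) at different timepoints (timepoint) according to this function, using the previously updated/mutated values: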

``` r
change_function <- function(value,pastvalue,timepoint){
  # timepoint 1 keeps its value; later timepoints shift the past value by -1, 0, or +1
  if(timepoint==1){valuenew=value} else
    if(value==0){valuenew=pastvalue-1} else
    if(value==1){valuenew=pastvalue} else
    if(value==2){valuenew=pastvalue+1}
  return(valuenew)
}
```

`pastvalue` is the MUTATED/UPDATED value at timepoint - 1, for timepoints 2:4.
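To make the intended recursion concrete, here is a minimal base R sketch for a single group and a single variable. The helper name `update_sequentially` is hypothetical (it is not part of the question), and the sketch assumes that values at timepoints 2 and later are always 0, 1, or 2.

``` r
# Hypothetical helper: walks through one group's values in time order.
update_sequentially <- function(value) {
  out <- value                        # timepoint 1 keeps its original value
  for (t in seq_along(value)[-1]) {   # timepoints 2, 3, ...
    past <- out[t - 1]                # the already UPDATED previous value
    out[t] <- switch(as.character(value[t]),
                     "0" = past - 1,
                     "1" = past,
                     "2" = past + 1,
                     NA_real_)        # fallback when none of the cases match
  }
  out
}

update_sequentially(c(1, 0, 1, 2))    # var1 for id 1
# [1] 1 0 0 1
```

The key point is that each step reads `out[t - 1]`, the value produced by the previous step, and not the raw input at `t - 1`.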

Here is example data and the desired output:

``` r
#example data
df <- data.frame(id=c(1,1,1,1,2,2,2,2),timepoint=c(1,2,3,4,1,2,3,4),var1=c(1,0,1,2,2,2,1,0),var2=c(2,0,1,2,3,2,1,0))
df
#>   id timepoint var1 var2
#> 1  1         1    1    2
#> 2  1         2    0    0
#> 3  1         3    1    1
#> 4  1         4    2    2
#> 5  2         1    2    3
#> 6  2         2    2    2
#> 7  2         3    1    1
#> 8  2         4    0    0

#desired output
output <- data.frame(id=c(1,1,1,1,2,2,2,2),timepoint=c(1,2,3,4,1,2,3,4),var1=c(1,0,0,1,2,3,3,2),var2=c(2,1,1,2,3,4,4,3))
output
#>   id timepoint var1 var2
#> 1  1         1    1    2
#> 2  1         2    0    1
#> 3  1         3    0    1
#> 4  1         4    1    2
#> 5  2         1    2    3
#> 6  2         2    3    4
#> 7  2         3    3    4
#> 8  2         4    2    3
```

<sup>Created on 2020-11-23 by the [reprex package](https://reprex.tidyverse.org) (v0.3.0)</sup>

My approach: using my function with dplyr::mutate_at

``` r
library(dplyr)
df %>% 
  group_by(id) %>% 
  mutate_at(.vars=vars(var1,var2),
            .funs=funs(.=change_function(.,dplyr::lag(.),timepoint)))
```

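However, this does not work, because if/else is not vectorized.

Update 1:

Using nested ifelse calls does not give the desired output either, because it does not use the updated pastvalue: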

``` r
change_function <- function(value,pastvalue,timepoint){
  ifelse((timepoint==1),value,
         ifelse((value==0),pastvalue-1,
                ifelse((value==1),pastvalue,
                       ifelse((value==2),pastvalue+1,NA))))
}

library(dplyr)
df %>% 
  group_by(id) %>% 
  mutate_at(.vars=vars(var1,var2),
            .funs=funs(.=change_function(.,dplyr::lag(.),timepoint)))
```

```
     id timepoint  var1  var2 var1_. var2_.
  <dbl>     <dbl> <dbl> <dbl>  <dbl>  <dbl>
1     1         1     1     2      1      2
2     1         2     0     0      0      1
3     1         3     1     1      0      0
4     1         4     2     2      2      2
5     2         1     2     3      2      3
6     2         2     2     2      3      4
7     2         3     1     1      2      2
8     2         4     0     0      0      0
```
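The mismatch appears because `dplyr::lag(.)` is evaluated once on the original column, so each row sees the raw previous input and not the previous result. A small base R sketch for var1 of id 1 illustrates this; the vector names are illustrative only, and `c(NA, head(x, -1))` stands in for `dplyr::lag`.

``` r
value   <- c(1, 0, 1, 2)   # var1 for id 1 (input)
desired <- c(1, 0, 0, 1)   # var1 for id 1 (desired output)

lag_of_input  <- c(NA, head(value, -1))    # what the ifelse version receives
lag_of_result <- c(NA, head(desired, -1))  # what the recursion actually needs

rbind(lag_of_input, lag_of_result)
#               [,1] [,2] [,3] [,4]
# lag_of_input    NA    1    0    1
# lag_of_result   NA    1    0    0
```

The two rows differ at timepoint 4, which is where the ifelse version returns 2 for var1 while the desired value is 1.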

Update 2:

Per the comments, purrr::accumulate can be used.

Thanks to akrun, I arrived at the correct function:

``` r
library(dplyr)
library(purrr)

# write a vectorized function
change_function <- function(prev, new) {
  # step is -1 for 0, 0 for 1, and +1 otherwise
  change=if_else(new==0,-1,
          if_else(new==1,0,1))
  if_else(is.na(new), new, prev + change)
}

# use purrr::accumulate
df %>%
  group_by(id) %>% 
  mutate_at(.vars=vars(var1,var2),
            .funs=funs(accumulate(.,change_function)))
```

```
# A tibble: 8 x 4
# Groups:   id [2]
     id timepoint  var1  var2
  <dbl>     <dbl> <dbl> <dbl>
1     1         1     1     2
2     1         2     0     1
3     1         3     0     1
4     1         4     1     2
5     2         1     2     3
6     2         2     3     4
7     2         3     3     4
8     2         4     2     3
```
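In current dplyr releases `mutate_at()` is superseded and `funs()` is deprecated, so the same idea can also be sketched with `across()`. This is only a sketch, assuming dplyr >= 1.0.0 and purrr are installed. The name `step_value` is hypothetical, and `case_when()` is my substitution for the nested `if_else()` so that inputs other than 0, 1, or 2 yield `NA` instead of counting as +1.

``` r
library(dplyr)
library(purrr)

df <- data.frame(
  id        = c(1, 1, 1, 1, 2, 2, 2, 2),
  timepoint = c(1, 2, 3, 4, 1, 2, 3, 4),
  var1      = c(1, 0, 1, 2, 2, 2, 1, 0),
  var2      = c(2, 0, 1, 2, 3, 2, 1, 0)
)

# Hypothetical helper: add -1, 0, or +1 to the previously updated value.
step_value <- function(prev, new) {
  prev + case_when(new == 0 ~ -1, new == 1 ~ 0, new == 2 ~ 1)
}

result <- df %>%
  arrange(id, timepoint) %>%   # accumulate() depends on row order
  group_by(id) %>%
  mutate(across(c(var1, var2), ~ accumulate(.x, step_value))) %>%
  ungroup()

result
```

`accumulate()` uses the first element of each group as the starting value, so timepoint 1 is left unchanged without a separate check on `timepoint`. This relies on the rows being sorted by timepoint within each id, which is why the sketch sorts explicitly.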