使用已知信息按顺序执行计算并在 R 中按组填充未来值

Use known information to sequentially perform calculations and populate future values by group in R

我有这样的数据: intial<-tibble(start_date=rep(seq.Date(as.Date("2021-06-01"),as.Date("2021-10-01"),by="months"),4),end_date=rep(seq.Date(as.Date("2021-07-01"),as.Date("2021-11-01"),by="months"),4),id=rep(c(rep(1,5),rep(2,5)),2),group=c(rep("a",10),rep("b",10)),increase=c(c(4:8),c(5:9),c(6:10),c(7:11)),decrease=c(c(1:5),c(2:6),c(3:7),c(4:8)),start_count=c(c(10,13,16,19,NA),c(15,18,21,24,NA),c(20,23,26,29,NA),c(20,23,26,29,NA)),end_count=c(c(13,16,19,NA,NA),c(18,21,24,NA,NA),c(23,26,29,NA,NA),c(23,26,29,NA,NA))) print(initial)

  start_date end_date      id group increase decrease start_count end_count
   <date>     <date>     <dbl> <chr>    <int>    <int>       <dbl>     <dbl>
 1 2021-06-01 2021-07-01     1 a            4        1          10        13
 2 2021-07-01 2021-08-01     1 a            5        2          13        16
 3 2021-08-01 2021-09-01     1 a            6        3          16        19
 4 2021-09-01 2021-10-01     1 a            7        4          19        NA
 5 2021-10-01 2021-11-01     1 a            8        5          NA        NA
 6 2021-06-01 2021-07-01     2 a            5        2          15        18
 7 2021-07-01 2021-08-01     2 a            6        3          18        21
 8 2021-08-01 2021-09-01     2 a            7        4          21        24
 9 2021-09-01 2021-10-01     2 a            8        5          24        NA
10 2021-10-01 2021-11-01     2 a            9        6          NA        NA
11 2021-06-01 2021-07-01     1 b            6        3          20        23
12 2021-07-01 2021-08-01     1 b            7        4          23        26
13 2021-08-01 2021-09-01     1 b            8        5          26        29
14 2021-09-01 2021-10-01     1 b            9        6          29        NA
15 2021-10-01 2021-11-01     1 b           10        7          NA        NA
16 2021-06-01 2021-07-01     2 b            7        4          20        23
17 2021-07-01 2021-08-01     2 b            8        5          23        26
18 2021-08-01 2021-09-01     2 b            9        6          26        29
19 2021-09-01 2021-10-01     2 b           10        7          29        NA
20 2021-10-01 2021-11-01     2 b           11        8          NA        NA

其中 idgroup 的每个唯一组合都包含一个日期索引。我需要使用 increasedecrease 字段中的值来计算并在每个 id 和 [=15] 中的每个日期按顺序填充 start_countend_count =] 组合.

end_count 的计算公式为:start_count + increase - decrease,而 start_count 是前一个日期的 end_count.

基本上,我想得到如下所示的内容 example: final<-tibble(start_date=rep(seq.Date(as.Date("2021-06-01"),as.Date("2021-10-01"),by="months"),4),end_date=rep(seq.Date(as.Date("2021-07-01"),as.Date("2021-11-01"),by="months"),4),id=rep(c(rep(1,5),rep(2,5)),2),group=c(rep("a",10),rep("b",10)),increase=c(c(4:8),c(5:9),c(6:10),c(7:11)),decrease=c(c(1:5),c(2:6),c(3:7),c(4:8)),start_count=c(c(10,13,16,19,21),c(15,18,21,24,27),c(20,23,26,29,32),c(20,23,26,29,31)),end_count=c(c(13,16,19,21,24),c(18,21,24,27,30),c(23,26,29,32,35),c(23,26,29,31,34)))

print(final)
 start_date end_date      id group increase decrease start_count end_count
   <date>     <date>     <dbl> <chr>    <int>    <int>       <dbl>     <dbl>
 1 2021-06-01 2021-07-01     1 a            4        1          10        13
 2 2021-07-01 2021-08-01     1 a            5        2          13        16
 3 2021-08-01 2021-09-01     1 a            6        3          16        19
 4 2021-09-01 2021-10-01     1 a            7        4          19        21
 5 2021-10-01 2021-11-01     1 a            8        5          21        24
 6 2021-06-01 2021-07-01     2 a            5        2          15        18
 7 2021-07-01 2021-08-01     2 a            6        3          18        21
 8 2021-08-01 2021-09-01     2 a            7        4          21        24
 9 2021-09-01 2021-10-01     2 a            8        5          24        27
10 2021-10-01 2021-11-01     2 a            9        6          27        30
11 2021-06-01 2021-07-01     1 b            6        3          20        23
12 2021-07-01 2021-08-01     1 b            7        4          23        26
13 2021-08-01 2021-09-01     1 b            8        5          26        29
14 2021-09-01 2021-10-01     1 b            9        6          29        32
15 2021-10-01 2021-11-01     1 b           10        7          32        35
16 2021-06-01 2021-07-01     2 b            7        4          20        23
17 2021-07-01 2021-08-01     2 b            8        5          23        26
18 2021-08-01 2021-09-01     2 b            9        6          26        29
19 2021-09-01 2021-10-01     2 b           10        7          29        31
20 2021-10-01 2021-11-01     2 b           11        8          31        34

我们可以使用

library(dplyr)
library(zoo)
intial %>% 
   group_by(id, group) %>% 
   mutate(start_count_new = coalesce(start_count, 
       zoo::na.locf0(start_count) + increase - decrease), 
      end_count_new = coalesce(lead(start_count_new), 
        lead(start_count_new, default = last(start_count_new)) + increase - decrease))