Lagged difference by group with dplyr

I have the following dataset:

   Amount1 Amount2       Date Group
1       NA     350 2019-01-01     A
2       NA     335 2019-01-01     B
3       NA     340 2019-01-01     C
4      300     365 2019-01-06     A
5      310     325 2019-01-06     B
6      285     355 2019-01-06     C
7      310     335 2019-01-11     A
8      305     355 2019-01-11     B
9      335     360 2019-01-11     C
10     280      NA 2019-01-16     A
11     290      NA 2019-01-16     B
12     240      NA 2019-01-16     C

You can recreate it from this dput output:

> dput(test) 

structure(list(Amount1 = c(NA, NA, NA, 300, 310, 285, 310, 305, 335, 280, 290, 240), 
Amount2 = c(350, 335, 340, 365, 325, 355,  335, 355, 360, NA, NA, NA), 
Date = structure(c(1L, 1L, 1L, 2L,  2L, 2L, 3L, 3L, 3L, 4L, 4L, 4L), .Label = c("2019-01-01", "2019-01-06",  "2019-01-11", "2019-01-16"), class = "factor"), 
Group = structure(c(1L,  2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L), .Label = c("A",  "B", "C"), class = "factor")), 
row.names = c(NA, -12L), class = "data.frame")

Within each group, I want to subtract Amount1 from the previous row's Amount2.

For example, for group A I expect:

2019-01-01 -> NA
2019-01-06 -> 350 - 300 = 50
2019-01-11 -> 365 - 310 = 55
2019-01-16 -> 335 - 280 = 55

How can I do this? I tried mutate_at, but without success...

# Does not work...
test %>%
  group_by(Group, Amount2) %>%
  mutate_at(c("Amount1"), funs(AmountDiff = . - lag(Amount2, 1)))

How about this?

test %>% 
  group_by(Group) %>% 
  mutate(Amount_diff = lag(Amount2) - Amount1)

This gives:

# A tibble: 12 x 5
# Groups:   Group [3]
   Amount1 Amount2 Date       Group Amount_diff
     <dbl>   <dbl> <fct>      <fct>       <dbl>
 1      NA     350 2019-01-01 A              NA
 2      NA     335 2019-01-01 B              NA
 3      NA     340 2019-01-01 C              NA
 4     300     365 2019-01-06 A              50
 5     310     325 2019-01-06 B              25
 6     285     355 2019-01-06 C              55
 7     310     335 2019-01-11 A              55
 8     305     355 2019-01-11 B              20
 9     335     360 2019-01-11 C              20
10     280      NA 2019-01-16 A              55
11     290      NA 2019-01-16 B              65
12     240      NA 2019-01-16 C             120
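One caveat: lag() works on row order, so the result above is only right because the rows within each group are already sorted by date. If that is not guaranteed, a minimal sketch like the one below may be safer. It assumes that converting Date to a real Date and passing it to the order_by argument of dplyr's lag() is acceptable for your data. The shuffled data frame and the set.seed() call are only there for illustration, and Date is built as a character column here rather than a factor.

```r
library(dplyr)

# Same values as the dput() output in the question (Date as character here)
test <- data.frame(
  Amount1 = c(NA, NA, NA, 300, 310, 285, 310, 305, 335, 280, 290, 240),
  Amount2 = c(350, 335, 340, 365, 325, 355, 335, 355, 360, NA, NA, NA),
  Date    = rep(c("2019-01-01", "2019-01-06", "2019-01-11", "2019-01-16"), each = 3),
  Group   = rep(c("A", "B", "C"), times = 4)
)

# Illustration only: shuffle the rows so they are no longer sorted by date
set.seed(1)
shuffled <- test[sample(nrow(test)), ]

result <- shuffled %>%
  mutate(Date = as.Date(Date)) %>%   # compare real dates, not labels
  group_by(Group) %>%
  # order_by makes lag() take the previous value by Date, not by row position
  mutate(Amount_diff = lag(Amount2, order_by = Date) - Amount1) %>%
  ungroup() %>%
  arrange(Date, Group)               # restore the original display order
```

With this, result$Amount_diff should match the Amount_diff column shown above even though the input rows were shuffled.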

For group A only:

test %>% 
  group_by(Group) %>% 
  mutate(Amount_diff = lag(Amount2) - Amount1) %>% 
  filter(Group == "A")

gives:

# A tibble: 4 x 5
# Groups:   Group [1]
  Amount1 Amount2 Date       Group Amount_diff
    <dbl>   <dbl> <fct>      <fct>       <dbl>
1      NA     350 2019-01-01 A              NA
2     300     365 2019-01-06 A              50
3     310     335 2019-01-11 A              55
4     280      NA 2019-01-16 A              55
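As for why the attempt in the question returned nothing useful, my reading is that group_by(Group, Amount2) is the main culprit: every (Group, Amount2) combination in this data occurs exactly once, so each group has a single row and lag() has no previous row to return. The operands are also in the opposite order (Amount1 minus the lagged Amount2), and funs() has been deprecated since dplyr 0.8.0. The sketch below reproduces the grouping problem with a plain mutate() instead of mutate_at() and funs(); the names attempt and group_sizes are just illustrative.

```r
library(dplyr)

# Same values as the dput() output in the question (Date as character here)
test <- data.frame(
  Amount1 = c(NA, NA, NA, 300, 310, 285, 310, 305, 335, 280, 290, 240),
  Amount2 = c(350, 335, 340, 365, 325, 355, 335, 355, 360, NA, NA, NA),
  Date    = rep(c("2019-01-01", "2019-01-06", "2019-01-11", "2019-01-16"), each = 3),
  Group   = rep(c("A", "B", "C"), times = 4)
)

# Same grouping as the original attempt, written with a plain mutate()
attempt <- test %>%
  group_by(Group, Amount2) %>%
  mutate(AmountDiff = Amount1 - lag(Amount2)) %>%
  ungroup()

# How many rows fall into each (Group, Amount2) group?
group_sizes <- test %>% count(Group, Amount2)

all(group_sizes$n == 1)         # every group holds a single row
all(is.na(attempt$AmountDiff))  # so every lagged value is NA
```

Grouping by Group alone, as in the answer, leaves four rows per group, which is what lets lag() reach the previous date.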