R:通过多个组对两个变量的滞后求和来填充 NA
R: Fill in NAs by multiple groups summing up lagges of two variables
我尝试根据 N 的滞后和附加列的滞后来填充 N 列的 NA 值。问题是它填充了第二行,但不是全部。
这是我的代码。
```{r}
library(tidyverse)
library(zoo)
# toy data ----
df <- data.frame('Group.1' = c("a", "a", "a"),
'Group.2' = c("A", "A", "A"),
'monthyr' = as.yearmon(c("2018-01-01", "2018-02-01", "2018-03-01")),
'x' = c(5, 7, 8),
'y' = c(10, 18, 9),
'N' = c(100, NA, NA)
)
df$net <- df$x - df$y
df
# what i get ----
df_attempt <- df %>%
group_by(Group.1, Group.2) %>%
arrange(Group.1, Group.2, monthyr) %>%
mutate(N = ifelse(is.na(N), lag(N) + lag(net), N))
# what i want to get ----
df_expected <- data.frame('Group.1' = c("a", "a", "a"),
'Group.2' = c("A", "A", "A"),
'monthyr' = as.yearmon(c("2018-01-01", "2018-02-01", "2018-03-01")),
'x' = c(5, 7, 8),
'y' = c(10, 18, 9),
'N' = c(100, 95, 84)
)
```
一种实现您尝试实现的“递归”填充的方法可能如下所示:
library(tidyverse)
library(zoo)
df %>%
group_by(Group.1, Group.2) %>%
arrange(monthyr) %>%
fill(N) %>%
mutate(N = N + cumsum(lag(net, default = 0)))
#> # A tibble: 3 × 7
#> # Groups: Group.1, Group.2 [1]
#> Group.1 Group.2 monthyr x y N net
#> <chr> <chr> <yearmon> <dbl> <dbl> <dbl> <dbl>
#> 1 a A Jan 2018 5 10 100 -5
#> 2 a A Feb 2018 7 18 95 -11
#> 3 a A Mär 2018 8 9 84 -1
我尝试根据 N 的滞后和附加列的滞后来填充 N 列的 NA 值。问题是它填充了第二行,但不是全部。
这是我的代码。
```{r}
library(tidyverse)
library(zoo)
# toy data ----
df <- data.frame('Group.1' = c("a", "a", "a"),
'Group.2' = c("A", "A", "A"),
'monthyr' = as.yearmon(c("2018-01-01", "2018-02-01", "2018-03-01")),
'x' = c(5, 7, 8),
'y' = c(10, 18, 9),
'N' = c(100, NA, NA)
)
df$net <- df$x - df$y
df
# what i get ----
df_attempt <- df %>%
group_by(Group.1, Group.2) %>%
arrange(Group.1, Group.2, monthyr) %>%
mutate(N = ifelse(is.na(N), lag(N) + lag(net), N))
# what i want to get ----
df_expected <- data.frame('Group.1' = c("a", "a", "a"),
'Group.2' = c("A", "A", "A"),
'monthyr' = as.yearmon(c("2018-01-01", "2018-02-01", "2018-03-01")),
'x' = c(5, 7, 8),
'y' = c(10, 18, 9),
'N' = c(100, 95, 84)
)
```
一种实现您尝试实现的“递归”填充的方法可能如下所示:
library(tidyverse)
library(zoo)
df %>%
group_by(Group.1, Group.2) %>%
arrange(monthyr) %>%
fill(N) %>%
mutate(N = N + cumsum(lag(net, default = 0)))
#> # A tibble: 3 × 7
#> # Groups: Group.1, Group.2 [1]
#> Group.1 Group.2 monthyr x y N net
#> <chr> <chr> <yearmon> <dbl> <dbl> <dbl> <dbl>
#> 1 a A Jan 2018 5 10 100 -5
#> 2 a A Feb 2018 7 18 95 -11
#> 3 a A Mär 2018 8 9 84 -1