na.approx only when fewer than 3 consecutive NAs in a column

Question

mydata <-data.frame(group = c(1,1,1,1,1,2,2,2,2,2), score = c(10, NA, NA, 20, 30, 5, NA, NA, NA, 40))

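From 'mydata', I am trying to use dplyr with na.approx to interpolate 'score' only when there are fewer than 3 consecutive NAs between the nearest non-NA entries. The interpolated values are stored in 'score_approx'.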

Without the condition on the number of consecutive NAs in 'score', I use this code:

library(dplyr)
library(zoo)

mydata %>%
  group_by(group) %>%
  # Linearly interpolate every NA run within each group, whatever its length
  mutate(score_approx = na.approx(score)) %>%
  mutate(score_approx = coalesce(score_approx, score))

mydata
# A tibble: 10 x 3
# Groups:   group [2]
   group score score_approx
   <dbl> <dbl>        <dbl>
 1     1    10         10  
 2     1    NA         13.3
 3     1    NA         16.7
 4     1    20         20  
 5     1    30         30  
 6     2     5          5  
 7     2    NA         13.8
 8     2    NA         22.5
 9     2    NA         31.2
10     2    40         40

However, the desired data frame is:

# A tibble: 10 x 3
# Groups:   group [2]
   group score score_approx
   <dbl> <dbl>        <dbl>
 1     1    10         10  
 2     1    NA         13.3
 3     1    NA         16.7
 4     1    20         20  
 5     1    30         30  
 6     2     5          5  
 7     2    NA         NA
 8     2    NA         NA
 9     2    NA         NA
10     2    40         40

Answer 1

You can use the maxgap argument of na.approx:

library(dplyr)
library(zoo)

mydata %>%
  group_by(group) %>%
  # maxgap = 2: runs of more than 2 consecutive NAs are left as NA
  mutate(score_approx = na.approx(score, maxgap = 2)) %>%
  ungroup()

#   group score score_approx
#   <dbl> <dbl>        <dbl>
# 1     1    10         10  
# 2     1    NA         13.3
# 3     1    NA         16.7
# 4     1    20         20  
# 5     1    30         30  
# 6     2     5          5  
# 7     2    NA         NA  
# 8     2    NA         NA  
# 9     2    NA         NA  
#10     2    40         40
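
To see what maxgap does outside of dplyr, here is a minimal sketch on a plain vector. The vector x is made up for illustration and is not part of the question's data. It also passes na.rm = FALSE, which is an addition to the answer above: by default na.approx trims leading and trailing NAs, so the result can be shorter than the input, and mutate() would then fail on a group that starts or ends with NA.

```r
library(zoo)

# Hypothetical input: a 2-NA gap, a 3-NA gap, and a trailing NA
x <- c(10, NA, NA, 20, NA, NA, NA, 60, NA)

# maxgap = 2 fills only runs of at most 2 consecutive NAs;
# na.rm = FALSE keeps the result the same length as the input
filled <- na.approx(x, maxgap = 2, na.rm = FALSE)

print(filled)
```

Here the 2-NA gap between 10 and 20 is interpolated, while the 3-NA gap and the trailing NA stay NA, so filled has the same length as x.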

r

zoo

dplyr