How to use fill_by_function() with na.approx() [linear interpolation] inside dplyr
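I am looking at the padr documentation: https://cran.r-project.org/web/packages/padr/vignettes/padr.html
I slightly changed the vignette example so that I could apply linear interpolation (zoo::na.approx()) to the data, but that produces an error. First, the padded data: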
library(tidyverse)
library(padr)
library(zoo)
set.seed(123)
emergency %>%
filter(title == 'EMS: DEHYDRATION') %>%
thicken(interval = 'day') %>%
group_by(time_stamp_day) %>%
summarise(nr = n() + as.integer(runif(1, 1, 999)) ) %>%
pad()
Result:
# A tibble: 307 × 2
time_stamp_day nr
<date> <int>
1 2015-12-12 79
2 2015-12-13 42
3 2015-12-14 NA
4 2015-12-15 NA
5 2015-12-16 NA
6 2015-12-17 NA
7 2015-12-18 88
8 2015-12-19 NA
9 2015-12-20 NA
10 2015-12-21 NA
# ... with 297 more rows
Now I want to linearly interpolate from 42 to 88. I thought the best way to do this would be to use zoo::na.approx() inside padr::fill_by_function():
emergency %>%
filter(title == 'EMS: DEHYDRATION') %>%
thicken(interval = 'day') %>%
group_by(time_stamp_day) %>%
summarise(nr = n() + as.integer(runif(1, 1, 99)) ) %>%
pad() %>%
fill_by_function(nr, na.approx)
But I get the following error:
Error in inds[i] <- which(colnames_x == as.character(cols[[i]])) :
replacement has length zero
Any ideas on how to start solving this?
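You just need mutate to apply na.approx. Passing na.rm = FALSE keeps the NAs that cannot be interpolated (here the trailing ones), so the result has the same length as the column: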
library(dplyr)   # provides mutate() and %>%
library(tibble)
library(zoo)
emergency <- as_tibble(read.table(text="time_stamp_day nr
1 2015-12-12 79
2 2015-12-13 42
3 2015-12-14 NA
4 2015-12-15 NA
5 2015-12-16 NA
6 2015-12-17 NA
7 2015-12-18 88
8 2015-12-19 NA
9 2015-12-20 NA
10 2015-12-21 NA",header=TRUE,stringsAsFactors=FALSE))
emergency %>% mutate(nr = na.approx(nr, na.rm = FALSE))
# A tibble: 10 × 2
time_stamp_day nr
<chr> <dbl>
1 2015-12-12 79.0
2 2015-12-13 42.0
3 2015-12-14 51.2
4 2015-12-15 60.4
5 2015-12-16 69.6
6 2015-12-17 78.8
7 2015-12-18 88.0
8 2015-12-19 NA
9 2015-12-20 NA
10 2015-12-21 NA
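As for why the original call fails: as far as I can tell from the padr documentation, the signature is fill_by_function(x, ..., fun = mean), so a bare na.approx in the second position is read as a column name rather than as the function, which would explain the error about colnames_x. Even with fun = na.approx, that helper is meant to fill gaps with a single summary of the non-missing values, so it is probably not the right tool for interpolation. The mutate approach also works on a real Date column. The following is only a sketch: the padded and nr_filled names and the small tibble are hypothetical stand-ins for the output of pad(), and rule = 2 is an optional extra (it is passed through to approx()) that carries the last observed value into the trailing NAs.

```r
library(dplyr)
library(zoo)

# Hypothetical stand-in for the padded daily counts from the question
padded <- tibble(
  time_stamp_day = seq(as.Date("2015-12-12"), as.Date("2015-12-21"), by = "day"),
  nr = c(79L, 42L, NA, NA, NA, NA, 88L, NA, NA, NA)
)

interpolated <- padded %>%
  mutate(
    # Linear interpolation between observed values; trailing NAs stay NA
    nr_interp = na.approx(nr, na.rm = FALSE),
    # Same, but rule = 2 extends the last observation to the end
    nr_filled = na.approx(nr, na.rm = FALSE, rule = 2)
  )

print(interpolated)
```

If the days were not evenly spaced, interpolating against the row position would distort the result; in that case na.approx also accepts an x argument for the index to interpolate along.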