Interpolating large raster series in R
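I have 36 years of gridded/raster monthly temperature estimates that I would like to convert to daily estimates. For now, I place each monthly estimate at the midpoint of its month and do a simple linear interpolation. To do this, I tried to use raster::calc together with stats::approx. However, when I do so, I get the following error: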
Error in is.infinite(v) : default method not implemented for type 'list'
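Below is some code that hopefully provides a simulation that reproduces the problem. I think the problem lies in how NAs are handled, because the q_interp bit at the end (where no raster cells are set to NA) works. That said, I'm not quite sure what to do with this information.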
library(raster)
#The parameters of the problem
num_days = 9861
months_num = 324
num_na = 191780
#generate baseline rasters
r <- raster(nrows=360, ncols=720);
values(r) <- NA
x <- sapply(1:months_num, function(...) setValues(r, runif(ncell(r))))
#make them a stack
s = stack(x)
#define what x coordinates the rasters refer to (e.g. loosely convert monthly to daily). Probably not the most elegant solution in the world.
num_day_month = c(31,28,31,30,31,30,31,31,30,31,30,31)
days = as.character(seq(as.Date('1989/01/01'), as.Date('2015/12/31'), by = 'day'))
months = as.character(seq(as.Date('1989/01/01'), as.Date('2015/12/01'), by = 'month'))
months = substr(months, 1,nchar(months)-3)
mid_points = as.vector(lapply(months, function(x) grep(x,days,value =T)[round(length(grep(x,days,value =T))/2)]))
mp_loc = days %in% mid_points
#output is the monthly mid points on the daily scale
mp_day_locs = (1:length(days))[mp_loc]
#make some of the cells NA throughout the whole span. In the actual dataset, the NAs generally represent oceans.
s[sample(ncell(s), num_na)] = NA
#a function to interpolate
interp_row <- function(base_indexes, value_vector, return_indexes, rule_num = 2) {
  nnn = length(value_vector)
  if (any(is.na(value_vector))) {
    return(rep(NA, nnn))
  } else {
    return(approx(x = base_indexes, y = value_vector, xout = return_indexes, rule = rule_num)$y)
  }
}
#this is the function call that causes the error to be thrown
s_interp = calc(s, function(y) interp_row(base_indexes = mp_day_locs, value_vector = y, return_indexes = 1:length(days),rule_num = 2))
#Now make a stack without NAs -- this seems to work
#generate baseline rasters
r <- raster(nrows=360, ncols=720);
values(r) <- NA
x <- sapply(1:months_num, function(...) setValues(r, runif(ncell(r))))
#make them a stack
q = stack(x)
q_interp = calc(q, function(y) interp_row(base_indexes = mp_day_locs, value_vector = y, return_indexes = 1:length(days),rule_num = 2))
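The problem (as far as I can tell) is the length of the vector that is returned when any value is NA. In your case, it has the same length as the input vector: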
# length = 324
nnn = length(value_vector)
return(rep(NA, nnn))
However, the vector returned in the normal case (no NAs) is much longer (daily values):
#length = 9861
return(approx(x = base_indexes, y= value_vector, xout = return_indexes, rule=rule_num)$y)
As far as I can tell, both return cases need to have the same length. Try changing your function so that the NA branch returns rep(NA, length(return_indexes)), as follows:
interp_row <- function(base_indexes, value_vector, return_indexes, rule_num = 2) {
  if (any(is.na(value_vector))) {
    # return one NA per requested output index, so both branches have the same length
    return(rep(NA, length(return_indexes)))
  } else {
    return(approx(x = base_indexes, y = value_vector, xout = return_indexes, rule = rule_num)$y)
  }
}
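The length mismatch can be checked on a couple of toy cells without building any rasters. The sketch below is illustrative only: the index vectors, the cell values, and the names interp_row_old and interp_row_new are made up for this example. It uses base R's apply as a stand-in for what calc presumably does per cell, which is an assumption about calc's internals rather than something stated above.

```r
# Toy setup (hypothetical values): 3 "monthly" values placed at days 2, 5, 8 of a 9-day span
base_idx <- c(2, 5, 8)
out_idx  <- 1:9

# Original behaviour: the NA branch returns one value per INPUT element
interp_row_old <- function(base_indexes, value_vector, return_indexes, rule_num = 2) {
  if (any(is.na(value_vector))) return(rep(NA, length(value_vector)))
  approx(x = base_indexes, y = value_vector, xout = return_indexes, rule = rule_num)$y
}

# Fixed behaviour: the NA branch returns one value per OUTPUT index
interp_row_new <- function(base_indexes, value_vector, return_indexes, rule_num = 2) {
  if (any(is.na(value_vector))) return(rep(NA, length(return_indexes)))
  approx(x = base_indexes, y = value_vector, xout = return_indexes, rule = rule_num)$y
}

# Two cells: one with data, one that is NA throughout (e.g. ocean)
cells <- rbind(land = c(10, 20, 30), ocean = c(NA, NA, NA))

len_old_ok <- length(interp_row_old(base_idx, cells["land", ],  out_idx))
len_old_na <- length(interp_row_old(base_idx, cells["ocean", ], out_idx))

# apply() can only build a matrix when every result has the same length;
# otherwise it falls back to a list
res_old <- apply(cells, 1, function(y) interp_row_old(base_idx, y, out_idx))
res_new <- apply(cells, 1, function(y) interp_row_new(base_idx, y, out_idx))

# Calling is.infinite() on a list fails, which resembles the error in the question
msg <- tryCatch(is.infinite(res_old), error = function(e) conditionMessage(e))

print(c(old_ok = len_old_ok, old_na = len_old_na))
print(class(res_old))
print(dim(res_new))
print(msg)
```

With the old function the two cells yield vectors of different lengths, so the combined result is a list; with the fixed function every cell yields one value per output index and the results stack into a regular matrix.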
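Note: your code seems slow. One quick fix is to use the clusterR() function to speed it up. This function enables multi-core processing for some raster functions, such as calc. Type ?clusterR for details.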
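As a rough sketch of how clusterR() might be combined with calc for this kind of interpolation: the example below uses a deliberately small toy stack, and the names s_small, base_idx, out_idx, and interp_cell are hypothetical. It assumes the raster package is installed and that a local two-node cluster can be started; the number of nodes is arbitrary and should be adapted to your machine.

```r
library(raster)

# Small toy stack (hypothetical sizes) so the sketch runs quickly
r <- raster(nrows = 36, ncols = 72)
s_small <- stack(lapply(1:4, function(i) setValues(r, runif(ncell(r)))))

# Positions of the 4 layers on the output axis, and the output positions wanted
base_idx <- c(1, 4, 7, 10)
out_idx  <- 1:10

# Per-cell interpolation; both branches return length(out_idx) values
interp_cell <- function(y) {
  if (any(is.na(y))) return(rep(NA_real_, length(out_idx)))
  approx(x = base_idx, y = y, xout = out_idx, rule = 2)$y
}

beginCluster(2)  # start 2 worker nodes
# export= makes the global variables used inside interp_cell visible on the workers
s_par <- clusterR(s_small, calc, args = list(fun = interp_cell),
                  export = c("base_idx", "out_idx"))
endCluster()

print(nlayers(s_par))
```

Whether this is actually faster depends on the raster size and the number of cores; for small inputs the overhead of starting the cluster can outweigh the gain.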