Interpolating large raster series in R

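I have 36 years of gridded/raster monthly temperature estimates that I would like to convert to daily estimates. For now, I place each monthly estimate at the midpoint of its month and do a simple linear interpolation. To do this, I am trying to use raster::calc and stats::approx. However, when I do so, I get the following error: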

Error in is.infinite(v) : default method not implemented for type 'list'

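Below is some code that hopefully provides a mock-up that reproduces the problem. I think the issue lies in how the NAs are handled, because the q_interp bit at the end (where no raster cells are set to NA) works. That said, I am not sure what to do with that information.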

library(raster)

#The parameters of the problem
num_days = 9861
months_num = 324
num_na = 191780

#generate baseline rasters
r <- raster(nrows=360, ncols=720); 
values(r) <- NA
x <- sapply(1:months_num, function(...) setValues(r, runif(ncell(r))))

#make them a stack
s = stack(x)

#define what x coordinates the rasters refer to (e.g. loosely convert monthly to daily). Probably not the most elegant solution in the world.
num_day_month = c(31,28,31,30,31,30,31,31,30,31,30,31)
days = as.character(seq(as.Date('1989/01/01'), as.Date('2015/12/31'), by = 'day'))
months = as.character(seq(as.Date('1989/01/01'), as.Date('2015/12/01'), by = 'month'))
months = substr(months, 1,nchar(months)-3)
mid_points = as.vector(lapply(months, function(x) grep(x,days,value =T)[round(length(grep(x,days,value =T))/2)]))
mp_loc = days %in% mid_points
#output is the monthly mid points on the daily scale
mp_day_locs = (1:length(days))[mp_loc]

#make some of the cells NA throughout the whole span. In the actual dataset, the NAs generally represent oceans.
s[sample(ncell(s), num_na)] = NA

#a function to interpolate
interp_row <- function(base_indexes, value_vector, return_indexes, rule_num =2) {
  nnn = length(value_vector)
  if (any(is.na(value_vector))) {
    return(rep(NA, nnn))
  } else {
    return(approx(x = base_indexes, y= value_vector, xout = return_indexes, rule=rule_num)$y)
  }
}

#this is the function call that causes the error to be thrown
s_interp = calc(s, function(y) interp_row(base_indexes = mp_day_locs, value_vector = y, return_indexes = 1:length(days),rule_num = 2))

#Now make a stack without NAs -- this seems to work
#generate baseline rasters
r <- raster(nrows=360, ncols=720); 
values(r) <- NA
x <- sapply(1:months_num, function(...) setValues(r, runif(ncell(r))))
#make them a stack
q = stack(x)
q_interp = calc(q, function(y) interp_row(base_indexes = mp_day_locs, value_vector = y, return_indexes = 1:length(days),rule_num = 2))

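The problem (as far as I can tell) is the length of the vector that is returned when any of the values is NA. In your case, it has the same length as the input vector: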

# length = 324
nnn = length(value_vector)
return(rep(NA, nnn))

However, the vector returned in the regular case (no NAs) is much longer, because it holds the daily values:

#length = 9861
return(approx(x = base_indexes, y= value_vector, xout = return_indexes, rule=rule_num)$y)
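
To see why a length mismatch could surface as this particular error, here is a minimal base R sketch. It assumes that calc ends up applying the function cell by cell in much the same way as apply() does; that is my reading of the behaviour, not something I checked in the raster source. The cells matrix and the buggy function are made-up stand-ins for the stack and for the original interp_row.

```r
# Toy stand-in for a 3-layer stack: each row is one cell, each column one layer.
cells <- rbind(c(0.1, 0.5, 0.9),
               c(NA,  0.2, 0.4))

# Same shape of bug as the original interp_row:
# 3 values come back for a cell containing NA, 5 values otherwise.
buggy <- function(v) {
  if (any(is.na(v))) return(rep(NA, length(v)))
  approx(x = c(1, 3, 5), y = v, xout = 1:5, rule = 2)$y
}

# Results of unequal length cannot be simplified to a matrix,
# so apply() hands back a list instead.
res <- apply(cells, 1, buggy)
is.list(res)
lengths(res)

# Calling is.infinite() on that list fails; capture the message to inspect it.
msg <- tryCatch(is.infinite(res), error = function(e) conditionMessage(e))
msg
```

If the assumption holds, the message captured in msg should match the one you got from calc.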

As far as I can tell, both return cases need to have the same length. Try changing your function to return rep(NA, length(return_indexes)) in the NA case, as follows:
interp_row <- function(base_indexes, value_vector, return_indexes, rule_num = 2) {
  if (any(is.na(value_vector))) {
    # one NA per requested output index, so the length matches the approx() result
    return(rep(NA, length(return_indexes)))
  } else {
    return(approx(x = base_indexes, y = value_vector, xout = return_indexes, rule = rule_num)$y)
  }
}
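
A quick way to convince yourself that the two branches now agree is to call the function on two tiny vectors without involving any rasters. This is only a sketch with invented numbers (three "monthly" values at positions 1, 3 and 5, interpolated to positions 1 to 5); base_idx, out_idx and cells are illustrative names, not part of your code.

```r
# Fixed version of the function, repeated here so the snippet runs on its own.
interp_row <- function(base_indexes, value_vector, return_indexes, rule_num = 2) {
  if (any(is.na(value_vector))) {
    return(rep(NA_real_, length(return_indexes)))
  }
  approx(x = base_indexes, y = value_vector, xout = return_indexes, rule = rule_num)$y
}

base_idx <- c(1, 3, 5)  # positions of the known values on the output axis
out_idx  <- 1:5         # positions to interpolate to

with_na    <- interp_row(base_idx, c(NA, 0.2, 0.4), out_idx)
without_na <- interp_row(base_idx, c(0.1, 0.5, 0.9), out_idx)

# Both branches now return one value per output index.
length(with_na) == length(without_na)

# Applied row by row, equal-length results simplify to a matrix
# (one column per cell) rather than a list.
cells <- rbind(c(0.1, 0.5, 0.9),
               c(NA,  0.2, 0.4))
res <- apply(cells, 1, function(v) interp_row(base_idx, v, out_idx))
dim(res)
```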

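Note: your code seems to be slow. One quick fix is to use the clusterR() function to speed it up. This function allows multi-core processing for some raster functions, such as calc. Type ?clusterR or look here.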
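
For what it is worth, a clusterR() call for this kind of job might look roughly like the sketch below. It runs on a small invented stack (3 layers interpolated to 9 output positions) rather than your data, and the names s_small, base_idx, out_idx and f are made up for illustration. I am assuming two worker processes are available. I pass forceapply = TRUE so that calc does not try to guess whether the function is vectorized, and I use export because the workers do not see objects from your global environment. Treat it as a starting point rather than a tuned solution.

```r
library(raster)

# Fixed interpolation helper, repeated so the snippet is self-contained.
interp_row <- function(base_indexes, value_vector, return_indexes, rule_num = 2) {
  if (any(is.na(value_vector))) {
    return(rep(NA_real_, length(return_indexes)))
  }
  approx(x = base_indexes, y = value_vector, xout = return_indexes, rule = rule_num)$y
}

# Toy data (hypothetical): 3 layers on a coarse grid, with the same cells set to NA in each.
r <- raster(nrows = 36, ncols = 72)
make_layer <- function(i) {
  v <- runif(ncell(r))
  v[101:200] <- NA
  setValues(r, v)
}
s_small <- stack(lapply(1:3, make_layer))

base_idx <- c(2, 5, 8)  # where the 3 layers sit on the output axis
out_idx  <- 1:9         # positions to interpolate to
f <- function(y) interp_row(base_idx, y, out_idx)

beginCluster(2)  # start 2 workers; adjust to your machine
s_par <- clusterR(s_small, calc,
                  args = list(fun = f, forceapply = TRUE),
                  export = c("interp_row", "base_idx", "out_idx"))
endCluster()

nlayers(s_par)  # one layer per output position
```

Whether this actually helps depends on the size of the rasters and the overhead of shipping data to the workers, so it is worth timing it against the plain calc call on a subset first.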