只读取 R 中的裁剪或栅格范围

Read only a crop or extent of a raster in R

我正在处理每日 tif 文件(数千个每日文件)中的大量数据。我正在分析 shapefile 区域中栅格的平均值,重复可能有数千个形状层。

我当前的 .tif 文件适用于整个国家/地区,而实际上我只需要每个 shapefile 图层的每个 tif 文件的一小部分区域。每个形状层都有一组不同的天数可以从中提取,即像这样的数据框:

df <- data.frame(shplayer=c("shp_layerbuffer1","shp_layerbuffer2", "shp_layerbuffer3"), start=c("2000_02_18", "2004_03_19", "2010_08_20"), end=c("2010_01_09", "2005_03_15", "2012_09_04"))

在 R 中,有没有办法在读取文件之前裁剪 .tif(或任何光栅类型文件)?这样我就可以只读取每个 tif 文件的裁剪区域

我设想了类似的东西(在整个日期集上重复):

library(sf)
library(raster)
shp_layerbuffer1 <- st_read("myshpfolder", layer="shp_layerbuffer1", quiet=T)

###EXAMPLE BUT doesn't work to crop the raster as it comes in
tempraster <- raster::raster("mytif_2000_02_18.tif", ext=extent(shp_layerbuffer1))

然后通常的 velox(或光栅)提取,重复。

在这种情况下,我会用 gdalUtils 包创建一个 virtual raster file (VRT)。这是创建一个或多个栅格数据集的子集(甚至虚拟马赛克)的最快方法。对于此 VRT,您可以使用 sf::st_bbox 设置 所需范围

最小(不可重现)示例为:

library(gdalUitls)
library(raster)
library(sf)

shp_layerbuffer1 <- st_read("myshpfolder", layer="shp_layerbuffer1", quiet=T)

gdalbuildvrt(gdalfile = "yourraster.tif", 
             output.vrt = "tmp.vrt", 
             te = st_bbox(shp_layerbuffer1))

tempraster <- raster("tmp.vrt")

VRT 基本上为新的 "raster" 文件创建一个虚拟 canvas。然后它从(一个或多个)基础光栅文件中引用相应的行和列。因此,您不需要创建一个全新的栅格,只需引用现有文件即可。 VRT 的文件大小不应超过几千字节。

如果您对数据而不是栅格对象(或vrt/tiff)感兴趣,那么这可能是一种方法:

如果您 "load" 来自文件的栅格,并不一定意味着它已加载到内存中,如 raster 的文档所述:

In many cases, e.g. when a RasterLayer is created from a file, it does (initially) not contain any cell (pixel) values in (RAM) memory, it only has the parameters that describe the RasterLayer. You can access cell-values with getValues, extract and related functions. You can assign new values with setValues and with replacement.

因此您可以先 "index" 栅格,然后使用 getValuesBlock 读取您感兴趣的区域中的值。

library(raster)

f <- system.file('external/test.grd',package = 'raster')

# index
r <- raster(f)

# check if in memory

inMemory(r)
#[1] FALSE # output

# this would be an extent from your overlapping shapefile
e <- extent(r,58,68,40,50)

# get cells from extent; either use cells as index directly or convert to rowcol

rowcol <- rowColFromCell(r,cellsFromExtent(r,e))

v <- getValuesBlock(r,row=rowcol[1,1],nrows=(rowcol[nrow(rowcol),1] - rowcol[1,1]),
                    col=rowcol[1,2],ncols=(rowcol[nrow(rowcol),2] - rowcol[1,2]))

v
# [1] 295.392 225.710 258.595 310.336 324.666 322.702 307.217 283.611 263.987 156.609 322.978 297.565 301.797 315.971
# [15] 323.605 326.920 317.894 297.138 270.027 225.769 337.503 323.388 314.900 308.877 314.556 343.555 344.035 315.400
# [29] 291.188 270.876 337.866 324.632 307.220 278.294 264.379 348.519 356.456 322.450 301.790 285.815 331.383 318.950
# [43] 299.246 262.026 230.869 294.012 320.274 312.777 300.513 285.317 321.075 311.447 296.952 278.519 270.279 283.797
# [57] 296.415 298.861 293.150 277.573 306.692 300.772 289.376 273.141 258.457 258.638 272.966 283.977 284.621 271.690
# [71] 286.749 286.617 275.618 247.307 198.884 193.865 240.465 268.687 277.303 271.431 260.336 271.458 263.977 231.071
# [85] 161.407 154.181 222.417 258.672 271.711 272.642 235.458 258.810 257.763 239.553 209.409 205.905 234.162 255.668
# [99] 266.260 270.532

编辑:

# check if raster was loaded after getValuesBlock
inMemory(r)
#[1] FALSE