Compress output raster and parallelize gdalwarp from R

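I would like to include the -co option so that gdalwarp from the gdalUtilities package in R compresses the output raster.

I have tried several variants (left as comments in the code), but none of them produced a compressed raster.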

gdalUtilities::gdalwarp(srcfile = paste0(source_path,"/mask_30.tif"),
                        dstfile = paste0(writing_path,"/mask_30_gdalwarp.tif"),
                        cutline = paste0(source_path,"/amazon.shp"),
                        crop_to_cutline = TRUE,
                        multi = TRUE,
                        wo = "NUM_THREADS = 32",
                        co = "COMPRESS = DEFLATE")
                        # co = c("COMPRESS = DEFLATE","ZLEVEL = 9"))
                        # co COMPRESS = DEFLATE,
                        # co ZLEVEL = 9),
                        # co = "COMPRESS = DEFLATE",
                        # co = ZLEVEL = 9")

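In addition, I would like to use the multithreaded warping implementation. I included the options -multi -wo "NUM_THREADS = 16" (my computer has 32 cores), but the run time did not improve compared with the plain -multi option, which uses two cores by default.

Any suggestions on compression and parallelization?

Many thanks.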

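1 - Compression

Here is a solution to the file compression problem. To be honest, I ran into exactly the same issue and racked my brains over it... before finally finding a very simple solution (once you know it!): the option string must not contain any spaces (i.e. "COMPRESS=DEFLATE" rather than "COMPRESS = DEFLATE").

So, please find a small reprex below.

Reprex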

library(gdalUtilities)
library(stars)        # Loaded just to have a '.tif' image for the reprex


# Import a '.tif' image from the 'stars' library
tif <- read_stars(system.file("tif/L7_ETMs.tif", package = "stars"))

# Write the image to disk (in your working directory)
write_stars(tif, "image.tif")

# Size of the image on disk (in bytes)
file.size("image.tif")
#> [1] 2950880

# Compress the image
gdalUtilities::gdalwarp(srcfile = "image.tif",
                        dstfile = "image_gdalwarp.tif",
                        co = "COMPRESS=DEFLATE")

# Size of the compressed image on disk (in bytes)
file.size("image_gdalwarp.tif")
#> [1] 937920                # The image has been successfully compressed.
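
Since the failure comes purely from whitespace around the "=" sign, one way to guard against it is to clean the option strings before passing them on. The following is a minimal sketch, not part of gdalUtilities: normalize_gdal_opts is a hypothetical helper name introduced here for illustration only.

```r
# Hypothetical helper (NOT part of gdalUtilities): strip whitespace around
# "=" so that "COMPRESS = DEFLATE" becomes "COMPRESS=DEFLATE".
normalize_gdal_opts <- function(opts) {
  opts <- trimws(opts)                           # drop leading/trailing blanks
  gsub("[[:space:]]*=[[:space:]]*", "=", opts)   # drop blanks around "="
}

co_opts <- normalize_gdal_opts(c("COMPRESS = DEFLATE", "ZLEVEL = 9"))
print(co_opts)

# The cleaned options could then be passed on as a list, e.g.:
# gdalUtilities::gdalwarp(srcfile = "image.tif",
#                         dstfile = "image_gdalwarp.tif",
#                         co = as.list(co_opts))
```

The helper only touches the whitespace around "=", so option values themselves are left unchanged.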

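As @MarkAdler said, there is not much difference between the default compression level (i.e. 6) and level 9. That said, please find below how to write the code so that the desired compression level is applied (i.e. still without spaces, and with the options in a list):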

gdalUtilities::gdalwarp(srcfile = "image.tif",
                        dstfile = "image_gdalwarp_Z9.tif",
                        co = list("COMPRESS=DEFLATE", "ZLEVEL=9"))

file.size("image_gdalwarp_Z9.tif")
#> [1] 901542

Created on 2022-02-09 by the reprex package (v2.0.1)
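
To put the "not much difference" remark in numbers, the three file sizes reported in the reprex outputs above can be compared directly. This is just arithmetic on those reported values; the variable names are chosen here for illustration.

```r
# File sizes in bytes, copied from the reprex outputs above
sizes <- c(original        = 2950880,
           deflate_default = 937920,
           deflate_z9      = 901542)

# Size of each file relative to the uncompressed original
ratio <- sizes / sizes[["original"]]
print(round(ratio, 3))

# Extra saving of ZLEVEL=9 relative to the default DEFLATE level
extra_saving <- 1 - sizes[["deflate_z9"]] / sizes[["deflate_default"]]
print(round(extra_saving, 3))
```

On this image, DEFLATE shrinks the file to roughly a third of its original size, while raising the level to 9 saves only about 4% more. Other rasters may behave differently.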

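2 - Parallelization

As for parallelizing across processor cores, you should not use multi = TRUE. The argument wo = "NUM_THREADS=4" on its own (again, without spaces ;-)) is enough.

To clarify, I suspect you are confusing RAM with the number of cores. Computers usually ship with 4- or 8-core processors. The 32 you specified in your code probably refers to the 32 GB of RAM your computer may have.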
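
If in doubt about how many cores the machine really has, this can be checked from R before choosing the value. The sketch below uses detectCores() from the base parallel package. The fallback to 1 and the variable names are assumptions made here for illustration, and the detected count may differ between platforms.

```r
library(parallel)

# Number of physical cores; detectCores() may return NA on some platforms
n_cores <- detectCores(logical = FALSE)
if (is.na(n_cores)) n_cores <- 1L   # assumed fallback, adjust as needed

# Build the warp option string, without spaces around "="
wo_threads <- paste0("NUM_THREADS=", n_cores)
print(wo_threads)

# It could then be passed to gdalwarp, e.g.:
# gdalUtilities::gdalwarp(srcfile = "image.tif",
#                         dstfile = "image_gdalwarp_parallel.tif",
#                         wo = wo_threads)
```

GDAL also documents the value NUM_THREADS=ALL_CPUS for this warp option, which may be simpler if all cores should be used.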

Reprex

library(gdalUtilities)
library(stars)

tif <- read_stars(system.file("tif/L7_ETMs.tif", package = "stars"))

write_stars(tif, "image.tif")


file.size("image.tif")
#> [1] 2950880

gdalUtilities::gdalwarp(srcfile = "image.tif",
                        dstfile = "image_gdalwarp_Z9_parallel.tif",
                        co = list("COMPRESS=DEFLATE", "ZLEVEL=9"),
                        wo = "NUM_THREADS=4") # Replace '4' by '8' if your processor has 8 cores

file.size("image_gdalwarp_Z9_parallel.tif")
#> [1] 901542

Created on 2022-02-09 by the reprex package (v2.0.1)