Compress output raster and parallelize gdalwarp from R
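I would like to include the -co option to compress the output raster when using gdalwarp from the gdalUtilities package in R.
I have tried several variants (left as comments in the code below), but none of them produced a compressed raster.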
gdalUtilities::gdalwarp(srcfile = paste0(source_path, "/mask_30.tif"),
                        dstfile = paste0(writing_path, "/mask_30_gdalwarp.tif"),
                        cutline = paste0(source_path, "/amazon.shp"),
                        crop_to_cutline = TRUE,
                        multi = TRUE,
                        wo = "NUM_THREADS = 32",
                        co = "COMPRESS = DEFLATE")
                        # co = c("COMPRESS = DEFLATE","ZLEVEL = 9"))
                        # co COMPRESS = DEFLATE,
                        # co ZLEVEL = 9),
                        # co = "COMPRESS = DEFLATE",
                        # co = ZLEVEL = 9")
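In addition, I would like to use the multithreaded warping implementation. I included the -multi and -wo "NUM_THREADS = 16" options (my computer has 32 cores), but the run time did not improve compared with the -multi option alone, which uses two cores by default.
Any suggestions on compression and parallelization?
Thanks a lot.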
1 - Compression
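Here is the solution to the file compression problem. To be honest, I ran into the same problem as you and racked my brains over it... until I found a very simple solution (once you know it!): the option string must not contain any spaces (i.e. "COMPRESS=DEFLATE" rather than "COMPRESS = DEFLATE").
So, please find a small reprex below.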
Reprex
library(gdalUtilities)
library(stars) # Loaded just to have a '.tif' image for the reprex
# Import a '.tif' image from the 'stars' library
tif <- read_stars(system.file("tif/L7_ETMs.tif", package = "stars"))
# Write the image to disk (in your working directory)
write_stars(tif, "image.tif")
# Size of the image on disk (in bytes)
file.size("image.tif")
#> [1] 2950880
# Compress the image
gdalUtilities::gdalwarp(srcfile = "image.tif",
                        dstfile = "image_gdalwarp.tif",
                        co = "COMPRESS=DEFLATE")
# Size of the compressed image on disk (in bytes)
file.size("image_gdalwarp.tif")
#> [1] 937920 # The image has been successfully compressed.
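As @MarkAdler said, there is not much difference between the default compression level (i.e. 6) and level 9. That said, here is how to write the code to apply the desired compression level (i.e. still without spaces, and in a list):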
gdalUtilities::gdalwarp(srcfile = "image.tif",
                        dstfile = "image_gdalwarp_Z9.tif",
                        co = list("COMPRESS=DEFLATE", "ZLEVEL=9"))
file.size("image_gdalwarp_Z9.tif")
#> [1] 901542
Created on 2022-02-09 by the reprex package (v2.0.1)
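Since the only problem in the question's code was the whitespace around "=", one way to guard against it is to clean the option strings before passing them on. The snippet below is a minimal sketch: normalize_gdal_opts is a hypothetical helper name, not a function of gdalUtilities, and it assumes each option has the form KEY=VALUE.

```r
# Hypothetical helper (NOT part of gdalUtilities): strip the whitespace
# around the first "=" of each "KEY = VALUE" option string.
normalize_gdal_opts <- function(opts) {
  # Accept either a list or a character vector of options
  opts <- trimws(unlist(opts, use.names = FALSE))
  # Rewrite "KEY  =  VALUE" as "KEY=VALUE"; clean strings pass through unchanged
  sub("^([^=[:space:]]+)\\s*=\\s*", "\\1=", opts)
}

co <- normalize_gdal_opts(c("COMPRESS = DEFLATE", "ZLEVEL = 9"))
print(co)
#> [1] "COMPRESS=DEFLATE" "ZLEVEL=9"
```

The cleaned vector could then be supplied as the co argument (and, in the same way, a cleaned string as wo) in the gdalwarp calls shown in the reprex.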
2 - Parallelization
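For parallelization across processor cores, you should not use multi = TRUE. The argument wo = "NUM_THREADS=4" on its own (still without spaces ;-)) is enough.
To clarify, I guess you confused RAM with the number of cores. Computers usually come with a 4- or 8-core processor. The 32 you specified in your code probably refers to the 32 GB of RAM your computer may have.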
Reprex
library(gdalUtilities)
library(stars)
tif <- read_stars(system.file("tif/L7_ETMs.tif", package = "stars"))
write_stars(tif, "image.tif")
file.size("image.tif")
#> [1] 2950880
gdalUtilities::gdalwarp(srcfile = "image.tif",
                        dstfile = "image_gdalwarp_Z9_parallel.tif",
                        co = list("COMPRESS=DEFLATE", "ZLEVEL=9"),
                        wo = "NUM_THREADS=4") # Replace '4' by '8' if your processor has 8 cores
file.size("image_gdalwarp_Z9_parallel.tif")
#> [1] 901542
Created on 2022-02-09 by the reprex package (v2.0.1)
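Rather than guessing how many cores the machine has, the count can be queried from R and used to build the wo string. The following is only a sketch: warp_threads_opt is a hypothetical helper name, and it relies on parallel::detectCores() from base R's parallel package. As far as I know, GDAL also documents ALL_CPUS as an accepted value for the NUM_THREADS warp option, which would avoid the lookup altogether.

```r
library(parallel)

# Hypothetical helper (NOT part of gdalUtilities): build a "NUM_THREADS=n"
# warp option, capped at the number of logical cores that R reports.
warp_threads_opt <- function(requested = NA) {
  available <- parallel::detectCores(logical = TRUE)
  if (is.na(available)) available <- 1L  # detectCores() may return NA
  n <- if (is.na(requested)) available else min(as.integer(requested), available)
  # No spaces around "=", as required for the option to be recognized
  sprintf("NUM_THREADS=%d", max(n, 1L))
}

print(parallel::detectCores(logical = TRUE))  # cores visible to R on this machine
print(warp_threads_opt(4))                    # at most 4 threads
```

The returned string could be passed as wo = warp_threads_opt(4) in the gdalwarp call from the reprex. Whether more threads actually shorten the run time depends on the raster and on how much of the work is I/O, so it is worth timing a few values with system.time().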