Out of memory on R using linux but not on Windows

I am trying to handle large datasets and files (e.g. GeoTIFF) in plain R (via the terminal) and in RStudio (1.4.1106), but both applications crash every time on Linux (Manjaro, x64, Core i7, 8 GB RAM) when running certain scripts — especially when plotting raster data with ggplot2 to produce high-quality maps, and when fitting lmer models with random factors on a csv file of about 3000 rows and 6 columns. The problem seems related to memory management, since all available memory gets consumed. To work around it, I tried two packages that can limit/increase the memory available to R, "unix" and "RAppArmor". However, if I limit the memory size, all available RAM is exhausted and I get the famous message "cannot allocate vector ...". On the other hand, if I raise the limit to a high value, R/RStudio crashes. On Windows, the following code works very well to increase the memory size (needed just to plot the raster with ggplot2):

    if (.Platform$OS.type == "windows") withAutoprint({
      memory.size()
      memory.size(TRUE)
      memory.limit()
    })
    memory.limit(size = 56000)

However, memory.limit() is not available on Linux. As mentioned above, I used the following two functions to manage RAM on Manjaro:

    library(ulimit)
    memory_limit(10000)

    library(RAppArmor)
    rlimit_as(1e10)
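For context, the soft limit those R functions adjust is the process's virtual address-space limit, which can also be inspected from the shell alongside the machine's RAM and swap state. A minimal sketch using standard Linux utilities (exact output format varies by distribution):

```shell
# Physical RAM and swap currently available/used
free -h

# Soft limit on virtual address space for the current shell, in KiB
# ("unlimited" means no cap -- this is the limit that
# RAppArmor::rlimit_as() / ulimit::memory_limit() adjust for R)
ulimit -v

# Active swap devices/files and their sizes
cat /proc/swaps
```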

Please find below a reproducible example similar to mine, including the raster properties. The first six (commented) lines are the ones used to increase memory on Windows:

    #if(.Platform$OS.type == "windows") withAutoprint({
    #  memory.size()
    #  memory.size(TRUE)
    #  memory.limit()
    #})
    #memory.limit(size=56000)

    library(rgdal)
    library(raster)
    library(tidyverse)
    library(sf)
    library(rnaturalearth)
    library(rnaturalearthdata)
    library(viridis)
    library(ggspatial)

    test <- raster(nrows = 8280, ncols = 5760, xmn = -82, xmx = -34, ymn = -57, ymx = 12)
    vals <-  1:ncell(test)
    test <- setValues(test, vals)
    test
    names(test)

    testpts <-  rasterToPoints(test, spatial = TRUE)
    testdf  <- data.frame(testpts)
    rm(testpts, test)
    str(testdf)

    polygons_brazil <- ne_countries(country = "brazil", scale = "medium", returnclass = "sf")
    plot(polygons_brazil)
    polygons_southamerica <- ne_countries(country = c("argentina", "bolivia", "chile", "colombia", "ecuador", "guyana", "paraguay", "peru", "suriname", "uruguay", "venezuela"), scale = "medium", returnclass = "sf")
    plot(polygons_southamerica)
    polygons_ocean <- ne_download(type = "ocean", category = "physical", returnclass = "sf")
    plot(polygons_ocean)

    # R crashes after this point (ggplot2 runs for some time before crashing)

    map <- ggplot() +
      geom_raster(data = testdf, aes(x = x, y = y, fill = layer), show.legend = TRUE) +
      geom_sf(data = polygons_ocean, color = "transparent", lwd = 0.35, fill = "white", show.legend = FALSE) +
      geom_sf(data = polygons_brazil, color = "darkgray", lwd = 0.35, fill = "transparent", show.legend = FALSE) +
      geom_sf(data = polygons_southamerica, color = "darkgray", lwd = 0.35, fill = "gray88", show.legend = FALSE) +
      scale_fill_viridis(breaks = c(1, 11923200, 23846400, 35769600, 47692800), limits = c(1, 47692800)) +
      guides(fill = guide_colorbar(keyheight = 6, ticks = FALSE, title = bquote(delta^18 * O))) +
      ylab("Latitude") +
      xlab("Longitude") +
      coord_sf(xlim = c(-76, -28), ylim = c(-36, 8), expand = FALSE) +
      theme(axis.text.y = element_text(size = 10, color = "black"),
            axis.text.x = element_text(size = 10, color = "black"),
            axis.title.y = element_text(size = 10, color = "black"),
            axis.title.x = element_text(size = 10, color = "black"),
            legend.title = element_text(size = 10),
            legend.text = element_text(size = 9.5),
            legend.box = "vertical",
            panel.background = element_rect(fill = "white"),
            panel.grid.major = element_line(color = "gray96", size = 0.50),
            panel.grid.minor = element_line(color = "gray96", size = 0.30),
            axis.line = element_line(color = "black", size = 0.5),
            panel.border = element_rect(color = "black", fill = NA, size = 0.5)) +
      annotation_scale(location = "br") +
      annotation_north_arrow(location = "br", which_north = "true",
                             pad_x = unit(0, "cm"), pad_y = unit(0.8, "cm"),
                             style = north_arrow_fancy_orienteering)
    map
    ggsave("test.png", width = 9, height = 6, units = "in", dpi = 300)

Could anyone help me solve this problem?

With the help of a member of another forum (https://community.rstudio.com/t/out-of-memory-on-r-using-linux-but-not-on-windows/106549), I found the solution. As suspected, the crashes were the result of the swap partition's size limit. I increased the swap space from 2 GB to 16 GB, and now R/RStudio is able to run the whole script. It is a very demanding task, though, since all of my physical memory is exhausted and almost 15 GB of swap gets eaten.
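For anyone hitting the same wall: swap can be enlarged with a swap file, without repartitioning. A minimal sketch (requires root; the `/swapfile` path and 16 GB size are only examples — adapt them, and add an `/etc/fstab` entry if you want the change to survive a reboot):

```shell
# Create and enable a 16 GB swap file (path and size are examples).
fallocate -l 16G /swapfile   # reserve the space on disk
chmod 600 /swapfile          # swap files must not be world-readable
mkswap /swapfile             # format the file as swap
swapon /swapfile             # enable it immediately
swapon --show                # verify the new swap is active
```

Note that heavy swapping is far slower than RAM, so the script will still take much longer than it would on a machine with more physical memory.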