Mapping multiple maps with density change over time in R

Question

I am new to mapping data with R and I need some help with a complex representation. I will try to be as clear as possible :)

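I have a dataset that counts daily observations in Sweden since 1950. Each row is one observation, with latitude, longitude, Julian day, date, and year. I have split Sweden into three latitude groups (1 = south, 2 = middle, 3 = north). I only care about the latitude information, so longitude can be set to the same value for every point if needed.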

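I would like to plot how the density of observations in these three groups changes over time. To do that, I want to show my dataset at several key years: 1950/1975/2000/2021, so I need to create multiple maps. In addition, for each of those years I would like one map of cumulative observation density for each of four half-month windows (the first 15 days and the last 15 days of February and of March), so the total number of maps is 4*4 = 16. Ideally, the change would be shown by a colour gradient (the darker the colour, the more observations), but I am open to other suggestions if that is not suitable.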

A random sample of my large dataset:

> dput(df[sample(nrow(df), 50),])
structure(list(lat = c("65", "64", "65", "59", "59", "57", "57", 
"68", "67", "63", "60", "61", "65", "59", "56", "65", "59", "57", 
"55", "59", "56", "56", "59", "60", "59", "55", "59", "59", "57", 
"55", "56", "57", "65", "59", "63", "59", "56", "59", "56", "56", 
"57", "63", "58", "59", "63", "61", "55", "58", "66", "57"), 
    long = c("21", "17", "21", "14", "14", "13", "12", "18", 
    "18", "20", "16", "14", "17", "16", "12", "16", "15", "14", 
    "12", "17", "12", "16", "18", "14", "14", "14", "18", "17", 
    "12", "13", "12", "12", "21", "13", "19", "16", "12", "18", 
    "16", "12", "12", "18", "12", "17", "20", "17", "12", "13", 
    "19", "12"), date = c("2009-03-29", "2006-04-06", "2019-03-31", 
    "2006-04-04", "1975-04-13", "2014-02-05", "1996-04-02", "2021-04-08", 
    "1995-04-12", "2004-04-12", "2018-04-07", "2021-03-28", "1988-04-01", 
    "2002-03-17", "2015-03-12", "2019-04-05", "2016-03-19", "2021-04-03", 
    "2014-02-08", "2015-03-13", "2021-03-09", "2005-02-07", "2013-03-31", 
    "1989-03-23", "1989-03-27", "2015-01-21", "2011-04-04", "2018-03-26", 
    "1987-03-23", "2011-01-31", "2014-02-09", "2004-01-17", "2012-04-20", 
    "2017-03-07", "2005-04-02", "2017-01-28", "2016-03-19", "1984-03-30", 
    "2005-01-29", "2021-03-06", "2008-02-03", "2017-03-22", "2019-03-10", 
    "2010-01-17", "2009-04-10", "2016-01-23", "2019-03-01", "2006-03-04", 
    "2014-04-23", "2009-03-15"), julian_day = c("88", "96", "90", 
    "94", "103", "36", "93", "98", "102", "103", "97", "87", 
    "92", "76", "71", "95", "79", "93", "39", "72", "68", "38", 
    "90", "82", "86", "21", "94", "85", "82", "31", "40", "17", 
    "111", "66", "92", "28", "79", "90", "29", "65", "34", "81", 
    "69", "17", "100", "23", "60", "63", "113", "74"), year = c(2009L, 
    2006L, 2019L, 2006L, 1975L, 2014L, 1996L, 2021L, 1995L, 2004L, 
    2018L, 2021L, 1988L, 2002L, 2015L, 2019L, 2016L, 2021L, 2014L, 
    2015L, 2021L, 2005L, 2013L, 1989L, 1989L, 2015L, 2011L, 2018L, 
    1987L, 2011L, 2014L, 2004L, 2012L, 2017L, 2005L, 2017L, 2016L, 
    1984L, 2005L, 2021L, 2008L, 2017L, 2019L, 2010L, 2009L, 2016L, 
    2019L, 2006L, 2014L, 2009L), lat_grouped = c("3", "2", "3", 
    "1", "1", "1", "1", "3", "3", "2", "2", "2", "3", "1", "1", 
    "3", "1", "1", "1", "1", "1", "1", "1", "2", "1", "1", "1", 
    "1", "1", "1", "1", "1", "3", "1", "2", "1", "1", "1", "1", 
    "1", "1", "2", "1", "1", "2", "2", "1", "1", "3", "1")), row.names = c(22330L, 
15394L, 44863L, 15258L, 1481L, 31695L, 6399L, 52043L, 6111L, 
11508L, 42184L, 51391L, 4308L, 8764L, 34675L, 45080L, 37042L, 
51743L, 31717L, 34723L, 50514L, 11892L, 30527L, 4572L, 4608L, 
33744L, 26476L, 41366L, 4006L, 25265L, 31741L, 10122L, 29059L, 
38340L, 12787L, 37827L, 37061L, 3029L, 11762L, 50464L, 18114L, 
39026L, 43835L, 23081L, 22811L, 36179L, 43641L, 13743L, 33608L, 
21917L), class = "data.frame")

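I have managed to create the base layer by following some guides I found online, but I don't know how to go further, and I am confused by all the different approaches I have tried without success.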

library(ggplot2)
library(gganimate)
library(gifski)
library(maps)
library(sf)
library(rgdal)

#map source: https://www.geoboundaries.org/data/1_3_3/zip/shapefile/

wd = "C:/Users/HP/Desktop/SWE_ADM0"
sweden <- readOGR(paste0(wd, "/SWE_ADM0.shp"), layer = "SWE_ADM0")
plot(sweden)

#To use the imported shapefile in ggmap, we need the fortify() function of the ggplot2 package.
sweden_fort <- ggplot2::fortify(sweden)

base_map <- ggplot(data = sweden_fort, mapping = aes(x=long, y=lat, group=group)) +
  geom_polygon(color = "black", fill = "white") +
  coord_quickmap() +
  theme_void()

base_map

I hope someone can help me. If anything is unclear or information is missing, I can edit my post :)

Thank you very much.

Answer 1

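If I were you, I would use sf objects: read the Sweden map with st_read() rather than using readOGR() and then fortify(). That lets you use geom_sf() instead of geom_polygon(). You should also simplify the Sweden shapefile you are using. The one you point to is very detailed, i.e. it contains a lot of lines. If you try to use it in an animation, rendering will take hours. You can simplify it a great deal without losing any detail that matters for the plot. Make df an sf object as well (one made of long/lat points rather than lines) and you are good to go.

So, using the df above and the map of Sweden you pointed to,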
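To illustrate what simplification buys you, here is a minimal sketch that counts vertices before and after st_simplify(). It does not use the Sweden shapefile: the `detailed` object is a hypothetical stand-in (a buffered point in EPSG:3006, chosen only so that the tolerance is in metres), and the 1 km tolerance is an assumption picked to match the `dTolerance = 1e3` used in this answer.

```r
library(sf)

# Hypothetical detailed outline: a 10 km circle in SWEREF99 TM (EPSG:3006, metres)
detailed <- st_sfc(st_point(c(500000, 6500000)), crs = 3006) |>
  st_buffer(dist = 10000)

# Simplify with a tolerance of 1 km
simple <- st_simplify(detailed, dTolerance = 1000)

# Each row of st_coordinates() is one vertex of the polygon
n_before <- nrow(st_coordinates(detailed))
n_after  <- nrow(st_coordinates(simple))
c(before = n_before, after = n_after)
```

Running the same vertex count on your real map before and after simplifying is a quick way to check whether the tolerance is doing enough.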

library(tidyverse)
library(sf)
library(here)

#map source: https://www.geoboundaries.org/data/1_3_3/zip/shapefile/

## Simplify the map for quicker rendering 
sweden <- st_read(here("data", "SWE_ADM0", "SWE_ADM0.shp"), 
                  layer = "SWE_ADM0") |> 
  st_simplify(dTolerance = 1e3)
#> Reading layer `SWE_ADM0' from data source `scratch/data/SWE_ADM0/SWE_ADM0.shp' using driver `ESRI Shapefile'
#> Simple feature collection with 1 feature and 8 fields
#> Geometry type: MULTIPOLYGON
#> Dimension:     XY
#> Bounding box:  xmin: 10.98139 ymin: 55.33695 xmax: 24.16663 ymax: 69.05997
#> Geodetic CRS:  WGS 84


df <- structure(list(lat = c("65", "64", "65", "59", "59", "57", "57", 
"68", "67", "63", "60", "61", "65", "59", "56", "65", "59", "57", 
"55", "59", "56", "56", "59", "60", "59", "55", "59", "59", "57", 
"55", "56", "57", "65", "59", "63", "59", "56", "59", "56", "56", 
"57", "63", "58", "59", "63", "61", "55", "58", "66", "57"), 
    long = c("21", "17", "21", "14", "14", "13", "12", "18", 
    "18", "20", "16", "14", "17", "16", "12", "16", "15", "14", 
    "12", "17", "12", "16", "18", "14", "14", "14", "18", "17", 
    "12", "13", "12", "12", "21", "13", "19", "16", "12", "18", 
    "16", "12", "12", "18", "12", "17", "20", "17", "12", "13", 
    "19", "12"), date = c("2009-03-29", "2006-04-06", "2019-03-31", 
    "2006-04-04", "1975-04-13", "2014-02-05", "1996-04-02", "2021-04-08", 
    "1995-04-12", "2004-04-12", "2018-04-07", "2021-03-28", "1988-04-01", 
    "2002-03-17", "2015-03-12", "2019-04-05", "2016-03-19", "2021-04-03", 
    "2014-02-08", "2015-03-13", "2021-03-09", "2005-02-07", "2013-03-31", 
    "1989-03-23", "1989-03-27", "2015-01-21", "2011-04-04", "2018-03-26", 
    "1987-03-23", "2011-01-31", "2014-02-09", "2004-01-17", "2012-04-20", 
    "2017-03-07", "2005-04-02", "2017-01-28", "2016-03-19", "1984-03-30", 
    "2005-01-29", "2021-03-06", "2008-02-03", "2017-03-22", "2019-03-10", 
    "2010-01-17", "2009-04-10", "2016-01-23", "2019-03-01", "2006-03-04", 
    "2014-04-23", "2009-03-15"), julian_day = c("88", "96", "90", 
    "94", "103", "36", "93", "98", "102", "103", "97", "87", 
    "92", "76", "71", "95", "79", "93", "39", "72", "68", "38", 
    "90", "82", "86", "21", "94", "85", "82", "31", "40", "17", 
    "111", "66", "92", "28", "79", "90", "29", "65", "34", "81", 
    "69", "17", "100", "23", "60", "63", "113", "74"), year = c(2009L, 
    2006L, 2019L, 2006L, 1975L, 2014L, 1996L, 2021L, 1995L, 2004L, 
    2018L, 2021L, 1988L, 2002L, 2015L, 2019L, 2016L, 2021L, 2014L, 
    2015L, 2021L, 2005L, 2013L, 1989L, 1989L, 2015L, 2011L, 2018L, 
    1987L, 2011L, 2014L, 2004L, 2012L, 2017L, 2005L, 2017L, 2016L, 
    1984L, 2005L, 2021L, 2008L, 2017L, 2019L, 2010L, 2009L, 2016L, 
    2019L, 2006L, 2014L, 2009L), lat_grouped = c("3", "2", "3", 
    "1", "1", "1", "1", "3", "3", "2", "2", "2", "3", "1", "1", 
    "3", "1", "1", "1", "1", "1", "1", "1", "2", "1", "1", "1", 
    "1", "1", "1", "1", "1", "3", "1", "2", "1", "1", "1", "1", 
    "1", "1", "2", "1", "1", "2", "2", "1", "1", "3", "1")), row.names = c(22330L, 
15394L, 44863L, 15258L, 1481L, 31695L, 6399L, 52043L, 6111L, 
11508L, 42184L, 51391L, 4308L, 8764L, 34675L, 45080L, 37042L, 
51743L, 31717L, 34723L, 50514L, 11892L, 30527L, 4572L, 4608L, 
33744L, 26476L, 41366L, 4006L, 25265L, 31741L, 10122L, 29059L, 
38340L, 12787L, 37827L, 37061L, 3029L, 11762L, 50464L, 18114L, 
39026L, 43835L, 23081L, 22811L, 36179L, 43641L, 13743L, 33608L, 
21917L), class = "data.frame")

## Convert the given sample data to an `sf` object of points, setting
## the coordinate system to be the same as the `sweden` map 
df <- df |> 
  mutate(id = 1:nrow(df), 
         date = lubridate::ymd(date), 
         year = factor(lubridate::year(date))) |> 
  st_as_sf(coords = c("long", "lat"), crs = 4326)

# Subset the data to the years you want, and create the plot
df_selected <- df |> 
  filter(year %in% c(1975, 1989, 2016, 2021))

ggplot() + 
  geom_sf(data = sweden) + 
  geom_sf(data = df_selected, 
          mapping = aes(color = lat_grouped)) + 
  facet_grid(lat_grouped ~ year) + 
  guides(color = "none")

You can add e.g. theme_void() or a map theme to get rid of the grid lines and so on.
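The question also asks for counts per half-month window, shown as a colour gradient. Below is a minimal sketch of one way to tabulate that, under stated assumptions: `toy` is made-up data, the `half_month` labels and the day-15 cut-off are my own choices, and the tile plot is only a stand-in for a real map (to put it on the map you would still need polygons for the three latitude bands).

```r
library(dplyr)
library(ggplot2)
library(lubridate)

# Hypothetical toy data standing in for the real observations
toy <- data.frame(
  date = as.Date(c("1975-02-03", "1975-02-20", "1975-03-10", "2021-02-10",
                   "2021-03-05", "2021-03-28", "2021-03-30")),
  lat_grouped = c("1", "1", "2", "3", "1", "1", "1")
)

# Assumption: a window is the first (days 1-15) or second half of Feb or Mar
counts <- toy |>
  filter(month(date) %in% 2:3) |>
  mutate(year = year(date),
         half_month = paste(month.abb[month(date)],
                            if_else(day(date) <= 15, "days 1-15", "days 16+"))) |>
  count(year, half_month, lat_grouped, name = "n_obs")

# One panel per window x year; darker tiles mean more observations
p <- ggplot(counts, aes(x = 1, y = lat_grouped, fill = n_obs)) +
  geom_tile() +
  facet_grid(half_month ~ year) +
  scale_fill_gradient(low = "grey90", high = "darkblue") +
  labs(x = NULL, y = "latitude group", fill = "observations") +
  theme_minimal()
p
```

With the real data you would filter to the key years first, which gives the 4 x 4 grid of panels described in the question.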

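Update: one last edit, just on the question of plotting density. Once you have computed your cumulative data, you can overlay your map with a 2D kernel density estimate. For example, here is a very rough first cut, faceted by latitude group.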

ggplot() + 
  geom_sf(data = sweden) + 
  geom_density_2d_filled(data = df, 
                  mapping = aes(x = map_dbl(geometry, ~.[1]),
                                 y = map_dbl(geometry, ~.[2])),
                  alpha = 0.4) + 
  facet_wrap(~ lat_grouped)

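The map_dbl() function here (from the purrr package) is a way to reach into the geometry column of df and pull out first the x values (i.e. longitude) and then the y values (i.e. latitude), so that geom_density_2d_filled() has the coordinates it needs to compute its estimate.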
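As an alternative to map_dbl(), sf has st_coordinates(), which returns the point coordinates as a matrix. A minimal sketch, assuming a small hypothetical `pts` object rather than the real df; the column names `x` and `y` are my own choice:

```r
library(sf)

# Three hypothetical points (longitude, latitude) in WGS 84
pts <- st_as_sf(
  data.frame(long = c(21, 17, 14), lat = c(65, 64, 59)),
  coords = c("long", "lat"), crs = 4326
)

# For POINT geometries this is a numeric matrix with columns "X" and "Y"
xy <- st_coordinates(pts)
pts$x <- xy[, "X"]
pts$y <- xy[, "Y"]
```

You could then write aes(x = x, y = y) in the density layer instead of calling map_dbl() inside aes().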
