R: Translating recursively divided areas into XY coordinates for a ggplot2 choropleth

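I've worked out a solution for this, but it's ugly, ad hoc, and doesn't generalize; I figure there must be a better way. Our study area is divided into a grid of 100 m x 100 m blocks, with columns named 3-8 and rows named C-J. Each block is divided into four quadrats, and each quadrat into four subquadrats, so a subquadrat name looks like "4F23". The division pattern within a block looks like this: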

11 12 21 22

13 14 23 24

31 32 41 42

33 34 43 44

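Different datasets are appropriately recorded at any of the three resolutions, so I want something that can handle "4F", "4F1", and "4F34". To display values in a heatmap/choropleth of the area, I need a way to express the nested scheme in something more orthogonal that ggplot can work with, or better yet, to tell ggplot something so that it knows how to interpret the section labels itself. What I want is a simple way to draw maps for any of our studies. The best I could come up with was a translator function that generates XY coordinates, which I then bind to the data frame.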

library(dplyr)    # also re-exports tibble() and %>%
library(ggplot2)

toyDF <- tibble(SECT = c('3E1', '5G3', '8H4'), HT = c(22,6,15))
# Translator function
SACoords <- function(sqr) {
    sqVec <- substring(sqr, seq(1, nchar(sqr), 1), seq(1, nchar(sqr), 1))
    rws <- 'JIHGFEDC'
    cl <- (as.integer(sqVec[1]) - 3) * 100
    rw <- (as.integer(gregexpr(sqVec[2], rws)) - 1) * 100
    qd <- ifelse(!is.na(sqVec[3]), 
        list(c(0,50), c(50,50), c(0,0), c(50,0))[as.integer(sqVec[3])],
        c(0,0))
    sq <- ifelse(!is.na(sqVec[4]), 
        list(c(0,25), c(25,25), c(0,0), c(25,0))[as.integer(sqVec[4])],
        c(0,0))
    coords <- data.frame(c(cl, rw), qd, sq)
    rowSums(coords, na.rm=TRUE)
}
#> SACoords('8C24')  # Test
#[1] 575 750  # Yes

# Mash in the coordinates (as lists)
toyXY <- toyDF %>% 
    mutate(coords = sapply(SECT, SACoords, simplify=FALSE))
# Got the coords in, but as lists -- difficult to work with;
# but can't mutate() into two columns with one operation, so
# redo it this ungainly way:
toyXY[,4:5] <- matrix(unlist(sapply(toyDF$SECT, SACoords)), 
    ncol=2, byrow=TRUE)
names(toyXY)[4:5] <- c('Xcoor', 'Ycoor')

# And finally to plot (in reality many observations per SECT)
toyXY %>% group_by(SECT) %>%
    mutate(MHT = mean(HT)) %>%
    ggplot(aes(xmin=Xcoor, xmax=Xcoor + 50, ymin=Ycoor, ymax=Ycoor+50)) +
        geom_rect(aes(fill = MHT))

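With the full dataset this produces what I want, but it's awful. What I'd most like (I think) is to make my SACoords() a transformation that I can plug into the ggplot(aes()) call, so that it effectively understands the SECT labels. That would hopefully let me use geom_raster instead of geom_rect and avoid xmin/xmax and their clumsy constants, which have to be adjusted to the resolution of a given study. Next best might be a templated representation of the study map (a 2D matrix, 24x32? a 6x8 matrix of lists of lists?), but I don't know how to tell ggplot to read it. Or should I really just wrap all of this in a bigger function that handles everything?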

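It would be clearer if you had included an illustration of your study area, but here is my best guess at what you're looking for. If my understanding is correct, you can do all of the translation in a pipeline using the dplyr package, which makes it easier to explain what each step of the code does.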

Note that I've used different SECT values for illustration purposes. Explanations are in the comments:

library(dplyr)
library(ggplot2)

# modify toyDF to include sections of different sizes
toyDF <- tibble::tibble(SECT = c("3E", "5G3", "8C24"), 
                HT = c(22, 6, 15))

toyDF %>%
  mutate(sqr = stringr::str_pad(SECT, 4, side = "right", pad = " ")) %>%
  tidyr::separate(sqr, into = c("x", "y", "quadrat", "subquadrat"), sep = 1:3) %>%

  # convert the first two letters of SECT into x/y coordinates for the centre of the area
  mutate(x = factor(x, levels = as.character(3:8)),
         y = factor(y, levels = LETTERS[10:3])) %>%
  mutate_at(vars(x, y),
            function(i) as.integer(i) * 100 - 50) %>%

  # adjust coordinates for quadrat, if applicable
  mutate(x = case_when(quadrat %in% c("1", "3") ~ x - 25,
                       quadrat %in% c("2", "4") ~ x + 25,
                       TRUE ~ x),
         y = case_when(quadrat %in% c("1", "2") ~ y + 25,
                       quadrat %in% c("3", "4") ~ y - 25,
                       TRUE ~ y)) %>%

  # further adjust coordinates for subquadrat, if applicable
  mutate(x = case_when(subquadrat %in% c("1", "3") ~ x - 12.5,
                       subquadrat %in% c("2", "4") ~ x + 12.5,
                       TRUE ~ x),
         y = case_when(subquadrat %in% c("1", "2") ~ y + 12.5,
                       subquadrat %in% c("3", "4") ~ y - 12.5,
                       TRUE ~ y)) %>%

  # specify appropriate width for each cell, depending on whether
  # subquadrat / quadrat has been defined
  mutate(width = case_when(subquadrat != " " ~ 25,
                           quadrat != " " ~ 50,
                           TRUE ~ 100)) %>%

  ggplot(aes(x = x, y = y, fill = HT)) +
  geom_tile(aes(height = width, width = width)) +
  scale_x_continuous(breaks = seq(50, 550, by = 100),
                     labels = as.character(3:8),
                     expand = c(0, 0)) +
  scale_y_continuous(breaks = seq(50, 750, by = 100),
                     labels = LETTERS[10:3],
                     expand = c(0, 0)) +
  coord_equal(xlim = c(0, 600), ylim = c(0, 800)) +
  theme_bw() +
  theme(panel.grid.major = element_blank(),
        axis.ticks = element_blank())
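
If you would rather wrap the translation in a function, as the question suggests at the end, the same arithmetic can be sketched in base R. This is only a minimal sketch under the same assumptions as the pipeline above (columns 3-8, rows J at the bottom to C at the top, 100 m blocks, cells numbered 1 = top-left, 2 = top-right, 3 = bottom-left, 4 = bottom-right). The name sect_to_xy and its arguments are hypothetical, not part of any package.

```r
# Hypothetical helper (an illustration, not an existing API): translate SECT
# labels such as "4F", "4F1" or "4F34" into tile centres and tile widths.
sect_to_xy <- function(sect,
                       cols = as.character(3:8),
                       rows = LETTERS[10:3]) {
  # Pad every label to four characters so each position always exists
  padded <- sprintf("%-4s", sect)
  ch <- function(i) substr(padded, i, i)

  # Block centre: the first column and first row sit at 50 m,
  # and each later block adds 100 m
  x <- match(ch(1), cols) * 100 - 50
  y <- match(ch(2), rows) * 100 - 50

  # Offset of a numbered cell from the centre of its parent;
  # a blank (undefined) level contributes no offset
  shift <- function(digit, step) {
    d  <- match(digit, c("1", "2", "3", "4"))
    dx <- c(-1, 1, -1, 1)[d] * step
    dy <- c(1, 1, -1, -1)[d] * step
    dx[is.na(dx)] <- 0
    dy[is.na(dy)] <- 0
    list(dx = dx, dy = dy)
  }
  q <- shift(ch(3), 25)    # quadrat level
  s <- shift(ch(4), 12.5)  # subquadrat level

  # Tile width depends on the finest level present in the label
  width <- ifelse(ch(4) != " ", 25, ifelse(ch(3) != " ", 50, 100))

  data.frame(SECT = sect,
             x = x + q$dx + s$dx,
             y = y + q$dy + s$dy,
             width = width,
             stringsAsFactors = FALSE)
}

sect_to_xy(c("3E", "5G3", "8C24"))
```

For "8C24" this gives a centre of (587.5, 762.5) with a width of 25, i.e. a lower-left corner of (575, 750), which agrees with the SACoords('8C24') test in the question. The result can be merged back onto the data by SECT, for example with merge(toyDF, sect_to_xy(unique(toyDF$SECT)), by = "SECT"), and then plotted with geom_tile(aes(x = x, y = y, width = width, height = width)) as above, so no resolution-specific constants are needed in the plotting call.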