Combine Categorical and Gradient Fill in Geospatial - R

Question

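I am trying to fill a map using a combination of a categorical and a continuous variable. For example, in my minimal reproducible example below, suppose I want to show the number of Krispy Kreme donut shops in each county, which is a continuous variable that I would normally fill on a gradient. But I also have counties that prohibit Krispy Kremes, coded as "-1", and counties where one is under construction, coded as "-2". I want to show these in distinct colors that are not mapped onto the gradient. My real data also contain NAs.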

What I have so far:

library(sf)
library(ggplot2)

nc <- st_read(system.file("shape/nc.shp", package="sf"))
nc$Status<-rep(c(-2,-1,runif(8)), 10)

ggplot(nc) + 
  geom_sf(aes(fill=Status),color = "black") + 
  coord_sf(datum = NA) + 
  theme_minimal()

Obviously, it breaks if I add the following line. I know the syntax is wrong, but it shows as best I can what I am trying to get the code to do:

  scale_fill_manual(breaks= c("-2","-1", >=0),values = c("blue", "yellow", scale_fill_viridis()))

Any help is greatly appreciated. I have been at this all day.

Answer 1

You need to cut the continuous variable into discrete categories.

library(sf)
library(ggplot2)
library(dplyr)

# Set seed for reproducibility
set.seed(122)

nc <- st_read(system.file("shape/nc.shp", package="sf"))
nc$Status<-rep(c(-2,-1,runif(8)), 10)

First, check the distribution of the variable.

nc %>%
  filter(Status >= 0) %>%
  pull("Status") %>%
  summary()
#     Min.  1st Qu.   Median     Mean  3rd Qu.     Max. 
# 0.002789 0.153144 0.602395 0.491287 0.735787 0.906851

I decided to cut the variable at its quartiles, as follows.

nc2 <- nc %>%
  mutate(Status2 = case_when(
    Status == -2 ~ "-2",
    Status == -1 ~ "-1",
    Status >= 0 & Status < 0.15 ~ "0 - 0.15",
    Status >= 0.15 & Status < 0.6 ~ "0.15 - 0.6",
    Status >= 0.6 & Status < 0.75 ~ "0.6 - 0.75",
    Status >= 0.75                ~ "0.75 - 0.91"
  ))
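The thresholds above were typed in by hand from the summary output. As a minimal sketch of the same idea, the breaks could instead be computed with base R's quantile() and cut(), and the result stored as a factor so the legend order is explicit. The names status, idx, brks, binned and status_binned are illustrative only and do not appear in the original answer; the sketch works on a plain vector so it runs without sf.

```r
set.seed(122)
status <- rep(c(-2, -1, runif(8)), 10)

# Positions of the real (non-negative) values; which() also skips any NA.
idx <- which(status >= 0)

# Quartile breaks computed from the data instead of typed in by hand.
brks <- quantile(status[idx], probs = seq(0, 1, by = 0.25))
binned <- cut(status[idx], breaks = brks, include.lowest = TRUE)

# Combine the two special codes with the quartile bins in one factor.
status_binned <- rep(NA_character_, length(status))
status_binned[which(status == -2)] <- "-2"
status_binned[which(status == -1)] <- "-1"
status_binned[idx] <- as.character(binned)
status_binned <- factor(status_binned,
                        levels = c("-2", "-1", levels(binned)))

table(status_binned)
```

Because the levels are set explicitly, a vector of six colors passed to scale_fill_manual would line up with the two special codes first and the four bins after them.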

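Now Status2 is a categorical variable. We can plot it and use scale_fill_manual to supply the colors. Note that we need to provide color codes in the values argument. viridis::viridis(4) generates four colors from the viridis palette.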

ggplot(nc2) + 
  geom_sf(aes(fill=Status2),color = "black") + 
  coord_sf(datum = NA) + 
  theme_minimal() +
  scale_fill_manual(values = c("blue", "yellow", viridis::viridis(4)))
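If a true continuous gradient is wanted rather than bins, one possible alternative (not the approach taken in this answer) is to draw the counties in separate layers: one layer maps the non-negative values to a continuous viridis scale, and the two special codes are drawn with fixed fill colors set outside aes(). The sketch below assumes this layering is acceptable for the use case; the object names nc_counts, nc_banned and nc_building are made up for illustration.

```r
library(sf)
library(ggplot2)

set.seed(122)
nc <- st_read(system.file("shape/nc.shp", package = "sf"), quiet = TRUE)
nc$Status <- rep(c(-2, -1, runif(8)), 10)

# Split the rows by meaning: real counts vs. the two special codes.
# which() drops NA, so missing values do not land in any subset.
nc_counts   <- nc[which(nc$Status >= 0), ]
nc_banned   <- nc[which(nc$Status == -1), ]
nc_building <- nc[which(nc$Status == -2), ]

p <- ggplot() +
  # Continuous values: fill is mapped inside aes(), so it gets the gradient.
  geom_sf(data = nc_counts, aes(fill = Status), color = "black") +
  # Special codes: fill is a constant outside aes(), so it bypasses the scale.
  geom_sf(data = nc_banned, fill = "yellow", color = "black") +
  geom_sf(data = nc_building, fill = "blue", color = "black") +
  scale_fill_viridis_c() +
  coord_sf(datum = NA) +
  theme_minimal()

p
```

One trade-off of this sketch is that the fixed-color layers do not appear in the legend. Getting a second legend for the special codes would need extra work, for example with an add-on such as the ggnewscale package.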

Answer 2

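Thank you very much. The way Status2 is constructed above makes it a character variable, whereas I wanted to plot a factor variable. The code below generates a factor variable (Status3) instead and plots it on the map. This works.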

library(sf) 
library(ggplot2) 
library(dplyr)

nc <- st_read(system.file("shape/nc.shp", package="sf")) 
nc$Status<-rep(c(-2,-1,runif(8)), 10)

nc3 <- nc %>%
  mutate(Status3 = factor(ifelse(Status>0,1,0)))

ggplot(nc3) + 
  geom_sf(aes(fill=Status3),color = "black") + 
  coord_sf(datum = NA) + 
  theme_minimal()

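However, when I try to apply the same principle (constructing a factor variable from a continuous one and plotting it on a map) to my own code, I get an error.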

Error in if (type == "point") { : argument is of length zero

My code is below. It works when plotting the continuous variable, but not when plotting the factor variable. Does anyone know why?

# plotting continuous variable: WORKS FINE
ggplot(CS_mun_shp)+
  geom_sf(aes(geometry=geometry,
              fill=ppc_sih),
          color=NA) 

# constructing factor variable
CS_mun_shp2 <- CS_mun_shp %>%
  mutate(cs_above40=factor(ifelse(ppc_sih>=0.4,1,0), 
                           levels=c(0:1), 
                           labels=c('below 40%','above 40%')))

# plotting factor variable: GENERATES ERROR  
ggplot(CS_mun_shp2)+
  geom_sf(aes(geometry=geometry,
              fill=cs_above40),
          color=NA)

The only difference between my code and the reproducible example above is that I need to specify geometry in aes(); otherwise I get a different error.
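The thread does not establish the cause of this error. One thing that may be worth checking, offered as an assumption rather than a diagnosis, is whether the data is still an sf object: geom_sf only finds the geometry column on its own when the data has class "sf", so having to pass geometry in aes() hints that the class may have been lost somewhere (for example after a join). The sketch below reproduces that situation on the nc data and restores the class with st_as_sf(); the object names plain and restored are illustrative, and CS_mun_shp itself is not available here.

```r
library(sf)

nc <- st_read(system.file("shape/nc.shp", package = "sf"), quiet = TRUE)

# Simulate a table that still has a geometry column but is no longer
# an sf object (an assumption about what happened to CS_mun_shp).
plain <- as.data.frame(nc)
is_sf_before <- inherits(plain, "sf")

# st_as_sf() picks up the existing geometry column and restores the class.
restored <- st_as_sf(plain)
is_sf_after <- inherits(restored, "sf")

c(before = is_sf_before, after = is_sf_after)
```

If inherits(CS_mun_shp2, "sf") turns out to be FALSE, converting with st_as_sf() before plotting may be enough. Setting the legend key type explicitly in geom_sf, for example show.legend = "polygon", is another thing that could be tried.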

Tags: r, ggplot2, geospatial, sf