Generating a spatial heat map via ggmap in R based on a value

I would like to generate a contour map using the following data points:

Here is the dataset: https://www.dropbox.com/s/0s05cl34bko7ggm/sample_data.csv?dl=0

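I would like the map to show the areas where prices are higher and the areas where prices are lower. It would most likely look something like this (example image):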

Here is my code:

library(ggmap)

map <- get_map(location = "austin", zoom = 9)
data <- read.csv(file.choose(), stringsAsFactors = FALSE)
data$average_rate_per_night <- as.numeric(gsub("[$,]", "",
                                               data$average_rate_per_night))
ggmap(map, extent = "device") +
  stat_contour(data = data, geom = "polygon",
               aes(x = longitude, y = latitude, z = average_rate_per_night,
                   fill = ..level..)) +
  scale_fill_continuous(name = "Price", low = "yellow", high = "red")

I get the following error message:

2: Computation failed in `stat_contour()`:
Contour requires single `z` at each combination of `x` and `y`. 

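I would appreciate any help on how to fix this, or any other method for generating this kind of heat map. Please note that I am interested in the price as the weight, not in the density of the records.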

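You can use the stat_summary_2d() or stat_summary_hex() function to achieve a similar result. These functions divide the data into bins (defined by x and y) and then summarise the z values in each bin with a given function. In the example below I chose the mean as the aggregation function, so the map essentially shows the average price in each bin.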

Note: I needed to treat your average_rate_per_night variable appropriately to convert it to a number (removing the $ sign and the commas).
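As a quick sanity check of that conversion, here is a minimal sketch on a few made-up price strings. The values in `rates` are hypothetical and only mimic the "$1,234.00" style of the column. It uses a character class in gsub() rather than substr(), which is an alternative way to get the same numbers.

```r
# Hypothetical price strings in the same "$1,234.00" style as the dataset
rates <- c("$85.00", "$1,250.00", "$300")

# Drop every "$" and "," and convert what remains to numeric
rates_num <- as.numeric(gsub("[$,]", "", rates))
print(rates_num)
```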

library(ggmap)
library(data.table)

map <- get_map(location = "austin", zoom = 12)
data <- setDT(read.csv(file.choose(), stringsAsFactors = FALSE))
data[, average_rate_per_night := as.numeric(gsub(",", "",
    substr(average_rate_per_night, 2, nchar(average_rate_per_night))))]

ggmap(map, extent = "device") +
    stat_summary_2d(data = data, aes(x = longitude, y = latitude, 
        z = average_rate_per_night), fun = mean, alpha = 0.6, bins = 30) +
    scale_fill_gradient(name = "Price", low = "green", high = "red") 
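To make the binning idea concrete without downloading a map, here is a rough base R sketch of the kind of summary stat_summary_2d() produces: points are cut into rectangular bins and the mean of z is taken per bin. The data frame `pts`, its `price` column, the coordinate ranges and the 5 x 5 grid are all made up for illustration, and the exact bin edges ggplot2 chooses will differ.

```r
set.seed(1)

# Made-up points: 200 listings scattered over a small lon/lat box
pts <- data.frame(
  longitude = runif(200, -97.9, -97.6),
  latitude  = runif(200, 30.1, 30.4),
  price     = runif(200, 50, 500)
)

# Cut each axis into 5 equal-width bins (integer bin codes 1..5)
pts$xbin <- cut(pts$longitude, breaks = 5, labels = FALSE)
pts$ybin <- cut(pts$latitude,  breaks = 5, labels = FALSE)

# Mean price per occupied (xbin, ybin) cell; a tile's fill encodes this value
bin_means <- aggregate(price ~ xbin + ybin, data = pts, FUN = mean)
head(bin_means)
```

Swapping FUN = mean for median or max here corresponds to changing the fun argument in the plot call above.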

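If you insist on the contour approach, you need to provide a value for every possible combination of x, y coordinates in your data. To achieve this I would highly recommend gridding the space and generating some summary statistic for each bin.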

I have attached a working example below, based on the data you provided:

library(ggmap)
library(data.table)

map <- get_map(location = "austin", zoom = 12)
data <- setDT(read.csv(file.choose(), stringsAsFactors = FALSE))

# convert the rate from string into numbers
data[, average_rate_per_night := as.numeric(gsub(",", "", 
       substr(average_rate_per_night, 2, nchar(average_rate_per_night))))]

# generate bins for the x, y coordinates
xbreaks <- seq(floor(min(data$latitude)), ceiling(max(data$latitude)), by = 0.01)
ybreaks <- seq(floor(min(data$longitude)), ceiling(max(data$longitude)), by = 0.01)

# allocate the data points into the bins
data$latbin <- xbreaks[cut(data$latitude, breaks = xbreaks, labels = FALSE)]
data$longbin <- ybreaks[cut(data$longitude, breaks = ybreaks, labels = FALSE)]

# Summarise the data for each bin
datamat <- data[, list(average_rate_per_night = mean(average_rate_per_night)), 
                 by = c("latbin", "longbin")]

# Merge the summarised data with all possible x, y coordinate combinations to get 
# a value for every bin
datamat <- merge(setDT(expand.grid(latbin = xbreaks, longbin = ybreaks)), datamat, 
                 by = c("latbin", "longbin"), all.x = TRUE, all.y = FALSE)

# Fill the empty bins with 0 to smooth the contour plot
datamat[is.na(average_rate_per_night), average_rate_per_night := 0]

# Plot the contours
ggmap(map, extent = "device") +
  stat_contour(data = datamat, aes(x = longbin, y = latbin, z = average_rate_per_night, 
               fill = ..level.., alpha = ..level..), geom = 'polygon', binwidth = 100) +
  scale_fill_gradient(name = "Price", low = "green", high = "red") +
  guides(alpha = "none")

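You can then play around with the bin size and the contour binwidth to get the desired result, and you could additionally apply a smoothing function on the grid to get an even smoother contour plot.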
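As one possible smoothing function, here is a minimal sketch of a 3 x 3 moving average over a matrix of bin values. The helper `smooth_grid` and the toy 5 x 5 matrix are hypothetical and not part of any package. With the real data you would first need to reshape the binned values into a latitude by longitude matrix, which is not shown here.

```r
# Hypothetical 5 x 5 grid of bin means with a single expensive cell
grid <- matrix(0, nrow = 5, ncol = 5)
grid[3, 3] <- 900

# Replace each cell by the mean of its 3 x 3 neighbourhood
# (the neighbourhood is clipped at the edges of the matrix)
smooth_grid <- function(m) {
  out <- m
  for (i in seq_len(nrow(m))) {
    for (j in seq_len(ncol(m))) {
      rows <- max(1, i - 1):min(nrow(m), i + 1)
      cols <- max(1, j - 1):min(ncol(m), j + 1)
      out[i, j] <- mean(m[rows, cols])
    }
  }
  out
}

smoothed <- smooth_grid(grid)
print(smoothed)
```

The single spike is spread over its neighbours, which is what softens the contour edges. A wider window or a Gaussian kernel would smooth more aggressively, at the cost of blurring real price differences between neighbouring bins.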