Generating a spatial heat map via ggmap in R based on a value
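I want to generate a contour map from the following data points:
- Longitude
- Latitude
- Price
Here is the dataset: https://www.dropbox.com/s/0s05cl34bko7ggm/sample_data.csv?dl=0
I want the map to show the areas where prices are higher and the areas where prices are lower. It should look something like this (sample image):
Here is my code: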
library(ggmap)
map <- get_map(location = "austin", zoom = 9)
data <- read.csv(file.choose(), stringsAsFactors = FALSE)
# strip the dollar sign and thousands separators before converting to numeric
data$average_rate_per_night <- as.numeric(gsub("[$,]", "",
    data$average_rate_per_night))
ggmap(map, extent = "device") +
    stat_contour(data = data, geom = "polygon",
        aes(x = longitude, y = latitude, z = average_rate_per_night,
            fill = ..level..)) +
    scale_fill_continuous(name = "Price", low = "yellow", high = "red")
I get the following error message:
2: Computation failed in `stat_contour()`:
Contour requires single `z` at each combination of `x` and `y`.
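I would greatly appreciate any help on how to fix this, or any other way to generate this kind of heat map. Please note that I am interested in the weight of the price, not the density of records.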
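You can use the stat_summary_2d() or stat_summary_hex() functions to get a similar result. These functions divide the data into bins (defined by x and y) and then summarise the z values in each bin with a given function. In the example below I chose the mean as the aggregation function, so the map essentially shows the average price in each bin.
Note: I had to process your average_rate_per_night variable to convert it to a number (removing the $ sign and the commas).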
library(ggmap)
library(data.table)
map <- get_map(location = "austin", zoom = 12)
data <- setDT(read.csv(file.choose(), stringsAsFactors = FALSE))
# drop the leading "$" and the thousands separators, then convert to numeric
data[, average_rate_per_night := as.numeric(gsub(",", "",
    substr(average_rate_per_night, 2, nchar(average_rate_per_night))))]
ggmap(map, extent = "device") +
    stat_summary_2d(data = data, aes(x = longitude, y = latitude,
        z = average_rate_per_night), fun = mean, alpha = 0.6, bins = 30) +
    scale_fill_gradient(name = "Price", low = "green", high = "red")
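To make the binning step concrete, here is a minimal base R sketch of the kind of per-bin mean that stat_summary_2d() draws. It is an illustration under assumptions, not ggplot2's internals: the data are synthetic (made-up coordinates and prices, not the linked dataset), the names listings, lon_breaks, lat_breaks and bin_means are invented for the example, and the exact bin edges ggplot2 picks may differ.

```r
# Synthetic stand-in for the listing data (all values are made up)
set.seed(1)
n <- 500
price <- round(runif(n, 50, 2500))
listings <- data.frame(
  longitude = runif(n, -97.9, -97.6),
  latitude  = runif(n, 30.1, 30.5),
  # rates stored as strings such as "$1,234", as in the question
  average_rate_per_night = paste0("$", format(price, big.mark = ",", trim = TRUE)),
  stringsAsFactors = FALSE
)

# Strip "$" and "," so the rate can be converted to numeric
listings$rate <- as.numeric(gsub("[$,]", "", listings$average_rate_per_night))

# Cut each coordinate into 30 equal-width bins
n_bins <- 30
lon_breaks <- seq(min(listings$longitude), max(listings$longitude), length.out = n_bins + 1)
lat_breaks <- seq(min(listings$latitude), max(listings$latitude), length.out = n_bins + 1)
lon_bin <- factor(cut(listings$longitude, lon_breaks, include.lowest = TRUE, labels = FALSE),
                  levels = seq_len(n_bins))
lat_bin <- factor(cut(listings$latitude, lat_breaks, include.lowest = TRUE, labels = FALSE),
                  levels = seq_len(n_bins))

# Mean rate per (latitude bin, longitude bin); bins with no listings are NA
bin_means <- tapply(listings$rate, list(lat_bin, lon_bin), mean)
dim(bin_means)
```

Each cell of bin_means is the average price of the listings that fall in that bin, which is what the fill colour in the plot above represents; it does not depend on how many listings the bin contains.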
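If you insist on the contour approach, you need to provide a value for every possible combination of x, y coordinates in your data. To do that, I strongly recommend gridding the space and computing a summary statistic for each bin.
I have attached a working example below, based on the data you provided: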
library(ggmap)
library(data.table)
map <- get_map(location = "austin", zoom = 12)
data <- setDT(read.csv(file.choose(), stringsAsFactors = FALSE))
# convert the rate from string into numbers
data[, average_rate_per_night := as.numeric(gsub(",", "",
    substr(average_rate_per_night, 2, nchar(average_rate_per_night))))]
# generate bins for the coordinates
# (xbreaks holds the latitude breaks, ybreaks the longitude breaks)
xbreaks <- seq(floor(min(data$latitude)), ceiling(max(data$latitude)), by = 0.01)
ybreaks <- seq(floor(min(data$longitude)), ceiling(max(data$longitude)), by = 0.01)
# allocate the data points into the bins
data$latbin <- xbreaks[cut(data$latitude, breaks = xbreaks, labels = FALSE)]
data$longbin <- ybreaks[cut(data$longitude, breaks = ybreaks, labels = FALSE)]
# Summarise the data for each bin
datamat <- data[, list(average_rate_per_night = mean(average_rate_per_night)),
    by = c("latbin", "longbin")]
# Merge the summarised data with all possible x, y coordinate combinations to get
# a value for every bin
datamat <- merge(setDT(expand.grid(latbin = xbreaks, longbin = ybreaks)), datamat,
    by = c("latbin", "longbin"), all.x = TRUE, all.y = FALSE)
# Fill the empty bins with 0 so the contour has a value at every grid point
datamat[is.na(average_rate_per_night), average_rate_per_night := 0]
# Plot the contours
ggmap(map, extent = "device") +
    stat_contour(data = datamat, aes(x = longbin, y = latbin, z = average_rate_per_night,
        fill = ..level.., alpha = ..level..), geom = "polygon", binwidth = 100) +
    scale_fill_gradient(name = "Price", low = "green", high = "red") +
    guides(alpha = FALSE)
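You can then adjust the bin size and the contour binwidth to get the desired result; you could also apply a smoothing function over the grid to get a smoother contour plot.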
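As one possible reading of "a smoothing function over the grid", the sketch below applies a plain moving average (box filter) to a matrix of bin values in base R. The answer does not name a specific smoother, so this choice is an assumption; smooth_grid and its radius argument are hypothetical names, and the 5 x 5 matrix stands in for the binned prices, which would first have to be reshaped from long format into a latitude-by-longitude matrix.

```r
# Box-filter smoothing: each cell becomes the mean of its neighbourhood.
# Windows are clipped at the edges of the grid.
smooth_grid <- function(z, radius = 1) {
  nr <- nrow(z)
  nc <- ncol(z)
  out <- matrix(0, nr, nc)
  for (i in seq_len(nr)) {
    for (j in seq_len(nc)) {
      rows <- max(1, i - radius):min(nr, i + radius)
      cols <- max(1, j - radius):min(nc, j + radius)
      out[i, j] <- mean(z[rows, cols])
    }
  }
  out
}

# Toy grid: one expensive bin surrounded by empty (zero-filled) bins
z <- matrix(0, nrow = 5, ncol = 5)
z[3, 3] <- 900
smoothed <- smooth_grid(z, radius = 1)
smoothed
```

The single spike is spread over its 3 x 3 neighbourhood, which softens the sharp jumps that zero-filled empty bins otherwise produce in the contours. A larger radius smooths more but also blurs real local price differences.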