有没有办法根据 R 中的两个街道名称对交叉路口进行地理编码?
Is there a way to geocode intersection based on two street names in R?
我有一个看起来像这样的数据集
ID|Cross Street 1|Cross Street 2
1 Beaver St Hanover St
2 Pearl St Wall St
我想得到 2263 个 x 和 y 值
ID|Cross Street 1|Cross Street 2 x |y
1 Beaver St Hanover St 981740.53187633 196247.34676349
2 Pearl St Wall St 982049.05259918 196320.89988222
注意下面的纬度和经度是为了确认位置。
lat |lon
40.705330 -74.009051
40.7055319680708 -74.00793826989502
这是一个使用 osmdata
软件包 (OpenStreetMap) 下载的澳大利亚堪培拉市一小部分数据的示例。
首先,加载库并制作我们感兴趣的区域的边界框:
library(sf)
library(tidyverse)
library(osmdata)
library(tmap)
tmap_mode('view')
# sample data with our focal area
tribble(
~point, ~lat, ~lon,
1, -35.29, 149.12,
2, -35.29, 149.14,
3, -35.27, 149.14,
4, -35.27, 149.12,
) %>%
st_as_sf(
coords = c('lon', 'lat'),
crs = 4326
) %>%
{. ->> my_points}
my_points
# Simple feature collection with 4 features and 1 field
# Geometry type: POINT
# Dimension: XY
# Bounding box: xmin: 149.12 ymin: -35.29 xmax: 149.14 ymax: -35.27
# Geodetic CRS: WGS 84
# # A tibble: 4 x 2
# point geometry
# * <dbl> <POINT [°]>
# 1 1 (149.12 -35.29)
# 2 2 (149.14 -35.29)
# 3 3 (149.14 -35.27)
# 4 4 (149.12 -35.27)
并绘制它以查看面积:
tm_shape(my_points)+
tm_dots()
然后,我们导入OSM数据。我不是这里的专家,我只是在关注 here.
中的示例
# import OSM data
st_bbox(my_points) %>%
opq %>%
add_osm_feature(key = 'highway') %>%
osmdata_sf %>%
`[[`('osm_lines') %>%
{. ->> my_streets}
my_streets
# Simple feature collection with 2143 features and 87 fields
# Geometry type: LINESTRING
# Dimension: XY
# Bounding box: xmin: 149.1153 ymin: -35.29224 xmax: 149.1447 ymax: -35.26598
# Geodetic CRS: WGS 84
# First 10 features:
# osm_id name access amenity area bicycle bridge bridge_number building bus bus.lanes busway
# 4018757 4018757 Coranderrk Street <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA>
# 4189987 4189987 London Circuit <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA>
# 4189988 4189988 London Circuit <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA>
# 4204003 4204003 Parkes Way <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA>
# 4204005 4204005 Parkes Way <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA>
# 4204006 4204006 Parkes Way <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA>
# 4368141 4368141 Parkes Way <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA>
# 4414782 4414782 Watson Street <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA>
# 4414873 4414873 Gould Street <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA>
# 4574783 4574783 Macleay Street <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA>
为了示例,我只保留 residential
条街道(highway
列),然后使用 group_by()
和 summarise()
连接所有段一条同名的街道,然后删除没有名称的街道。
# keep only residential streets
my_streets %>%
filter(
highway == 'residential'
) %>%
select(name) %>%
group_by(name) %>%
summarise %>%
filter(
!is.na(name)
) %>%
{. ->> my_streets_2}
my_streets_2
# Simple feature collection with 31 features and 1 field
# Geometry type: GEOMETRY
# Dimension: XY
# Bounding box: xmin: 149.1201 ymin: -35.28725 xmax: 149.1437 ymax: -35.26933
# Geodetic CRS: WGS 84
# # A tibble: 31 x 2
# name geometry
# * <chr> <GEOMETRY [°]>
# 1 Ainslie Avenue MULTILINESTRING ((149.1355 -35.27924, 149.1356 -35.2792, 149.1357 -35.27915, 149.1358 -35.27909, .~
# 2 Akuna Street LINESTRING (149.1361 -35.28036, 149.136 -35.2804, 149.1356 -35.28057, 149.1353 -35.28068, 149.134.~
# 3 Allambee Street LINESTRING (149.1378 -35.27985, 149.1379 -35.27967, 149.138 -35.27962, 149.1391 -35.27918, 149.13.~
# 4 Allara Street LINESTRING (149.131 -35.28574, 149.131 -35.28568, 149.1311 -35.28563, 149.1312 -35.2855, 149.1316.~
# 5 Amaroo Street MULTILINESTRING ((149.142 -35.28725, 149.1419 -35.28721, 149.1417 -35.2871, 149.1415 -35.28698, 1.~
# 6 Batman Street MULTILINESTRING ((149.1366 -35.27775, 149.1367 -35.2777, 149.138 -35.27721, 149.1381 -35.27717, 1.~
# 7 Binara Street LINESTRING (149.1338 -35.28256, 149.1339 -35.2825, 149.134 -35.28245, 149.1341 -35.28239, 149.134.~
# 8 Boolee Street MULTILINESTRING ((149.1368 -35.2813, 149.1369 -35.28125, 149.1376 -35.281, 149.1381 -35.28079, 14.~
# 9 Booroondara Street LINESTRING (149.1427 -35.28638, 149.1427 -35.28634, 149.1402 -35.28487, 149.14 -35.28479, 149.139.~
# 10 Bunda Street LINESTRING (149.1348 -35.28205, 149.1349 -35.28193, 149.1348 -35.28181, 149.1347 -35.2815, 149.13.~
# # ... with 21 more rows
再次绘制以查看我们正在处理的内容:
tm_shape(my_streets_2)+
tm_lines(lwd = 2, col = 'red')
现在,我们使用 map()
遍历集合中的每条街道,并使用 st_intersection()
.
找到它与其他街道(除了它本身)相交的位置
# now, find which ones intersect
1:nrow(my_streets_2) %>%
map(function(i){
my_streets_2 %>%
filter(
row_number() == i
) %>%
st_intersection(
my_streets_2 %>%
filter(
row_number() != i
)
)
}) %>%
bind_rows %>%
{. ->> my_intersections}
my_intersections
# Simple feature collection with 50 features and 2 fields
# Geometry type: GEOMETRY
# Dimension: XY
# Bounding box: xmin: 149.1261 ymin: -35.28565 xmax: 149.1411 ymax: -35.27228
# Geodetic CRS: WGS 84
# # A tibble: 50 x 3
# name name.1 geometry
# * <chr> <chr> <GEOMETRY [°]>
# 1 Ainslie Avenue Currong Street North MULTIPOINT ((149.1371 -35.27858), (149.1372 -35.2789))
# 2 Ainslie Avenue Doonkuna Street MULTIPOINT ((149.1386 -35.27798), (149.1385 -35.27801), (149.1387 -35.27832), (1~
# 3 Ainslie Avenue Elimatta Street MULTIPOINT ((149.1403 -35.2777), (149.1402 -35.27773), (149.14 -35.27742), (149.~
# 4 Ainslie Avenue Kogarah Lane POINT (149.1365 -35.27917)
# 5 Allambee Street Doonkuna Street POINT (149.1392 -35.27914)
# 6 Amaroo Street Euree Street POINT (149.1393 -35.28565)
# 7 Batman Street Currong Street North POINT (149.1366 -35.27775)
# 8 Batman Street Doonkuna Street POINT (149.1381 -35.27717)
# 9 Batman Street Elimatta Street POINT (149.1396 -35.27657)
# 10 Batman Street Gooreen Street POINT (149.1411 -35.27599)
# # ... with 40 more rows
现在,一些街道与其他街道多次相交(例如两条双车道的交叉),这些街道表示为 MULTIPOINT
几何形状,而其他街道则表示为单个 POINT
几何形状。
为了提取坐标(稍后),我使用 st_cast()
将所有这些分成 POINT
几何图形。请注意,警告消息表明这仅保留每个组的第一个几何图形。为了练习,我将继续,但如果您想要 每个 交叉点,您可能需要修改此命令。
我也只保留独特的街道组合,因为目前出现的重复交叉路口交替出现 name
和 name.1
值。这将特征数量减半。或者,您可以使用 distinct(geometry)
,但这会导致少 4 个特征(我假设来自同一条街道的替代名称,或者可能是具有相同坐标的 3 路或 4 路交叉路口)。
my_intersections %>%
st_cast('POINT') %>%
rowwise %>%
mutate(
streets = list(sort(c(name, name.1)))
) %>%
distinct(streets, .keep_all = TRUE) %>%
select(-streets) %>%
{. ->> my_int_2}
my_int_2
# Simple feature collection with 25 features and 2 fields
# Geometry type: POINT
# Dimension: XY
# Bounding box: xmin: 149.1261 ymin: -35.28565 xmax: 149.1411 ymax: -35.27228
# Geodetic CRS: WGS 84
# # A tibble: 25 x 3
# # Rowwise:
# name name.1 geometry
# <chr> <chr> <POINT [°]>
# 1 Ainslie Avenue Currong Street North (149.1371 -35.27858)
# 2 Ainslie Avenue Doonkuna Street (149.1386 -35.27798)
# 3 Ainslie Avenue Elimatta Street (149.1403 -35.2777)
# 4 Ainslie Avenue Kogarah Lane (149.1365 -35.27917)
# 5 Allambee Street Doonkuna Street (149.1392 -35.27914)
# 6 Amaroo Street Euree Street (149.1393 -35.28565)
# 7 Batman Street Currong Street North (149.1366 -35.27775)
# 8 Batman Street Doonkuna Street (149.1381 -35.27717)
# 9 Batman Street Elimatta Street (149.1396 -35.27657)
# 10 Batman Street Gooreen Street (149.1411 -35.27599)
# # ... with 15 more rows
现在,根据您的要求,我们将 geometry
列替换为 X
和 Y
列(尽管如果您打算对这些数据进行更多的空间操作,那将是明智的做法是保留 geometry
)。这涉及使用 st_coordinates()
提取坐标并使用 bind_cols()
.
将其添加到 sf
集合
my_int_2 %>%
bind_cols(my_int_2 %>% st_coordinates %>% as_tibble) %>%
st_drop_geometry %>%
{. ->> my_int_3}
my_int_3
# # A tibble: 25 x 4
# name name.1 X Y
# * <chr> <chr> <dbl> <dbl>
# 1 Ainslie Avenue Currong Street North 149. -35.3
# 2 Ainslie Avenue Doonkuna Street 149. -35.3
# 3 Ainslie Avenue Elimatta Street 149. -35.3
# 4 Ainslie Avenue Kogarah Lane 149. -35.3
# 5 Allambee Street Doonkuna Street 149. -35.3
# 6 Amaroo Street Euree Street 149. -35.3
# 7 Batman Street Currong Street North 149. -35.3
# 8 Batman Street Doonkuna Street 149. -35.3
# 9 Batman Street Elimatta Street 149. -35.3
# 10 Batman Street Gooreen Street 149. -35.3
# # ... with 15 more rows
我们当然会做一个情节来了解我们的进展情况。
tm_shape(my_streets_2)+
tm_lines(lwd = 2, col = 'red')+
tm_shape(my_int_2)+
tm_dots()
看起来有几个未标记的交叉路口,这些可能是:
MULTIPOINT
中两条街道的交叉点,其中仅保留第一个 POINT
- 同名街道
- 实际上不相交的几何图形(如果必须的话,您可以使用
st_buffer()
和一个小距离来增加交叉点的数量)
希望这能让您走上正确的方向,如果您还没有自己的数据,请查看 osmdata
包。
我有一个看起来像这样的数据集
ID|Cross Street 1|Cross Street 2
1 Beaver St Hanover St
2 Pearl St Wall St
我想得到 2263 个 x 和 y 值
ID|Cross Street 1|Cross Street 2 x |y
1 Beaver St Hanover St 981740.53187633 196247.34676349
2 Pearl St Wall St 982049.05259918 196320.89988222
注意下面的纬度和经度是为了确认位置。
lat |lon
40.705330 -74.009051
40.7055319680708 -74.00793826989502
这是一个使用 osmdata
软件包 (OpenStreetMap) 下载的澳大利亚堪培拉市一小部分数据的示例。
首先,加载库并制作我们感兴趣的区域的边界框:
library(sf)
library(tidyverse)
library(osmdata)
library(tmap)
tmap_mode('view')
# sample data with our focal area
tribble(
~point, ~lat, ~lon,
1, -35.29, 149.12,
2, -35.29, 149.14,
3, -35.27, 149.14,
4, -35.27, 149.12,
) %>%
st_as_sf(
coords = c('lon', 'lat'),
crs = 4326
) %>%
{. ->> my_points}
my_points
# Simple feature collection with 4 features and 1 field
# Geometry type: POINT
# Dimension: XY
# Bounding box: xmin: 149.12 ymin: -35.29 xmax: 149.14 ymax: -35.27
# Geodetic CRS: WGS 84
# # A tibble: 4 x 2
# point geometry
# * <dbl> <POINT [°]>
# 1 1 (149.12 -35.29)
# 2 2 (149.14 -35.29)
# 3 3 (149.14 -35.27)
# 4 4 (149.12 -35.27)
并绘制它以查看面积:
tm_shape(my_points)+
tm_dots()
然后,我们导入OSM数据。我不是这里的专家,我只是在关注 here.
中的示例# import OSM data
st_bbox(my_points) %>%
opq %>%
add_osm_feature(key = 'highway') %>%
osmdata_sf %>%
`[[`('osm_lines') %>%
{. ->> my_streets}
my_streets
# Simple feature collection with 2143 features and 87 fields
# Geometry type: LINESTRING
# Dimension: XY
# Bounding box: xmin: 149.1153 ymin: -35.29224 xmax: 149.1447 ymax: -35.26598
# Geodetic CRS: WGS 84
# First 10 features:
# osm_id name access amenity area bicycle bridge bridge_number building bus bus.lanes busway
# 4018757 4018757 Coranderrk Street <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA>
# 4189987 4189987 London Circuit <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA>
# 4189988 4189988 London Circuit <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA>
# 4204003 4204003 Parkes Way <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA>
# 4204005 4204005 Parkes Way <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA>
# 4204006 4204006 Parkes Way <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA>
# 4368141 4368141 Parkes Way <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA>
# 4414782 4414782 Watson Street <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA>
# 4414873 4414873 Gould Street <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA>
# 4574783 4574783 Macleay Street <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA>
为了示例,我只保留 residential
条街道(highway
列),然后使用 group_by()
和 summarise()
连接所有段一条同名的街道,然后删除没有名称的街道。
# keep only residential streets
my_streets %>%
filter(
highway == 'residential'
) %>%
select(name) %>%
group_by(name) %>%
summarise %>%
filter(
!is.na(name)
) %>%
{. ->> my_streets_2}
my_streets_2
# Simple feature collection with 31 features and 1 field
# Geometry type: GEOMETRY
# Dimension: XY
# Bounding box: xmin: 149.1201 ymin: -35.28725 xmax: 149.1437 ymax: -35.26933
# Geodetic CRS: WGS 84
# # A tibble: 31 x 2
# name geometry
# * <chr> <GEOMETRY [°]>
# 1 Ainslie Avenue MULTILINESTRING ((149.1355 -35.27924, 149.1356 -35.2792, 149.1357 -35.27915, 149.1358 -35.27909, .~
# 2 Akuna Street LINESTRING (149.1361 -35.28036, 149.136 -35.2804, 149.1356 -35.28057, 149.1353 -35.28068, 149.134.~
# 3 Allambee Street LINESTRING (149.1378 -35.27985, 149.1379 -35.27967, 149.138 -35.27962, 149.1391 -35.27918, 149.13.~
# 4 Allara Street LINESTRING (149.131 -35.28574, 149.131 -35.28568, 149.1311 -35.28563, 149.1312 -35.2855, 149.1316.~
# 5 Amaroo Street MULTILINESTRING ((149.142 -35.28725, 149.1419 -35.28721, 149.1417 -35.2871, 149.1415 -35.28698, 1.~
# 6 Batman Street MULTILINESTRING ((149.1366 -35.27775, 149.1367 -35.2777, 149.138 -35.27721, 149.1381 -35.27717, 1.~
# 7 Binara Street LINESTRING (149.1338 -35.28256, 149.1339 -35.2825, 149.134 -35.28245, 149.1341 -35.28239, 149.134.~
# 8 Boolee Street MULTILINESTRING ((149.1368 -35.2813, 149.1369 -35.28125, 149.1376 -35.281, 149.1381 -35.28079, 14.~
# 9 Booroondara Street LINESTRING (149.1427 -35.28638, 149.1427 -35.28634, 149.1402 -35.28487, 149.14 -35.28479, 149.139.~
# 10 Bunda Street LINESTRING (149.1348 -35.28205, 149.1349 -35.28193, 149.1348 -35.28181, 149.1347 -35.2815, 149.13.~
# # ... with 21 more rows
再次绘制以查看我们正在处理的内容:
tm_shape(my_streets_2)+
tm_lines(lwd = 2, col = 'red')
现在,我们使用 map()
遍历集合中的每条街道,并使用 st_intersection()
.
# now, find which ones intersect
1:nrow(my_streets_2) %>%
map(function(i){
my_streets_2 %>%
filter(
row_number() == i
) %>%
st_intersection(
my_streets_2 %>%
filter(
row_number() != i
)
)
}) %>%
bind_rows %>%
{. ->> my_intersections}
my_intersections
# Simple feature collection with 50 features and 2 fields
# Geometry type: GEOMETRY
# Dimension: XY
# Bounding box: xmin: 149.1261 ymin: -35.28565 xmax: 149.1411 ymax: -35.27228
# Geodetic CRS: WGS 84
# # A tibble: 50 x 3
# name name.1 geometry
# * <chr> <chr> <GEOMETRY [°]>
# 1 Ainslie Avenue Currong Street North MULTIPOINT ((149.1371 -35.27858), (149.1372 -35.2789))
# 2 Ainslie Avenue Doonkuna Street MULTIPOINT ((149.1386 -35.27798), (149.1385 -35.27801), (149.1387 -35.27832), (1~
# 3 Ainslie Avenue Elimatta Street MULTIPOINT ((149.1403 -35.2777), (149.1402 -35.27773), (149.14 -35.27742), (149.~
# 4 Ainslie Avenue Kogarah Lane POINT (149.1365 -35.27917)
# 5 Allambee Street Doonkuna Street POINT (149.1392 -35.27914)
# 6 Amaroo Street Euree Street POINT (149.1393 -35.28565)
# 7 Batman Street Currong Street North POINT (149.1366 -35.27775)
# 8 Batman Street Doonkuna Street POINT (149.1381 -35.27717)
# 9 Batman Street Elimatta Street POINT (149.1396 -35.27657)
# 10 Batman Street Gooreen Street POINT (149.1411 -35.27599)
# # ... with 40 more rows
现在,一些街道与其他街道多次相交(例如两条双车道的交叉),这些街道表示为 MULTIPOINT
几何形状,而其他街道则表示为单个 POINT
几何形状。
为了提取坐标(稍后),我使用 st_cast()
将所有这些分成 POINT
几何图形。请注意,警告消息表明这仅保留每个组的第一个几何图形。为了练习,我将继续,但如果您想要 每个 交叉点,您可能需要修改此命令。
我也只保留独特的街道组合,因为目前出现的重复交叉路口交替出现 name
和 name.1
值。这将特征数量减半。或者,您可以使用 distinct(geometry)
,但这会导致少 4 个特征(我假设来自同一条街道的替代名称,或者可能是具有相同坐标的 3 路或 4 路交叉路口)。
my_intersections %>%
st_cast('POINT') %>%
rowwise %>%
mutate(
streets = list(sort(c(name, name.1)))
) %>%
distinct(streets, .keep_all = TRUE) %>%
select(-streets) %>%
{. ->> my_int_2}
my_int_2
# Simple feature collection with 25 features and 2 fields
# Geometry type: POINT
# Dimension: XY
# Bounding box: xmin: 149.1261 ymin: -35.28565 xmax: 149.1411 ymax: -35.27228
# Geodetic CRS: WGS 84
# # A tibble: 25 x 3
# # Rowwise:
# name name.1 geometry
# <chr> <chr> <POINT [°]>
# 1 Ainslie Avenue Currong Street North (149.1371 -35.27858)
# 2 Ainslie Avenue Doonkuna Street (149.1386 -35.27798)
# 3 Ainslie Avenue Elimatta Street (149.1403 -35.2777)
# 4 Ainslie Avenue Kogarah Lane (149.1365 -35.27917)
# 5 Allambee Street Doonkuna Street (149.1392 -35.27914)
# 6 Amaroo Street Euree Street (149.1393 -35.28565)
# 7 Batman Street Currong Street North (149.1366 -35.27775)
# 8 Batman Street Doonkuna Street (149.1381 -35.27717)
# 9 Batman Street Elimatta Street (149.1396 -35.27657)
# 10 Batman Street Gooreen Street (149.1411 -35.27599)
# # ... with 15 more rows
现在,根据您的要求,我们将 geometry
列替换为 X
和 Y
列(尽管如果您打算对这些数据进行更多的空间操作,那将是明智的做法是保留 geometry
)。这涉及使用 st_coordinates()
提取坐标并使用 bind_cols()
.
sf
集合
my_int_2 %>%
bind_cols(my_int_2 %>% st_coordinates %>% as_tibble) %>%
st_drop_geometry %>%
{. ->> my_int_3}
my_int_3
# # A tibble: 25 x 4
# name name.1 X Y
# * <chr> <chr> <dbl> <dbl>
# 1 Ainslie Avenue Currong Street North 149. -35.3
# 2 Ainslie Avenue Doonkuna Street 149. -35.3
# 3 Ainslie Avenue Elimatta Street 149. -35.3
# 4 Ainslie Avenue Kogarah Lane 149. -35.3
# 5 Allambee Street Doonkuna Street 149. -35.3
# 6 Amaroo Street Euree Street 149. -35.3
# 7 Batman Street Currong Street North 149. -35.3
# 8 Batman Street Doonkuna Street 149. -35.3
# 9 Batman Street Elimatta Street 149. -35.3
# 10 Batman Street Gooreen Street 149. -35.3
# # ... with 15 more rows
我们当然会做一个情节来了解我们的进展情况。
tm_shape(my_streets_2)+
tm_lines(lwd = 2, col = 'red')+
tm_shape(my_int_2)+
tm_dots()
看起来有几个未标记的交叉路口,这些可能是:
MULTIPOINT
中两条街道的交叉点,其中仅保留第一个POINT
- 同名街道
- 实际上不相交的几何图形(如果必须的话,您可以使用
st_buffer()
和一个小距离来增加交叉点的数量)
希望这能让您走上正确的方向,如果您还没有自己的数据,请查看 osmdata
包。