有没有办法根据 R 中的两个街道名称对交叉路口进行地理编码?

Is there a way to geocode intersection based on two street names in R?

我有一个看起来像这样的数据集

ID|Cross Street 1|Cross Street 2
 1 Beaver St      Hanover St
 2 Pearl St       Wall St

我想得到 2263 个 x 和 y 值

ID|Cross Street 1|Cross Street 2 x              |y
 1 Beaver St      Hanover St     981740.53187633 196247.34676349
 2 Pearl St       Wall St        982049.05259918 196320.89988222

注意下面的纬度和经度是为了确认位置。

lat              |lon
40.705330         -74.009051
40.7055319680708  -74.00793826989502

这是一个使用 osmdata 软件包 (OpenStreetMap) 下载的澳大利亚堪培拉市一小部分数据的示例。

首先,加载库并制作我们感兴趣的区域的边界框:

library(sf)
library(tidyverse)
library(osmdata)
library(tmap)
tmap_mode('view')

# sample data with our focal area
tribble(
  ~point, ~lat, ~lon, 
  1, -35.29, 149.12, 
  2, -35.29, 149.14, 
  3, -35.27, 149.14, 
  4, -35.27, 149.12, 
) %>% 
  st_as_sf(
    coords = c('lon', 'lat'), 
    crs = 4326
  ) %>% 
  {. ->> my_points}

my_points

# Simple feature collection with 4 features and 1 field
# Geometry type: POINT
# Dimension:     XY
# Bounding box:  xmin: 149.12 ymin: -35.29 xmax: 149.14 ymax: -35.27
# Geodetic CRS:  WGS 84
# # A tibble: 4 x 2
# point        geometry
# * <dbl>     <POINT [°]>
# 1     1 (149.12 -35.29)
# 2     2 (149.14 -35.29)
# 3     3 (149.14 -35.27)
# 4     4 (149.12 -35.27)

并绘制它以查看面积:

tm_shape(my_points)+
  tm_dots()

然后,我们导入OSM数据。我不是这里的专家,我只是在关注 here.

中的示例
# import OSM data
st_bbox(my_points) %>% 
  opq %>% 
  add_osm_feature(key = 'highway') %>% 
  osmdata_sf %>% 
  `[[`('osm_lines') %>% 
  {. ->> my_streets}

my_streets

# Simple feature collection with 2143 features and 87 fields
# Geometry type: LINESTRING
# Dimension:     XY
# Bounding box:  xmin: 149.1153 ymin: -35.29224 xmax: 149.1447 ymax: -35.26598
# Geodetic CRS:  WGS 84
# First 10 features:
#          osm_id              name access amenity area bicycle bridge bridge_number building  bus bus.lanes busway
# 4018757 4018757 Coranderrk Street   <NA>    <NA> <NA>    <NA>   <NA>          <NA>     <NA> <NA>      <NA>   <NA>
# 4189987 4189987    London Circuit   <NA>    <NA> <NA>    <NA>   <NA>          <NA>     <NA> <NA>      <NA>   <NA>
# 4189988 4189988    London Circuit   <NA>    <NA> <NA>    <NA>   <NA>          <NA>     <NA> <NA>      <NA>   <NA>
# 4204003 4204003        Parkes Way   <NA>    <NA> <NA>    <NA>   <NA>          <NA>     <NA> <NA>      <NA>   <NA>
# 4204005 4204005        Parkes Way   <NA>    <NA> <NA>    <NA>   <NA>          <NA>     <NA> <NA>      <NA>   <NA>
# 4204006 4204006        Parkes Way   <NA>    <NA> <NA>    <NA>   <NA>          <NA>     <NA> <NA>      <NA>   <NA>
# 4368141 4368141        Parkes Way   <NA>    <NA> <NA>    <NA>   <NA>          <NA>     <NA> <NA>      <NA>   <NA>
# 4414782 4414782     Watson Street   <NA>    <NA> <NA>    <NA>   <NA>          <NA>     <NA> <NA>      <NA>   <NA>
# 4414873 4414873      Gould Street   <NA>    <NA> <NA>    <NA>   <NA>          <NA>     <NA> <NA>      <NA>   <NA>
# 4574783 4574783    Macleay Street   <NA>    <NA> <NA>    <NA>   <NA>          <NA>     <NA> <NA>      <NA>   <NA>

为了示例,我只保留 residential 条街道(highway 列),然后使用 group_by()summarise() 连接所有段一条同名的街道,然后删除没有名称的街道。

# keep only residential streets
my_streets %>% 
  filter(
    highway == 'residential'
  ) %>% 
  select(name) %>% 
  group_by(name) %>% 
  summarise %>%
  filter(
    !is.na(name)
  ) %>% 
  {. ->> my_streets_2}

my_streets_2

# Simple feature collection with 31 features and 1 field
# Geometry type: GEOMETRY
# Dimension:     XY
# Bounding box:  xmin: 149.1201 ymin: -35.28725 xmax: 149.1437 ymax: -35.26933
# Geodetic CRS:  WGS 84
# # A tibble: 31 x 2
# name                                                                                                          geometry
#   * <chr>                                                                                                   <GEOMETRY [°]>
#   1 Ainslie Avenue     MULTILINESTRING ((149.1355 -35.27924, 149.1356 -35.2792, 149.1357 -35.27915, 149.1358 -35.27909, .~
#   2 Akuna Street       LINESTRING (149.1361 -35.28036, 149.136 -35.2804, 149.1356 -35.28057, 149.1353 -35.28068, 149.134.~
#   3 Allambee Street    LINESTRING (149.1378 -35.27985, 149.1379 -35.27967, 149.138 -35.27962, 149.1391 -35.27918, 149.13.~
#   4 Allara Street      LINESTRING (149.131 -35.28574, 149.131 -35.28568, 149.1311 -35.28563, 149.1312 -35.2855, 149.1316.~
#   5 Amaroo Street      MULTILINESTRING ((149.142 -35.28725, 149.1419 -35.28721, 149.1417 -35.2871, 149.1415 -35.28698, 1.~
#   6 Batman Street      MULTILINESTRING ((149.1366 -35.27775, 149.1367 -35.2777, 149.138 -35.27721, 149.1381 -35.27717, 1.~
#   7 Binara Street      LINESTRING (149.1338 -35.28256, 149.1339 -35.2825, 149.134 -35.28245, 149.1341 -35.28239, 149.134.~
#   8 Boolee Street      MULTILINESTRING ((149.1368 -35.2813, 149.1369 -35.28125, 149.1376 -35.281, 149.1381 -35.28079, 14.~
#   9 Booroondara Street LINESTRING (149.1427 -35.28638, 149.1427 -35.28634, 149.1402 -35.28487, 149.14 -35.28479, 149.139.~
#   10 Bunda Street       LINESTRING (149.1348 -35.28205, 149.1349 -35.28193, 149.1348 -35.28181, 149.1347 -35.2815, 149.13.~
#   # ... with 21 more rows

再次绘制以查看我们正在处理的内容:

tm_shape(my_streets_2)+
  tm_lines(lwd = 2, col = 'red')

现在,我们使用 map() 遍历集合中的每条街道,并使用 st_intersection().

找到它与其他街道(除了它本身)相交的位置
# now, find which ones intersect
1:nrow(my_streets_2) %>% 
  map(function(i){
    
    my_streets_2 %>% 
      filter(
        row_number() == i
      ) %>% 
      st_intersection(
        my_streets_2 %>% 
          filter(
            row_number() != i
          )
      )
      
    }) %>% 
  bind_rows %>% 
  {. ->> my_intersections}

my_intersections

# Simple feature collection with 50 features and 2 fields
# Geometry type: GEOMETRY
# Dimension:     XY
# Bounding box:  xmin: 149.1261 ymin: -35.28565 xmax: 149.1411 ymax: -35.27228
# Geodetic CRS:  WGS 84
# # A tibble: 50 x 3
# name            name.1                                                                                        geometry
# * <chr>           <chr>                                                                                   <GEOMETRY [°]>
# 1 Ainslie Avenue  Currong Street North                            MULTIPOINT ((149.1371 -35.27858), (149.1372 -35.2789))
# 2 Ainslie Avenue  Doonkuna Street      MULTIPOINT ((149.1386 -35.27798), (149.1385 -35.27801), (149.1387 -35.27832), (1~
# 3 Ainslie Avenue  Elimatta Street      MULTIPOINT ((149.1403 -35.2777), (149.1402 -35.27773), (149.14 -35.27742), (149.~
# 4 Ainslie Avenue  Kogarah Lane                                                                POINT (149.1365 -35.27917)
# 5 Allambee Street Doonkuna Street                                                             POINT (149.1392 -35.27914)
# 6 Amaroo Street   Euree Street                                                                POINT (149.1393 -35.28565)
# 7 Batman Street   Currong Street North                                                        POINT (149.1366 -35.27775)
# 8 Batman Street   Doonkuna Street                                                             POINT (149.1381 -35.27717)
# 9 Batman Street   Elimatta Street                                                             POINT (149.1396 -35.27657)
# 10 Batman Street   Gooreen Street                                                              POINT (149.1411 -35.27599)
# # ... with 40 more rows

现在,一些街道与其他街道多次相交(例如两条双车道的交叉),这些街道表示为 MULTIPOINT 几何形状,而其他街道则表示为单个 POINT 几何形状。

为了提取坐标(稍后),我使用 st_cast() 将所有这些分成 POINT 几何图形。请注意,警告消息表明这仅保留每个组的第一个几何图形。为了练习,我将继续,但如果您想要 每个 交叉点,您可能需要修改此命令。

我也只保留独特的街道组合,因为目前出现的重复交叉路口交替出现 namename.1 值。这将特征数量减半。或者,您可以使用 distinct(geometry),但这会导致少 4 个特征(我假设来自同一条街道的替代名称,或者可能是具有相同坐标的 3 路或 4 路交叉路口)。

my_intersections %>% 
  st_cast('POINT') %>%
  rowwise %>%
  mutate(
    streets = list(sort(c(name, name.1)))
  ) %>%
  distinct(streets, .keep_all = TRUE) %>%
  select(-streets) %>% 
  {. ->> my_int_2}

my_int_2

# Simple feature collection with 25 features and 2 fields
# Geometry type: POINT
# Dimension:     XY
# Bounding box:  xmin: 149.1261 ymin: -35.28565 xmax: 149.1411 ymax: -35.27228
# Geodetic CRS:  WGS 84
# # A tibble: 25 x 3
# # Rowwise: 
#   name            name.1                           geometry
#   <chr>           <chr>                         <POINT [°]>
# 1 Ainslie Avenue  Currong Street North (149.1371 -35.27858)
# 2 Ainslie Avenue  Doonkuna Street      (149.1386 -35.27798)
# 3 Ainslie Avenue  Elimatta Street       (149.1403 -35.2777)
# 4 Ainslie Avenue  Kogarah Lane         (149.1365 -35.27917)
# 5 Allambee Street Doonkuna Street      (149.1392 -35.27914)
# 6 Amaroo Street   Euree Street         (149.1393 -35.28565)
# 7 Batman Street   Currong Street North (149.1366 -35.27775)
# 8 Batman Street   Doonkuna Street      (149.1381 -35.27717)
# 9 Batman Street   Elimatta Street      (149.1396 -35.27657)
# 10 Batman Street   Gooreen Street       (149.1411 -35.27599)
# # ... with 15 more rows

现在,根据您的要求,我们将 geometry 列替换为 XY 列(尽管如果您打算对这些数据进行更多的空间操作,那将是明智的做法是保留 geometry)。这涉及使用 st_coordinates() 提取坐标并使用 bind_cols().

将其添加到 sf 集合
my_int_2 %>% 
  bind_cols(my_int_2 %>% st_coordinates %>% as_tibble) %>% 
  st_drop_geometry %>% 
  {. ->> my_int_3}

my_int_3

# # A tibble: 25 x 4
# name            name.1                   X     Y
# * <chr>           <chr>                <dbl> <dbl>
# 1 Ainslie Avenue  Currong Street North  149. -35.3
# 2 Ainslie Avenue  Doonkuna Street       149. -35.3
# 3 Ainslie Avenue  Elimatta Street       149. -35.3
# 4 Ainslie Avenue  Kogarah Lane          149. -35.3
# 5 Allambee Street Doonkuna Street       149. -35.3
# 6 Amaroo Street   Euree Street          149. -35.3
# 7 Batman Street   Currong Street North  149. -35.3
# 8 Batman Street   Doonkuna Street       149. -35.3
# 9 Batman Street   Elimatta Street       149. -35.3
# 10 Batman Street   Gooreen Street        149. -35.3
# # ... with 15 more rows

我们当然会做一个情节来了解我们的进展情况。

tm_shape(my_streets_2)+
  tm_lines(lwd = 2, col = 'red')+
  tm_shape(my_int_2)+
  tm_dots()

看起来有几个未标记的交叉路口,这些可能是:

  • MULTIPOINT 中两条街道的交叉点,其中仅保留第一个 POINT
  • 同名街道
  • 实际上不相交的几何图形(如果必须的话,您可以使用 st_buffer() 和一个小距离来增加交叉点的数量)

希望这能让您走上正确的方向,如果您还没有自己的数据,请查看 osmdata 包。