使用 R 将 SpatialPoints 与 SpatialLines 匹配/连接

Matching / Joining SpatialPoints with SpatialLines using R

问题

我有 set of shapefiles 开罗的不同巴士路线(每条路线有 2 趟)。我想使用 R:

检查他们是否经过某些站点
Stops_Data <- readOGR("Whosebug Data/","Stops_Data")
Trips_Data <- readOGR("Whosebug Data/","Trips_Data")

plot(Trips_Data)
points(Stops_Data,  col = "red")

我尝试使用 over() 来匹配它们。首先,我确保投影是相同的。

proj4string(Trips_Data) <- proj4string(Stops_Data)
over(Stops_Data, Trips_Data)

这只会给出 NA。


期望输出

目标是获得一个 table 将每个行程与其经过的站点相匹配,并使用以下形式创建一个 table:

      trip_id      stop_id stop_name stop_seque
45 CTA_1021_O PaM_1031_EMB    Embaba          1
46 CTA_1021_O PaM_1039_KKT   Kit Kat          2
47 CTA_1021_O PaM_1009_AGZ    Agouza          3
48 CTA_1021_O PaM_2004_ISA     Isaaf          4
49 CTA_1021_O PaM_1059_RAM    Ramses          5
50 CTA_1021_O PaM_1035_GMR    Ghamra          6

可能的解决方案?

我什至不相信点与线的匹配是可能的,因为坐标几乎总是有轻微的勾号。线条不像多边形那样有回旋余地 space。

因此,我在想有没有办法将 SpatialLinesDataFrame 转换为多边形?考虑在每条公交路线周围制作一个宽度为 100 米的多边形,从而自然地包含所有停止。

如果“100m”值是 near 的定义,那么我们可以使用 rgeos::gDistance() 但我们需要将空间对象重新投影为单位为米与度的投影:

library(rgdal)
library(rgeos)
library(sp)
library(dplyr)
library(tidyr)

route_stops <- readOGR("Stops_Data.shp", "Stops_Data", stringsAsFactors=FALSE)
c <- readOGR("Trips_Data.shp", "Trips_Data", stringsAsFactors=FALSE)

# need meters vs degrees
prj <- "+proj=aea +lat_1=29.9 +lat_2=30.2 +lon_0=30.9"

route_stops <- SpatialPointsDataFrame(spTransform(route_stops, CRS(prj)), route_stops@data)
trips <- SpatialLinesDataFrame(spTransform(trips, CRS(prj)), trips@data)

# get a bird's eye view
plot(trips)
plot(route_stops, add=TRUE)

现在,我们来看看哪些点在 100 米以内:

near <- suppressWarnings(gDistance(route_stops, trips, byid=TRUE))
near <- apply(near, 1, `<=`, 100)

# make it easier to work with later on and include other data elements
colnames(near) <- trips$route_id
near <- data.frame(near)
near$stop_id <- route_stops$stop_id
near$stop_name <- route_stops$stop_name

# see what we found
plot(trips)
plot(route_stops[apply(near[,1:4], 1, any),], add=TRUE)

现在,我们只需要按路线查看点,所以我们进行一些数据整理,然后添加序列 #(这可能是错误的,因为我不知道这些点是否按顺序排列,所以您可以撸起袖子来干这个了):

gather(near, route, stop, -stop_id, -stop_name) %>% 
  group_by(route) %>% 
  filter(stop) %>% 
  mutate(stop_seq=1:n()) %>% 
  select(trip_id=route, stop_id, stop_name, stop_seq, -stop) %>% 
  ungroup() -> desired_output

print(desired_output, n=44)

## Source: local data frame [44 x 4]
## 
##       trip_id      stop_id                    stop_name stop_seq
##         <chr>        <chr>                        <chr>    <int>
## 1    CTA_1021 PaM_1059_RAM                       Ramses        1
## 2    CTA_1021 PaM_1035_GMR                       Ghamra        2
## 3    CTA_1021 PaM_1064_ZAM                      Zamalek        3
## 4    CTA_1021 PaM_1054_MUN               Mustafa Nahhas        4
## 5    CTA_1021 PaM_1064_SSS               Salah Salem St        5
## 6    CTA_1021 PaM_1057_RAB             Rabaa El-Adaweya        6
## 7    CTA_1021 PaM_1030_TAY                    Eltayaran        7
## 8    CTA_1021 PaM_1004_EIT                 8th District        8
## 9    CTA_1021 PaM_1048_MNH                       Manhal        9
## 10   CTA_1021 PaM_1031_EMB                       Embaba       10
## 11   CTA_1021 PaM_1039_KKT                      Kit Kat       11
## 12   CTA_1021 PaM_1073_TAB                        Tabba       12
## 13   CTA_1021 PaM_1043_MAA                       Ma3rad       13
## 14   CTA_1021 PaM_1072_TAS                 Ta'men Sehhy       14
## 15   CTA_1021 PaM_2004_ISA                        Isaaf       15
## 16   CTA_1020 PaM_1042_LEB               Lebanon Square        1
## 17   CTA_1020 PaM_1059_RAM                       Ramses        2
## 18   CTA_1020 PaM_1007_ART Abdel el Monem Riad / Tahrir        3
## 19   CTA_1020 PaM_1070_SFX                Sphinx Square        4
## 20   CTA_1020 PaM_1009_AGZ                       Agouza        5
## 21   CTA_1020 PaM_2004_ISA                        Isaaf        6
## 22   CTA_1020 PaM_2005_AHM                  Ahmed Helmy        7
## 23 CTA_1021.1 PaM_1059_RAM                       Ramses        1
## 24 CTA_1021.1 PaM_1035_GMR                       Ghamra        2
## 25 CTA_1021.1 PaM_1064_ZAM                      Zamalek        3
## 26 CTA_1021.1 PaM_1054_MUN               Mustafa Nahhas        4
## 27 CTA_1021.1 PaM_1064_SSS               Salah Salem St        5
## 28 CTA_1021.1 PaM_1057_RAB             Rabaa El-Adaweya        6
## 29 CTA_1021.1 PaM_1030_TAY                    Eltayaran        7
## 30 CTA_1021.1 PaM_1004_EIT                 8th District        8
## 31 CTA_1021.1 PaM_1048_MNH                       Manhal        9
## 32 CTA_1021.1 PaM_1031_EMB                       Embaba       10
## 33 CTA_1021.1 PaM_1039_KKT                      Kit Kat       11
## 34 CTA_1021.1 PaM_1073_TAB                        Tabba       12
## 35 CTA_1021.1 PaM_1043_MAA                       Ma3rad       13
## 36 CTA_1021.1 PaM_1072_TAS                 Ta'men Sehhy       14
## 37 CTA_1021.1 PaM_2004_ISA                        Isaaf       15
## 38 CTA_1020.1 PaM_1042_LEB               Lebanon Square        1
## 39 CTA_1020.1 PaM_1059_RAM                       Ramses        2
## 40 CTA_1020.1 PaM_1007_ART Abdel el Monem Riad / Tahrir        3
## 41 CTA_1020.1 PaM_1070_SFX                Sphinx Square        4
## 42 CTA_1020.1 PaM_1009_AGZ                       Agouza        5
## 43 CTA_1020.1 PaM_2004_ISA                        Isaaf        6
## 44 CTA_1020.1 PaM_2005_AHM                  Ahmed Helmy        7

使用缓冲区方法,您可以使用 rgeos 来缓冲行:

library(sp)
library(rgdal)
library(rgeos)

Stops_Data <- readOGR("Whosebug Data","Stops_Data")
Trips_Data <- readOGR("Whosebug Data","Trips_Data")
str(Trips_Data@data)
str(Stops_Data@data)`


plot(Trips_Data)
points(Stops_Data,  col = "red")

# Convert to UTM or other equal area projection to use buffer in metres
utm36N <- "+proj=utm +zone=36 +north +datum=WGS84 +units=m +no_defs +ellps=WGS84 +towgs84=0,0,0"

Stops_Data.utm <- spTransform(Stops_Data, CRS(utm36N))
Trips_Data.utm <- spTransform(Trips_Data, CRS(utm36N))

Trips_Buff <- gBuffer(Trips_Data.utm, byid = T, width = 100) # use different width here

ov <- over(Stops_Data.utm, Trips_Buff)

# Get the indices of the stops coinciding
matches <- which(!is.na(ov$route_id))

# bind route ids and matches from stops data
results <- cbind(route_id = ov[matches, 1], Stops_Data.utm@data[matches,1:2])
head(results)
# route_id      stop_id      stop_name
# 2  CTA_1020 PaM_1042_LEB Lebanon Square
# 4  CTA_1021 PaM_1059_RAM         Ramses
# 10 CTA_1021 PaM_1035_GMR         Ghamra
# 12 CTA_1021 PaM_1064_ZAM        Zamalek
# 16 CTA_1021 PaM_1054_MUN Mustafa Nahhas
# 31 CTA_1021 PaM_1064_SSS Salah Salem St

## plot routes by colour (use factor level)
plot(Trips_Buff, col = Trips_Data.utm$route_id)
## plot overlapping in blue
plot(Stops_Data.utm, add = T, pch = 20, col = ifelse(is.na(ov$route_id), "black", "blue"))

站点数据中没有 stops_seque 列。如果您指的是每次巴士行程沿途的停靠顺序,那么您需要进行路线分析,并为每次巴士行程定义方向。