Assign an ID to each road segment (latitude, longitude) in R

I would appreciate your thoughts and suggestions on working with the following dataset:

 Start_Latitude    Start_Longitude   End_Latitude    End_Longitude       Date        Avg_Speed
     41.92446         -87.68654       41.93184        -87.67459     2020-06-11 6:00       40
     41.90367         -87.63233       41.91600        -87.61911     2020-06-11 6:00       35
     41.86468         -87.76746       41.82341        -87.69162     2020-06-11 6:00       54
     41.96075         -87.74756       41.76543        -87.67459     2020-06-11 6:00       45

The variables Start_Latitude, Start_Longitude and End_Latitude, End_Longitude represent road segments, and I have the average speed for each segment.

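I want to assign an ID to each road segment, which starts at one latitude/longitude pair and ends at another, so that I can compare its average speed with that of other segments.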

The data I want looks like this:

   St_Lat_Long             End_Lat_Long              Date         Avg_Speed     ID
41.92446,-87.68654       41.93184,-87.67459     2020-06-11 6:00       40         1
41.90367,-87.63233       41.91600,-87.61911     2020-06-11 6:00       35         2
41.86468,-87.76746       41.82341,-87.69162     2020-06-11 6:00       54         3
41.96075,-87.74756       41.76543,-87.67459     2020-06-11 6:00       45         4

How can I assign the ID in R? I have the following code, which assigns an ID to a single spatial point defined by Start_Latitude and Start_Longitude (2 coordinates):

df$ID <- cumsum(!duplicated(df[1:2]))

 Latitude    Longitude          Date          Avg_Speed    ID
41.92446     -87.68654    2020-06-11 6:00       40          1
41.90367     -87.63233    2020-06-11 6:00       35          2
41.86468     -87.76746    2020-06-11 6:00       54          3
41.96075     -87.74756    2020-06-11 6:00       45          4

Also, is it possible to plot all the road segments on a map using the 4 coordinates?

Here is a way to solve the first part of the post using base R:

Data

foo <- tibble::tribble(~Start_Latitude, ~Start_Longitude, ~End_Latitude,    ~End_Longitude, ~Date,           ~Avg_Speed,
                        41.92446,       -87.68654,         41.93184,        -87.67459,      '2020-06-11 6:00',       40,
                        41.90367,       -87.63233,         41.91600,        -87.61911,      '2020-06-11 6:00',       35,
                        41.86468,       -87.76746,         41.82341,        -87.69162,      '2020-06-11 6:00',       54,
                        41.96075,       -87.74756,         41.76543,        -87.67459,      '2020-06-11 6:00',       45)

Code

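# Combine each coordinate pair into a single "lat, long" string
foo$St_lat_long <- paste(foo$Start_Latitude, foo$Start_Longitude, sep = ", ")
foo$End_lat_long <- paste(foo$End_Latitude, foo$End_Longitude, sep = ", ")
# Keep the two new columns, followed by Date and Avg_Speed
foo2 <- foo[, c("St_lat_long", "End_lat_long", "Date", "Avg_Speed")]
# Number the rows sequentially, so every row gets its own ID
foo2$ID <- seq.int(nrow(foo2))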

Output

  St_lat_long           End_lat_long          Date            Avg_Speed    ID
  41.92446, -87.68654   41.93184, -87.67459   2020-06-11 6:00        40     1
  41.90367, -87.63233   41.916, -87.61911     2020-06-11 6:00        35     2
  41.86468, -87.76746   41.82341, -87.69162   2020-06-11 6:00        54     3
  41.96075, -87.74756   41.76543, -87.67459   2020-06-11 6:00        45     4
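Note that seq.int(nrow(foo2)) only numbers the rows, so a segment that shows up again (for example at a later timestamp) would get a new ID. If repeated segments should share an ID, a minimal base R sketch could look like the following. The data frame name seg and its fifth row are invented here for illustration; only the first four rows come from the question.

```r
# Hypothetical data: rows 1-4 are from the question, row 5 is invented
# to show a segment that repeats at a later time.
seg <- data.frame(
  Start_Latitude  = c(41.92446, 41.90367, 41.86468, 41.96075, 41.92446),
  Start_Longitude = c(-87.68654, -87.63233, -87.76746, -87.74756, -87.68654),
  End_Latitude    = c(41.93184, 41.91600, 41.82341, 41.76543, 41.93184),
  End_Longitude   = c(-87.67459, -87.61911, -87.69162, -87.67459, -87.67459),
  Date            = c(rep("2020-06-11 6:00", 4), "2020-06-11 7:00"),
  Avg_Speed       = c(40, 35, 54, 45, 42)
)

# Build one key per row from all four coordinates
key <- paste(seg$Start_Latitude, seg$Start_Longitude,
             seg$End_Latitude, seg$End_Longitude, sep = ", ")

# match() gives the position of each key among the unique keys, so
# identical segments share an ID, numbered in order of first appearance
seg$ID <- match(key, unique(key))
seg$ID
#> [1] 1 2 3 4 1
```

Because the key is built from all four coordinates, two rows share an ID only when both the start and the end point are identical.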

Mapping the data

You provided the following data in your post:

foo <- tibble::tribble(
  ~Latitude, ~Longitude, ~Date,             ~Avg_Speed, ~ID,
  41.92446,  -87.68654,  "2020-06-11 6:00", 40,         1,
  41.90367,  -87.63233,  "2020-06-11 6:00", 35,         2,
  41.86468,  -87.76746,  "2020-06-11 6:00", 54,         3,
  41.96075,  -87.74756,  "2020-06-11 6:00", 45,         4
)
foo
#> # A tibble: 4 x 5
#>   Latitude Longitude Date            Avg_Speed    ID
#>      <dbl>     <dbl> <chr>               <dbl> <dbl>
#> 1     41.9     -87.7 2020-06-11 6:00        40     1
#> 2     41.9     -87.6 2020-06-11 6:00        35     2
#> 3     41.9     -87.8 2020-06-11 6:00        54     3
#> 4     42.0     -87.7 2020-06-11 6:00        45     4

Created on 2020-06-21 by the reprex package (v0.3.0)

Creating a map with latitude and longitude

For the last part of the question, here is a way to map the data using the leaflet package:

library(leaflet)

# Draw one circle marker per row, with a popup showing its details
leaflet(foo) %>% 
  addTiles() %>% 
  addCircleMarkers(lat = ~Latitude, 
                   lng = ~Longitude, 
                   popup = paste("<b>Date:</b>", foo$Date, "<br>", 
                                 "<b>Average Speed:</b>", foo$Avg_Speed, "<br>", 
                                 "<b>ID:</b>", foo$ID, "<br>"))

Output

I published the interactive leaflet map on RPubs. Here is a link
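The markers above show single points only. To draw the segments themselves from all 4 coordinates, one possible approach is to add one polyline per row with leaflet's addPolylines. This is a sketch rather than a tested solution for the full dataset: the data frame name segs and the object name m are chosen here for illustration, and the ID column is assumed to have been assigned already.

```r
library(leaflet)

# Segment data from the question, with an ID column assumed to exist
segs <- data.frame(
  Start_Latitude  = c(41.92446, 41.90367, 41.86468, 41.96075),
  Start_Longitude = c(-87.68654, -87.63233, -87.76746, -87.74756),
  End_Latitude    = c(41.93184, 41.91600, 41.82341, 41.76543),
  End_Longitude   = c(-87.67459, -87.61911, -87.69162, -87.67459),
  Avg_Speed       = c(40, 35, 54, 45),
  ID              = 1:4
)

# Start with an empty map and the default tile layer
m <- addTiles(leaflet())

# Add each segment as its own two-point polyline (start -> end)
for (i in seq_len(nrow(segs))) {
  m <- addPolylines(
    m,
    lng   = c(segs$Start_Longitude[i], segs$End_Longitude[i]),
    lat   = c(segs$Start_Latitude[i],  segs$End_Latitude[i]),
    popup = paste0("<b>ID:</b> ", segs$ID[i], "<br>",
                   "<b>Average Speed:</b> ", segs$Avg_Speed[i])
  )
}

m  # print the map
```

Each line is a straight connection between the two end points, not the actual road geometry. A loop is easy to follow for a handful of rows, but for a large dataset it may be slow.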

One way, using dplyr and stringr:

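library(dplyr)
library(stringr)

df %>%
  # Combine each coordinate pair into one "lat, long" string
  mutate(St_Lat_Long = str_c(Start_Latitude, ", ", Start_Longitude),
         End_Lat_Long = str_c(End_Latitude, ", ", End_Longitude)) %>%
  select(St_Lat_Long, End_Lat_Long, Date, Avg_Speed) %>%
  # Keep one row per distinct start/end pair, then number the rows
  distinct(across(ends_with("Lat_Long")), .keep_all=TRUE) %>%
  mutate(ID = row_number())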

This yields

# A tibble: 4 x 5
  St_Lat_Long         End_Lat_Long        Date            Avg_Speed    ID
  <chr>               <chr>               <chr>               <dbl> <int>
1 41.92446, -87.68654 41.93184, -87.67459 2020-06-11 6:00        40     1
2 41.90367, -87.63233 41.916, -87.61911   2020-06-11 6:00        35     2
3 41.86468, -87.76746 41.82341, -87.69162 2020-06-11 6:00        54     3
4 41.96075, -87.74756 41.76543, -87.67459 2020-06-11 6:00        45     4

You can use unite/paste to combine the columns, and match + unique to create a unique ID:

library(dplyr)
library(tidyr)

df %>%
  unite(St_Lat_Long, Start_Latitude, Start_Longitude, sep = ',') %>%
  unite(End_Lat_Long, End_Latitude, End_Longitude, sep = ',') %>%
  mutate(temp = paste(St_Lat_Long,End_Lat_Long), 
         ID = match(temp, unique(temp))) %>%
  select(-temp)



# A tibble: 4 x 5
#  St_Lat_Long        End_Lat_Long       Date            Avg_Speed    ID
#  <chr>              <chr>              <chr>               <dbl> <int>
#1 41.92446,-87.68654 41.93184,-87.67459 2020-06-11 6:00        40     1
#2 41.90367,-87.63233 41.916,-87.61911   2020-06-11 6:00        35     2
#3 41.86468,-87.76746 41.82341,-87.69162 2020-06-11 6:00        54     3
#4 41.96075,-87.74756 41.76543,-87.67459 2020-06-11 6:00        45     4 
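Once every segment has an ID, comparing average speeds between segments becomes a grouped summary. The following is a minimal base R sketch. The data frame speeds and its values for the later timestamp are invented for illustration and are not part of the question's data.

```r
# Hypothetical observations: segments 1 and 2 each measured at two times
speeds <- data.frame(
  ID        = c(1, 2, 1, 2),
  Date      = c("2020-06-11 6:00", "2020-06-11 6:00",
                "2020-06-11 7:00", "2020-06-11 7:00"),
  Avg_Speed = c(40, 35, 42, 31)
)

# Mean speed per segment ID across all timestamps
per_segment <- aggregate(Avg_Speed ~ ID, data = speeds, FUN = mean)
per_segment
#>   ID Avg_Speed
#> 1  1        41
#> 2  2        33
```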

In the new dplyr 1.0.0, you can use cur_group_id to assign a unique number to each group.

df %>%
  unite(St_Lat_Long, Start_Latitude, Start_Longitude, sep = ',') %>%
  unite(End_Lat_Long, End_Latitude, End_Longitude, sep = ',') %>%
  group_by(St_Lat_Long, End_Lat_Long) %>%
  mutate(ID = cur_group_id())
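Two details of the cur_group_id approach are worth keeping in mind: the group numbers follow the sorted order of the group keys rather than the order in which segments first appear, and the result stays grouped unless ungroup() is called. A self-contained sketch on the question's four rows (the object name res is chosen here for illustration):

```r
library(dplyr)
library(tidyr)

df <- data.frame(
  Start_Latitude  = c(41.92446, 41.90367, 41.86468, 41.96075),
  Start_Longitude = c(-87.68654, -87.63233, -87.76746, -87.74756),
  End_Latitude    = c(41.93184, 41.91600, 41.82341, 41.76543),
  End_Longitude   = c(-87.67459, -87.61911, -87.69162, -87.67459),
  Date            = "2020-06-11 6:00",
  Avg_Speed       = c(40, 35, 54, 45)
)

res <- df %>%
  unite(St_Lat_Long, Start_Latitude, Start_Longitude, sep = ",") %>%
  unite(End_Lat_Long, End_Latitude, End_Longitude, sep = ",") %>%
  group_by(St_Lat_Long, End_Lat_Long) %>%
  mutate(ID = cur_group_id()) %>%
  ungroup()  # drop the grouping so later verbs act on the whole table

# IDs follow the sorted keys: the row starting at 41.86468 becomes group 1
res$ID
#> [1] 3 2 1 4
```

If the IDs need to follow order of appearance instead, the match/unique version gives that numbering.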