Apply a function to each row and the subsequent row by group
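I have an sf data frame containing points that mark the locations of intersections along a number of one-way streets. Besides the geometry column, one column holds the street name and another holds the intersection's relative position along the one-way street.
Below is a toy example. The first row is the first intersection on Arch St., the second row is the second intersection on Arch St., and so on.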
library(sf)
intersections <- structure(list(street = c("ARCH ST", "ARCH ST", "ARCH ST", "SANSOM ST",
"SANSOM ST", "SANSOM ST"), number = c(1L, 2L, 3L, 1L, 2L, 3L),
geometry = structure(list(structure(c(2699665.2606043, 236074.947200272
), class = c("XY", "POINT", "sfg")), structure(c(2699402.74765515,
236109.729280198), class = c("XY", "POINT", "sfg")), structure(c(2699202.95996668,
236136.613760229), class = c("XY", "POINT", "sfg")), structure(c(2699431.38476158,
234437.663731016), class = c("XY", "POINT", "sfg")), structure(c(2699162.09261096,
234476.514355583), class = c("XY", "POINT", "sfg")), structure(c(2697100.77148795,
234809.605567052), class = c("XY", "POINT", "sfg"))), precision = 0, bbox = structure(c(xmin = 2697100.77148795,
ymin = 234437.663731016, xmax = 2699665.2606043, ymax = 236136.613760229
), class = "bbox"), crs = structure(list(epsg = 2272L, proj4string = "+proj=lcc +lat_1=40.96666666666667 +lat_2=39.93333333333333 +lat_0=39.33333333333334 +lon_0=-77.75 +x_0=600000 +y_0=0 +ellps=GRS80 +towgs84=0,0,0,0,0,0,0 +units=us-ft +no_defs"), class = "crs"), n_empty = 0L, class = c("sfc_POINT",
"sfc"))), row.names = c(NA, -6L), class = c("sf", "tbl_df",
"tbl", "data.frame"), sf_column = "geometry", agr = structure(c(street = NA_integer_,
number = NA_integer_), class = "factor", .Label = c("constant",
"aggregate", "identity")))
> intersections
Simple feature collection with 6 features and 2 fields
geometry type: POINT
dimension: XY
bbox: xmin: 2697101 ymin: 234437.7 xmax: 2699665 ymax: 236136.6
epsg (SRID): 2272
proj4string: +proj=lcc +lat_1=40.96666666666667 +lat_2=39.93333333333333 +lat_0=39.33333333333334 +lon_0=-77.75 +x_0=600000 +y_0=0 +ellps=GRS80 +towgs84=0,0,0,0,0,0,0 +units=us-ft +no_defs
# A tibble: 6 x 3
street number geometry
<chr> <int> <POINT [US_survey_foot]>
1 ARCH ST 1 (2699665 236074.9)
2 ARCH ST 2 (2699403 236109.7)
3 ARCH ST 3 (2699203 236136.6)
4 SANSOM ST 1 (2699431 234437.7)
5 SANSOM ST 2 (2699162 234476.5)
6 SANSOM ST 3 (2697101 234809.6)
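Using mp_matrix() and mp_get_matrix() from the mapsapi package, I would like to add a column showing the travel time from each intersection on a street to the next intersection on that street (except for the last intersection, which gets an NA).
Ideally, it would look like this: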
street number travel_time_sec geometry
1 ARCH ST 1 210 POINT (2699665 236074.9)
2 ARCH ST 2 180 POINT (2699403 236109.7)
3 ARCH ST 3 NA POINT (2699203 236136.6)
4 SANSOM ST 1 150 POINT (2699431 234437.7)
5 SANSOM ST 2 175 POINT (2699162 234476.5)
6 SANSOM ST 3 NA POINT (2697101 234809.6)
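How can I iterate over the rows of an sf data frame by group (i.e., by street), have each row perform an operation with the next row in its group to fill a new column, and return NA if no such next row exists?
Finally, since mp_matrix() calls the Google Maps API, which costs money, please use the st_distance() function from sf instead to produce the following.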
street number travel_distance geometry
1 ARCH ST 1 576 POINT (2699665 236074.9)
2 ARCH ST 2 397 POINT (2699403 236109.7)
3 ARCH ST 3 NA POINT (2699203 236136.6)
4 SANSOM ST 1 410 POINT (2699431 234437.7)
5 SANSOM ST 2 440 POINT (2699162 234476.5)
6 SANSOM ST 3 NA POINT (2697101 234809.6)
Thank you very much for your help.
I was playing with your example, but I could not get the same travel distance with the st_distance function.
st_distance(intersections$geometry[1], intersections$geometry[2])
Units: [US_survey_foot]
[,1]
[1,] 264.8072
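As a sanity check on that figure, the straight-line distance can be recomputed by hand from the two coordinate pairs given in the question. This is only a sketch in base R: it assumes plain planar (Euclidean) distance, which is reasonable here because the data are in a projected CRS (EPSG:2272), and the names p1, p2 and d are made up for the illustration.

```r
# Coordinates of the first two ARCH ST intersections, copied from the question
# (units are US survey feet, as reported by the CRS).
p1 <- c(2699665.2606043, 236074.947200272)   # ARCH ST, intersection 1
p2 <- c(2699402.74765515, 236109.729280198)  # ARCH ST, intersection 2

# Planar Euclidean distance: square root of the summed squared differences.
d <- sqrt(sum((p2 - p1)^2))
print(d)
```

The result agrees with the 264.8072 US survey feet returned by st_distance above.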
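Pairing each row with the next one can be done as a vectorized operation rather than an explicit loop over the rows, with this code: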
# packages used
library(units)
library(tidyverse)
library(sf)
# distance from each point to the next point of the same group
find_Distance <- function(x) {
  n <- length(x)
  # a group with fewer than two points has no "next" point
  if (n < 2) return(rep(NA_real_, n))
  # element-wise distance between point i and point i + 1; this avoids
  # building the full distance matrix and taking its diagonal
  d <- st_distance(x[-n], x[-1], by_element = TRUE)
  # drop the units for now and pad the last row of the group with NA
  c(as.numeric(d), NA_real_)
}
# group by street and calculate the distance
# (rows must already be ordered by `number` within each street)
intersections <- group_by(intersections, street) %>%
  mutate(travel_distance = find_Distance(geometry))
# attach the unit of the CRS to the travel distance
units(intersections$travel_distance) <- as_units("US_survey_foot")
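The underlying pattern, "pair each row with the next row of its group and give the last row NA", does not depend on sf. The following is a minimal base R sketch of it on a plain data frame holding the coordinates from the question. The helper next_in_group and the columns x, y, next_x and next_y are hypothetical names introduced for illustration, and the rows are assumed to be sorted by number within each street.

```r
# Toy coordinates copied from the question (US survey feet, EPSG:2272).
pts <- data.frame(
  street = c("ARCH ST", "ARCH ST", "ARCH ST",
             "SANSOM ST", "SANSOM ST", "SANSOM ST"),
  number = c(1L, 2L, 3L, 1L, 2L, 3L),
  x = c(2699665.2606043, 2699402.74765515, 2699202.95996668,
        2699431.38476158, 2699162.09261096, 2697100.77148795),
  y = c(236074.947200272, 236109.729280198, 236136.613760229,
        234437.663731016, 234476.514355583, 234809.605567052)
)

# Hypothetical helper: for every row, the value of the next row in the
# same group, with NA for the last row of each group.
next_in_group <- function(v, g) {
  ave(v, g, FUN = function(z) c(z[-1], NA))
}

pts$next_x <- next_in_group(pts$x, pts$street)
pts$next_y <- next_in_group(pts$y, pts$street)

# Planar distance to the next intersection; NA propagates for the last row.
pts$travel_distance <- sqrt((pts$next_x - pts$x)^2 + (pts$next_y - pts$y)^2)
print(pts[, c("street", "number", "travel_distance")])
```

In a dplyr pipeline the same shift is usually written with lead() inside a grouped mutate(). The same shape should also carry over to the travel-time case, with the distance formula swapped for a call that takes the current and the next point, though that variant is not shown or tested here.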