How can I fix st_distance() from sf giving an error when I inspect the data?

I have this data:

df <- data.frame(pc_home = c(1042, 2052, NA, 4021, 9423, NA, 1502, 5942),
                 pc_work = c(NA, 2105, NA, 4352, 8984, NA, 1495, 6050),
                 # I(list(...)) keeps one c(x, y) pair per row; a bare c(c(...), c(...)) would flatten to 16 values
                 centroid_home = I(list(c(122239.347627534, 487236.185950724), c(121552.622967901, 487511.344167049), c(NA, NA), c(120168.155075649, 489952.753092173), c(119154.137476474, 489381.429089547), c(NA, NA), c(120723.216386427, 487950.166456445), c(120570.498333358, 487104.749088018))),
                 centroid_work = I(list(c(NA, NA), c(121337.696586159, 486235.561338213), c(NA, NA), c(123060.850070339, 486752.640463608), c(124354.37048732, 487473.329840357), c(NA, NA), c(123171.113425247, 488458.596501631), c(123952.971290978, 489249.568149519)))
                 )

The centroids were calculated with st_centroid() on a shapefile. The c(NA, NA) entries come from missing postal codes, for which no centroid could be computed.

I use this code:

library(sf)
library(dplyr)  # provides %>% and mutate()
df <- df %>%
  mutate(dist_hw = st_distance(centroid_home, centroid_work))

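There is no error, but when I inspect the data I get strange results. In the data frame viewer I don't see any results, and when I try to sort (to check whether there are any results at all), I get this error: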

Error in `stop_subscript()`:
! Can't subset elements that don't exist.
x Locations 4324, 7679, 11034, 13428, 16783, etc. don't exist.
i There are only 3355 elements.

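Is this caused by the NAs or by something else?

If it is caused by the NAs, how can I solve it?

I just want to calculate the distance between the points.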

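It is hard to work with the sample data provided. It needs to be in an sf-style data frame, and it also needs a CRS. Assuming you have both but had trouble posting them, the solution below should work. Your sf object needs two geometry columns, which it looks like you have.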

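st_distance() should work if you pass by_element = TRUE. Without it, st_distance() returns the matrix of distances between every pair of points from the two columns rather than one distance per row, which is probably what makes the new column behave so oddly. The examples below either call st_distance() directly or use it inside dplyr::mutate() to add a distance column to the sf data frame.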
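To see what by_element changes, here is a minimal sketch with three made-up point pairs and no CRS (so the distances are plain planar numbers). The points and the names a, b, m and v are illustrative only and are not taken from the question.

```r
library(sf)

# Three made-up point pairs without a CRS (plain planar coordinates)
a <- st_sfc(st_point(c(0, 0)), st_point(c(1, 0)), st_point(c(2, 0)))
b <- st_sfc(st_point(c(0, 3)), st_point(c(1, 4)), st_point(c(2, 5)))

# Default: distances between every point in a and every point in b
m <- st_distance(a, b)
dim(m)
#> [1] 3 3

# by_element = TRUE: first with first, second with second, and so on
v <- st_distance(a, b, by_element = TRUE)
v
#> [1] 3 4 5

# The row-wise distances are the diagonal of the full matrix
all.equal(as.numeric(diag(m)), as.numeric(v))
#> [1] TRUE
```

Inside mutate(), the matrix form ends up as a matrix column with one row per feature and one column per feature, which is unlikely to be what you want for a simple home-to-work distance.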

library(sf)
library(tidyverse)

#### Making reproducible data
# get the nc data, make the geometry column a point with st_centroid
nc = st_read(system.file("shape/nc.shp", package="sf")) %>%
  select(NAME) %>% st_centroid()

# jitter the centroid point and add (cbind) as a second geometry column
geo2 <- st_geometry(nc) %>% st_jitter()
nc <- cbind(nc, geo2)
####

# Find the distance between the points, row-by-row
st_distance(nc$geometry, nc$geometry.1, by_element = TRUE) %>% head()
#> Units: [m]
#> [1]  965.8162 2030.5782 1833.3081 1909.5538 1408.7908  820.0569

# or use mutate to add a column to the sf df.
nc %>% mutate(dist = st_distance(geometry, geometry.1, by_element = TRUE))
#> Simple feature collection with 100 features and 2 fields
#> Active geometry column: geometry
#> Geometry type: POINT
#> Dimension:     XY
#> Bounding box:  xmin: -84.05986 ymin: 34.07671 xmax: -75.8095 ymax: 36.49111
#> Geodetic CRS:  NAD27
#> First 10 features:
#>           NAME                   geometry                 geometry.1
#> 1         Ashe  POINT (-81.49823 36.4314) POINT (-81.49685 36.44001)
#> 2    Alleghany POINT (-81.12513 36.49111) POINT (-81.13681 36.47545)
#> 3        Surry POINT (-80.68573 36.41252) POINT (-80.69163 36.39673)
#> 4    Currituck POINT (-76.02719 36.40714) POINT (-76.02305 36.39029)
#> 5  Northampton POINT (-77.41046 36.42236) POINT (-77.40909 36.40974)
#> 6     Hertford POINT (-76.99472 36.36142) POINT (-76.98777 36.36623)
#> 7       Camden POINT (-76.23402 36.40122)  POINT (-76.23969 36.4181)
#> 8        Gates POINT (-76.70446 36.44428) POINT (-76.70953 36.45603)
#> 9       Warren POINT (-78.11042 36.39693) POINT (-78.11619 36.38541)
#> 10      Stokes POINT (-80.23429 36.40042) POINT (-80.24365 36.39904)
#>             dist
#> 1   965.8162 [m]
#> 2  2030.5782 [m]
#> 3  1833.3081 [m]
#> 4  1909.5538 [m]
#> 5  1408.7908 [m]
#> 6   820.0569 [m]
#> 7  1944.8192 [m]
#> 8  1382.9058 [m]
#> 9  1381.7946 [m]
#> 10  851.1106 [m]

Created on 2022-04-13 by the reprex package (v2.0.1)
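For the rows with missing postal codes, one possible approach is to compute distances only where both points are known and leave the rest as NA. This is only a sketch: the coordinates are made up in the style of the question, the column names (home_x, home_y, work_x, work_y, dist_hw) are hypothetical, and EPSG:28992 is an assumption, since the question does not say which CRS the centroids use.

```r
library(sf)

# Hypothetical coordinates; the second row has no work location
xy <- data.frame(
  home_x = c(121552.6, 120168.2, 119154.1),
  home_y = c(487511.3, 489952.8, 489381.4),
  work_x = c(121337.7, NA, 124354.4),
  work_y = c(486235.6, NA, 487473.3)
)

# Rows where all four coordinates are present
ok <- complete.cases(xy)

# Build point geometries for the complete rows only
# (crs = 28992 is an assumption, replace it with your actual CRS)
home <- st_as_sf(xy[ok, ], coords = c("home_x", "home_y"), crs = 28992)
work <- st_as_sf(xy[ok, ], coords = c("work_x", "work_y"), crs = 28992)

# One distance per row; rows with missing coordinates stay NA
xy$dist_hw <- NA_real_
xy$dist_hw[ok] <- as.numeric(st_distance(home, work, by_element = TRUE))
```

as.numeric() drops the units attribute, so dist_hw is a plain number in the units of the CRS (metres under the assumed EPSG:28992).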