计算 sfc_multipoint 对象中的点数

Count number of points in sfc_multipoint object

我想按 sfc_multipoint 的大小过滤 sfc_multipoint 数据,即每行中包含的 points/coordinate 对的数量。假设我只想保留以下数据集中不超过 3 个点的行,基于 mapview::breweries:

library(mapview)
library(sf)
library(tidyverse)

pts <- breweries %>% 
  group_by(zipcode) %>% 
  summarise()

为此,我需要知道每个 sfc_multipoint 中的点数。理论上我可以通过 as("spatial") 导出坐标并计算行数,但这不仅难看,而且对于任何实际大小的数据集来说也不可行。有什么建议吗?

您可以尝试 dplyr::count()summarize(n = n()) 读取给定 zipcode 中的行数,但 breweries 数据集似乎有一些重复项,因此这可能误导。

breweries %>% 
  count(zipcode)

#----------
  zipcode n                       geometry
1   90562 1      POINT (11.15795 49.53495)
2   90614 1       POINT (10.85194 49.4208)
3   90763 1      POINT (10.99625 49.44171)
4   91054 3 MULTIPOINT ((11.00901 49.59...
5   91097 2 MULTIPOINT ((10.76099 49.59...

或者只有唯一点(注意 91054 中的变化)

breweries %>%
  distinct(zipcode, geometry) %>%
  count(zipcode)

#-----
  zipcode n                       geometry
1   90562 1      POINT (11.15795 49.53495)
2   90614 1       POINT (10.85194 49.4208)
3   90763 1      POINT (10.99625 49.44171)
4   91054 2 MULTIPOINT ((11.00901 49.59...
5   91097 2 MULTIPOINT ((10.76099 49.59...

你也可以试试mapview::npts()

breweries %>%
  group_by(zipcode) %>%
  summarize() %>%
  rowwise() %>%
  mutate(n = npts(geometry))

#----
   zipcode                                              geometry     n
 * <chr>                                          <GEOMETRY [°]> <dbl>
 1 90562                               POINT (11.15795 49.53495)     1
 2 90614                                POINT (10.85194 49.4208)     1
 3 90763                               POINT (10.99625 49.44171)     1
 4 91054   MULTIPOINT ((11.00901 49.59511), (11.00505 49.60255))     2
 5 91097   MULTIPOINT ((10.76099 49.59044), (10.76954 49.59015))     2

您可以在 sf::st_coordinates() 的基础上构建 - 对于多点来说,它有一个有趣的副作用,即生成一个包含 1) 坐标(duh...)和 2) L1 列的矩阵,识别原始多点的哪个特征坐标是指.

因此请考虑这段代码,它建立在原始 pts 示例的基础上(从原始几何集合转换为多点)。

library(mapview)
library(sf)
library(tidyverse)

pts <- breweries %>% 
  group_by(zipcode) %>% 
  summarise() %>% 
  st_cast("MULTIPOINT")

# get count of coordinates per zipcode
pts$breweries <- st_coordinates(pts) %>% 
  as.data.frame() %>% 
  group_by(L1) %>% # id of the row from original multipoint object
  tally() %>% 
  pull(n)

# zipcodes with less than 3 breweries
pts %>% 
  filter(breweries < 3)

#Simple feature collection with 65 features and 2 fields
#Geometry type: MULTIPOINT
#Dimension:     XY
#Bounding box:  xmin: 9.462785 ymin: 48.90074 xmax: 11.93539 ymax: 50.44162
#Geodetic CRS:  WGS 84
# A tibble: 65 × 3
#   zipcode                                   geometry breweries
# * <chr>                             <MULTIPOINT [°]>     <int>
# 1 90562                        ((11.15795 49.53495))         1
# 2 90614                         ((10.85194 49.4208))         1
# 3 90763                        ((10.99625 49.44171))         1
# 4 91054   ((11.00901 49.59511), (11.00505 49.60255))         2
# 5 91097   ((10.76099 49.59044), (10.76954 49.59015))         2
# 6 91126                         ((10.9281 49.27472))         1
# 7 91207                        ((11.22997 49.55471))         1
# 8 91217                        ((11.42834 49.50683))         1
# 9 91220                         ((11.36851 49.5618))         1
#10 91227                        ((11.30872 49.45008))         1
## … with 55 more rows

请在下面找到另一种可能的方法,与@Jindra Lacko 的方法略有不同。

Reprex

  • 代码
library(mapview)
library(sf)
library(dplyr)

# computes the number of points (i.e. XY pairs = length(x)/2) for each feature and 
# keeps only the features where the number of points < 3
selection <- which(sapply(st_geometry(pts), function(x) length(x)/2) < 3)

# filter rows of pts with the 'selection'
pts[selection,]
  • 输出
#> Simple feature collection with 65 features and 1 field
#> Geometry type: GEOMETRY
#> Dimension:     XY
#> Bounding box:  xmin: 9.462785 ymin: 48.90074 xmax: 11.93539 ymax: 50.44162
#> Geodetic CRS:  WGS 84
#> # A tibble: 65 x 2
#>    zipcode                                              geometry
#>    <chr>                                          <GEOMETRY [°]>
#>  1 90562                               POINT (11.15795 49.53495)
#>  2 90614                                POINT (10.85194 49.4208)
#>  3 90763                               POINT (10.99625 49.44171)
#>  4 91054   MULTIPOINT ((11.00901 49.59511), (11.00505 49.60255))
#>  5 91097   MULTIPOINT ((10.76099 49.59044), (10.76954 49.59015))
#>  6 91126                                POINT (10.9281 49.27472)
#>  7 91207                               POINT (11.22997 49.55471)
#>  8 91217                               POINT (11.42834 49.50683)
#>  9 91220                                POINT (11.36851 49.5618)
#> 10 91227                               POINT (11.30872 49.45008)
#> # ... with 55 more rows

reprex package (v2.0.1)

于 2021-12-29 创建