计算 sfc_multipoint 对象中的点数
Count number of points in sfc_multipoint object
我想按 sfc_multipoint
的大小过滤 sfc_multipoint
数据,即每行中包含的 points/coordinate 对的数量。假设我只想保留以下数据集中不超过 3 个点的行,基于 mapview::breweries
:
library(mapview)
library(sf)
library(tidyverse)
pts <- breweries %>%
group_by(zipcode) %>%
summarise()
为此,我需要知道每个 sfc_multipoint
中的点数。理论上我可以通过 as("spatial")
导出坐标并计算行数,但这不仅难看,而且对于任何实际大小的数据集来说也不可行。有什么建议吗?
您可以尝试 dplyr::count()
或 summarize(n = n())
读取给定 zipcode
中的行数,但 breweries
数据集似乎有一些重复项,因此这可能误导。
breweries %>%
count(zipcode)
#----------
zipcode n geometry
1 90562 1 POINT (11.15795 49.53495)
2 90614 1 POINT (10.85194 49.4208)
3 90763 1 POINT (10.99625 49.44171)
4 91054 3 MULTIPOINT ((11.00901 49.59...
5 91097 2 MULTIPOINT ((10.76099 49.59...
或者只有唯一点(注意 91054
中的变化)
breweries %>%
distinct(zipcode, geometry) %>%
count(zipcode)
#-----
zipcode n geometry
1 90562 1 POINT (11.15795 49.53495)
2 90614 1 POINT (10.85194 49.4208)
3 90763 1 POINT (10.99625 49.44171)
4 91054 2 MULTIPOINT ((11.00901 49.59...
5 91097 2 MULTIPOINT ((10.76099 49.59...
你也可以试试mapview::npts()
breweries %>%
group_by(zipcode) %>%
summarize() %>%
rowwise() %>%
mutate(n = npts(geometry))
#----
zipcode geometry n
* <chr> <GEOMETRY [°]> <dbl>
1 90562 POINT (11.15795 49.53495) 1
2 90614 POINT (10.85194 49.4208) 1
3 90763 POINT (10.99625 49.44171) 1
4 91054 MULTIPOINT ((11.00901 49.59511), (11.00505 49.60255)) 2
5 91097 MULTIPOINT ((10.76099 49.59044), (10.76954 49.59015)) 2
您可以在 sf::st_coordinates()
的基础上构建 - 对于多点来说,它有一个有趣的副作用,即生成一个包含 1) 坐标(duh...)和 2) L1 列的矩阵,识别原始多点的哪个特征坐标是指.
因此请考虑这段代码,它建立在原始 pts 示例的基础上(从原始几何集合转换为多点)。
library(mapview)
library(sf)
library(tidyverse)
pts <- breweries %>%
group_by(zipcode) %>%
summarise() %>%
st_cast("MULTIPOINT")
# get count of coordinates per zipcode
pts$breweries <- st_coordinates(pts) %>%
as.data.frame() %>%
group_by(L1) %>% # id of the row from original multipoint object
tally() %>%
pull(n)
# zipcodes with less than 3 breweries
pts %>%
filter(breweries < 3)
#Simple feature collection with 65 features and 2 fields
#Geometry type: MULTIPOINT
#Dimension: XY
#Bounding box: xmin: 9.462785 ymin: 48.90074 xmax: 11.93539 ymax: 50.44162
#Geodetic CRS: WGS 84
# A tibble: 65 × 3
# zipcode geometry breweries
# * <chr> <MULTIPOINT [°]> <int>
# 1 90562 ((11.15795 49.53495)) 1
# 2 90614 ((10.85194 49.4208)) 1
# 3 90763 ((10.99625 49.44171)) 1
# 4 91054 ((11.00901 49.59511), (11.00505 49.60255)) 2
# 5 91097 ((10.76099 49.59044), (10.76954 49.59015)) 2
# 6 91126 ((10.9281 49.27472)) 1
# 7 91207 ((11.22997 49.55471)) 1
# 8 91217 ((11.42834 49.50683)) 1
# 9 91220 ((11.36851 49.5618)) 1
#10 91227 ((11.30872 49.45008)) 1
## … with 55 more rows
请在下面找到另一种可能的方法,与@Jindra Lacko 的方法略有不同。
Reprex
- 代码
library(mapview)
library(sf)
library(dplyr)
# computes the number of points (i.e. XY pairs = length(x)/2) for each feature and
# keeps only the features where the number of points < 3
selection <- which(sapply(st_geometry(pts), function(x) length(x)/2) < 3)
# filter rows of pts with the 'selection'
pts[selection,]
- 输出
#> Simple feature collection with 65 features and 1 field
#> Geometry type: GEOMETRY
#> Dimension: XY
#> Bounding box: xmin: 9.462785 ymin: 48.90074 xmax: 11.93539 ymax: 50.44162
#> Geodetic CRS: WGS 84
#> # A tibble: 65 x 2
#> zipcode geometry
#> <chr> <GEOMETRY [°]>
#> 1 90562 POINT (11.15795 49.53495)
#> 2 90614 POINT (10.85194 49.4208)
#> 3 90763 POINT (10.99625 49.44171)
#> 4 91054 MULTIPOINT ((11.00901 49.59511), (11.00505 49.60255))
#> 5 91097 MULTIPOINT ((10.76099 49.59044), (10.76954 49.59015))
#> 6 91126 POINT (10.9281 49.27472)
#> 7 91207 POINT (11.22997 49.55471)
#> 8 91217 POINT (11.42834 49.50683)
#> 9 91220 POINT (11.36851 49.5618)
#> 10 91227 POINT (11.30872 49.45008)
#> # ... with 55 more rows
由 reprex package (v2.0.1)
于 2021-12-29 创建
我想按 sfc_multipoint
的大小过滤 sfc_multipoint
数据,即每行中包含的 points/coordinate 对的数量。假设我只想保留以下数据集中不超过 3 个点的行,基于 mapview::breweries
:
library(mapview)
library(sf)
library(tidyverse)
pts <- breweries %>%
group_by(zipcode) %>%
summarise()
为此,我需要知道每个 sfc_multipoint
中的点数。理论上我可以通过 as("spatial")
导出坐标并计算行数,但这不仅难看,而且对于任何实际大小的数据集来说也不可行。有什么建议吗?
您可以尝试 dplyr::count()
或 summarize(n = n())
读取给定 zipcode
中的行数,但 breweries
数据集似乎有一些重复项,因此这可能误导。
breweries %>%
count(zipcode)
#----------
zipcode n geometry
1 90562 1 POINT (11.15795 49.53495)
2 90614 1 POINT (10.85194 49.4208)
3 90763 1 POINT (10.99625 49.44171)
4 91054 3 MULTIPOINT ((11.00901 49.59...
5 91097 2 MULTIPOINT ((10.76099 49.59...
或者只有唯一点(注意 91054
中的变化)
breweries %>%
distinct(zipcode, geometry) %>%
count(zipcode)
#-----
zipcode n geometry
1 90562 1 POINT (11.15795 49.53495)
2 90614 1 POINT (10.85194 49.4208)
3 90763 1 POINT (10.99625 49.44171)
4 91054 2 MULTIPOINT ((11.00901 49.59...
5 91097 2 MULTIPOINT ((10.76099 49.59...
你也可以试试mapview::npts()
breweries %>%
group_by(zipcode) %>%
summarize() %>%
rowwise() %>%
mutate(n = npts(geometry))
#----
zipcode geometry n
* <chr> <GEOMETRY [°]> <dbl>
1 90562 POINT (11.15795 49.53495) 1
2 90614 POINT (10.85194 49.4208) 1
3 90763 POINT (10.99625 49.44171) 1
4 91054 MULTIPOINT ((11.00901 49.59511), (11.00505 49.60255)) 2
5 91097 MULTIPOINT ((10.76099 49.59044), (10.76954 49.59015)) 2
您可以在 sf::st_coordinates()
的基础上构建 - 对于多点来说,它有一个有趣的副作用,即生成一个包含 1) 坐标(duh...)和 2) L1 列的矩阵,识别原始多点的哪个特征坐标是指.
因此请考虑这段代码,它建立在原始 pts 示例的基础上(从原始几何集合转换为多点)。
library(mapview)
library(sf)
library(tidyverse)
pts <- breweries %>%
group_by(zipcode) %>%
summarise() %>%
st_cast("MULTIPOINT")
# get count of coordinates per zipcode
pts$breweries <- st_coordinates(pts) %>%
as.data.frame() %>%
group_by(L1) %>% # id of the row from original multipoint object
tally() %>%
pull(n)
# zipcodes with less than 3 breweries
pts %>%
filter(breweries < 3)
#Simple feature collection with 65 features and 2 fields
#Geometry type: MULTIPOINT
#Dimension: XY
#Bounding box: xmin: 9.462785 ymin: 48.90074 xmax: 11.93539 ymax: 50.44162
#Geodetic CRS: WGS 84
# A tibble: 65 × 3
# zipcode geometry breweries
# * <chr> <MULTIPOINT [°]> <int>
# 1 90562 ((11.15795 49.53495)) 1
# 2 90614 ((10.85194 49.4208)) 1
# 3 90763 ((10.99625 49.44171)) 1
# 4 91054 ((11.00901 49.59511), (11.00505 49.60255)) 2
# 5 91097 ((10.76099 49.59044), (10.76954 49.59015)) 2
# 6 91126 ((10.9281 49.27472)) 1
# 7 91207 ((11.22997 49.55471)) 1
# 8 91217 ((11.42834 49.50683)) 1
# 9 91220 ((11.36851 49.5618)) 1
#10 91227 ((11.30872 49.45008)) 1
## … with 55 more rows
请在下面找到另一种可能的方法,与@Jindra Lacko 的方法略有不同。
Reprex
- 代码
library(mapview)
library(sf)
library(dplyr)
# computes the number of points (i.e. XY pairs = length(x)/2) for each feature and
# keeps only the features where the number of points < 3
selection <- which(sapply(st_geometry(pts), function(x) length(x)/2) < 3)
# filter rows of pts with the 'selection'
pts[selection,]
- 输出
#> Simple feature collection with 65 features and 1 field
#> Geometry type: GEOMETRY
#> Dimension: XY
#> Bounding box: xmin: 9.462785 ymin: 48.90074 xmax: 11.93539 ymax: 50.44162
#> Geodetic CRS: WGS 84
#> # A tibble: 65 x 2
#> zipcode geometry
#> <chr> <GEOMETRY [°]>
#> 1 90562 POINT (11.15795 49.53495)
#> 2 90614 POINT (10.85194 49.4208)
#> 3 90763 POINT (10.99625 49.44171)
#> 4 91054 MULTIPOINT ((11.00901 49.59511), (11.00505 49.60255))
#> 5 91097 MULTIPOINT ((10.76099 49.59044), (10.76954 49.59015))
#> 6 91126 POINT (10.9281 49.27472)
#> 7 91207 POINT (11.22997 49.55471)
#> 8 91217 POINT (11.42834 49.50683)
#> 9 91220 POINT (11.36851 49.5618)
#> 10 91227 POINT (11.30872 49.45008)
#> # ... with 55 more rows
由 reprex package (v2.0.1)
于 2021-12-29 创建