如何从人口普查形状文件(邮政编码级别)中删除所有小岛?
how to remove all the small islands from the Census Shapefile (zip code level)?
我已经下载了邮政编码级别的人口普查 shapefile,cb_2017_us_zcta510_500k.shp
(https://www.census.gov/geo/maps-data/data/cbf/cbf_zcta.html)
我还下载了允许我添加相应 STATE
变量的映射文件 (https://www.census.gov/geo/maps-data/data/zcta_rel_download.html)
我将两者合并,得到:
library(sf)
library(dplyr)
big_df
Simple feature collection with 44434 features and 2 fields
geometry type: MULTIPOLYGON
dimension: XY
bbox: xmin: -176.6847 ymin: -14.37374 xmax: 145.8304 ymax: 71.34122
epsg (SRID): 4269
proj4string: +proj=longlat +datum=NAD83 +no_defs
First 10 features:
ZCTA5CE10 STATE geometry
1 35442 1 MULTIPOLYGON (((-88.25262 3...
2 35442 1 MULTIPOLYGON (((-88.25262 3...
3 35442 1 MULTIPOLYGON (((-88.25262 3...
现在,我尝试过滤所有的小岛和阿拉斯加:
remove_list <- c("02", "15", "72", "66", "78", "60", "69",
"64", "68", "70", "74", "81", "84", "86", "87", "89", "71", "76",
"95", "79")
big_df %>% filter(!STATE %in% map(remove_list, as.integer)) %>%
tm_shape(.) + tm_polygons('pt_count',palette = "Reds",
style = "quantile", n = 10,
title = "counts")
但我还是得到了一些小岛。
我在这里错过了什么?
谢谢!
这是获取美国大陆几何(国家概况)的方法:
library(raster)
library(sf)
library(dplyr)
us = getData('GADM', country='USA', level=0) %>%
st_as_sf() %>%
st_cast("POLYGON") %>%
mutate(area = st_area(.)) %>%
arrange(desc(area)) %>%
slice(1) # mainland US should be the largest
然后您可以使用它来 运行 st_intersection(big_df, us)
仅提取 us
中 big_df
的部分。请注意,首先在 us
周围创建 st_buffer
或 st_convex_hull
可能会有所回报,以确保您的 big_df
不会在其边界的某处被剪裁。
我已经下载了邮政编码级别的人口普查 shapefile,cb_2017_us_zcta510_500k.shp
(https://www.census.gov/geo/maps-data/data/cbf/cbf_zcta.html)
我还下载了允许我添加相应 STATE
变量的映射文件 (https://www.census.gov/geo/maps-data/data/zcta_rel_download.html)
我将两者合并,得到:
library(sf)
library(dplyr)
big_df
Simple feature collection with 44434 features and 2 fields
geometry type: MULTIPOLYGON
dimension: XY
bbox: xmin: -176.6847 ymin: -14.37374 xmax: 145.8304 ymax: 71.34122
epsg (SRID): 4269
proj4string: +proj=longlat +datum=NAD83 +no_defs
First 10 features:
ZCTA5CE10 STATE geometry
1 35442 1 MULTIPOLYGON (((-88.25262 3...
2 35442 1 MULTIPOLYGON (((-88.25262 3...
3 35442 1 MULTIPOLYGON (((-88.25262 3...
现在,我尝试过滤所有的小岛和阿拉斯加:
remove_list <- c("02", "15", "72", "66", "78", "60", "69",
"64", "68", "70", "74", "81", "84", "86", "87", "89", "71", "76",
"95", "79")
big_df %>% filter(!STATE %in% map(remove_list, as.integer)) %>%
tm_shape(.) + tm_polygons('pt_count',palette = "Reds",
style = "quantile", n = 10,
title = "counts")
但我还是得到了一些小岛。
我在这里错过了什么? 谢谢!
这是获取美国大陆几何(国家概况)的方法:
library(raster)
library(sf)
library(dplyr)
us = getData('GADM', country='USA', level=0) %>%
st_as_sf() %>%
st_cast("POLYGON") %>%
mutate(area = st_area(.)) %>%
arrange(desc(area)) %>%
slice(1) # mainland US should be the largest
然后您可以使用它来 运行 st_intersection(big_df, us)
仅提取 us
中 big_df
的部分。请注意,首先在 us
周围创建 st_buffer
或 st_convex_hull
可能会有所回报,以确保您的 big_df
不会在其边界的某处被剪裁。