r - Join data frame coordinates by shapefile regions, aka Join Attributes by Location
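I have a large data set, loaded in R as a data.frame. It contains observations associated with coordinate points (lat/lon).
I also have a shapefile of North America.
In the empty (NA-filled) column of my data frame, labelled BCR, I want to insert the name of the region that each coordinate falls into according to the shapefile.
I know how to do this in QGIS using Vector > Data Management Tools > Join Attributes by Location.
The shapefile can be downloaded by clicking HERE.
My data currently looks like this (sample):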
LATITUDE LONGITUDE Year EFF n St PJ day BCR
50.406752 -104.613 2009 1 0 SK 90 2 NA
50.40678 -104.61256 2009 2 0 SK 120 3 NA
50.40678 -104.61256 2009 2 1 SK 136 2 NA
50.40678 -104.61256 2009 3 2 SK 149 4 NA
43.0026385 -79.2900467 2009 2 0 ON 112 3 NA
43.0026385 -79.2900467 2009 2 1 ON 122 3 NA
But I want it to look like this:
LATITUDE LONGITUDE Year EFF n St PJ day BCR
50.406752 -104.613 2009 1 0 SK 90 2 Prairie Potholes
50.40678 -104.61256 2009 2 0 SK 120 3 Prairie Potholes
50.40678 -104.61256 2009 2 1 SK 136 2 Prairie Potholes
50.40678 -104.61256 2009 3 2 SK 149 4 Prairie Potholes
43.0026385 -79.2900467 2009 2 0 ON 112 3 Lower Great Lakes/St.Lawrence Plain
43.0026385 -79.2900467 2009 2 1 ON 122 3 Lower Great Lakes/St.Lawrence Plain
Notice that the BCR column is now filled with the appropriate BCR region name.
My code so far only imports and formats the data and the shapefile:
library(rgdal)
library(proj4)
library(sp)
library(raster)
# PFW data, full 2.5m observations
df = read.csv("MyData.csv")
# Clearing out empty coordinate data (zero or missing latitude/longitude)
pfw = df[(df$LATITUDE != 0) & (df$LONGITUDE != 0) & (!is.na(df$LATITUDE)) & (!is.na(df$LONGITUDE)),]
# Creating a new column to be filled with associated Bird Conservation Regions
pfw["BCR"] = NA
# Making a duplicate data frame to conserve data
toSPDF = pfw
# Ensuring spatial formatting
#coordinates(toSPDF) = ~LATITUDE + LONGITUDE
SPDF <- SpatialPointsDataFrame(toSPDF[,c("LONGITUDE", "LATITUDE"),],
toSPDF,
proj4string = CRS("+init=epsg:4326"))
# BCR shape file, no state borders
shp = shapefile("C:/Users/User1/Desktop/BCR/BCR_Terrestrial_master_International.shx")
spPoly = spTransform(shp, CRS("+init=epsg:4326 +proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0"))
# Check
isTRUE(proj4string(spPoly) == proj4string(SPDF))
# Trying to join attributes by location
#try1 = point.in.polygon(spPoly, SPDF) # Sounds good doesn't work
#a.data <- over(SPDF, spPoly[,"BCRNAME"]) # Error: cannot allocate vector of size 204.7 Mb
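I think you want to do a spatial query with points and polygons, that is, assign polygon attributes to the points that fall inside them. You can do that like this:
Example data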
library(terra)
f <- system.file("ex/lux.shp", package="terra")
polygons <- vect(f)
points <- spatSample(polygons, 10)
Solution
e <- extract(polygons, points)
e
# id.y ID_1 NAME_1 ID_2 NAME_2 AREA POP
#1 1 3 Luxembourg 9 Esch-sur-Alzette 251 176820
#2 2 3 Luxembourg 9 Esch-sur-Alzette 251 176820
#3 3 2 Grevenmacher 6 Echternach 188 18899
#4 4 1 Diekirch 2 Diekirch 218 32543
#5 5 3 Luxembourg 9 Esch-sur-Alzette 251 176820
#6 6 1 Diekirch 4 Vianden 76 5163
#7 7 3 Luxembourg 11 Mersch 233 32112
#8 8 2 Grevenmacher 7 Remich 129 22366
#9 9 1 Diekirch 3 Redange 259 18664
#10 10 3 Luxembourg 9 Esch-sur-Alzette 251 176820
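To carry the result back into a data frame shaped like the one in the question, something along these lines might work. This is a minimal sketch, not tested on the actual BCR data: the Luxembourg polygons shipped with terra stand in for the BCR shapefile, `obs` is a made-up table, and `NAME_2` plays the role of `BCRNAME`. It queries each distinct coordinate pair only once, which may help with the memory error when many rows share a location.

```r
library(terra)

# Stand-in polygons (assumption: replace with vect("<path to BCR shapefile>"))
polygons <- vect(system.file("ex/lux.shp", package = "terra"))

# Hypothetical observations with repeated sites, mimicking the question's layout
obs <- data.frame(
  LATITUDE  = c(49.61, 49.61, 49.87, 50.05, NA),
  LONGITUDE = c(6.13, 6.13, 6.16, 6.02, 6.00),
  n         = c(0, 1, 2, 0, 3)
)

# Drop rows with missing or zero coordinates
ok  <- !is.na(obs$LATITUDE) & !is.na(obs$LONGITUDE) &
       obs$LATITUDE != 0 & obs$LONGITUDE != 0
obs <- obs[ok, ]

# Build points from the distinct coordinate pairs only
xy  <- unique(obs[, c("LONGITUDE", "LATITUDE")])
pts <- vect(xy, geom = c("LONGITUDE", "LATITUDE"), crs = "EPSG:4326")
pts <- project(pts, crs(polygons))  # match the polygons' CRS

# Point-in-polygon query; the first column of the result is the point index
e <- extract(polygons, pts)
e <- e[!duplicated(e[, 1]), ]       # keep one polygon per point
xy$BCR <- e$NAME_2[match(seq_len(nrow(xy)), e[, 1])]

# Copy the region name back to every observation at that location
key_obs <- paste(obs$LONGITUDE, obs$LATITUDE)
key_xy  <- paste(xy$LONGITUDE, xy$LATITUDE)
obs$BCR <- xy$BCR[match(key_obs, key_xy)]
obs
```

Points that fall outside every polygon end up with `NA` in `BCR`, so the original column convention is preserved.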
For the older spatial packages, you can use raster::extract or sp::over.
Example data:
library(raster)
pols <- shapefile(system.file("external/lux.shp", package="raster"))
set.seed(20180121)
pts <- data.frame(coordinates(spsample(pols, 5, 'random')), name=letters[1:5])
plot(pols); points(pts)
Solution:
e <- extract(pols, pts[, c('x', 'y')])
pts$BCR <- e$NAME_2
pts
# x y name BCR
#1 6.009390 49.98333 a Wiltz
#2 5.766407 49.85188 b Redange
#3 6.268405 49.62585 c Luxembourg
#4 6.123015 49.56486 d Luxembourg
#5 5.911638 49.53957 e Esch-sur-Alzette
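The `sp::over` route mentioned above could look roughly like this. It is a sketch on the same Luxembourg sample data, assuming the `sp` and `raster` packages are installed; `sp_pts` and `o` are names introduced here for illustration only.

```r
library(sp)
library(raster)

pols <- shapefile(system.file("external/lux.shp", package = "raster"))
set.seed(20180121)
pts <- data.frame(coordinates(spsample(pols, 5, "random")), name = letters[1:5])

# Turn the coordinates into SpatialPoints that share the polygons' CRS
sp_pts <- SpatialPoints(pts[, c("x", "y")], proj4string = pols@proj4string)

# over() returns one row of polygon attributes per point (NA where no polygon matches)
o <- over(sp_pts, pols)
pts$BCR <- o$NAME_2
pts
```

Because `over()` keeps the row order of the points, its columns can be assigned straight back to the original data frame.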