R - Merging city names to approximate lat-long coordinates
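I would like to merge city names onto approximate coordinates.
I have two datasets:
- City lat-longs, called cities.
- Lat-longs of observed events, called events.
Most events occur at coordinates that do not exactly match any city's lat-long.
I would like to merge in the city from cities whenever an event's lat and lon each differ by at most 1 from the values listed in cities.
The nearest feature in data.table seems too crude for this.
How would you do this? With maptools?
Example: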
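library(data.table)

# Reference table: one row per city with its coordinates
cities <- data.table(city = c("A", "B", "C"),
                     lat = c(23.4, 43.5, 21.3),
                     lon = c(100, 98.4, -78.2))
# Observed events, whose coordinates rarely match a city exactly
events <- data.table(event = c("X1", "Y1", "B1"),
                     lat = c(24.4, 42.5, 23.3),
                     lon = c(101, 100.4, -78.2))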
# Desired result: each event with the matched city, NA where none is close enough
result <- data.table(event = c("X1", "Y1", "B1"),
                     lat = c(23.4, 43.5, 21.3),
                     lon = c(100, 98.4, -78.2),
                     city = c("A", NA, NA))
> result
event lat lon city
1: X1 23.4 100.0 A
2: Y1 43.5 98.4 <NA>
3: B1 21.3 -78.2 <NA>
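Approach 1: non-equi join
This non-equi update join does the job, but only as long as you accept a hard limit of 1 degree. The problem is that the ground distance covered by one degree of longitude varies across the globe, so the matching window is not the same size everywhere.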
events[ cities[, `:=`(lat_min = lat - 1, lat_max = lat+1,
lon_min = lon - 1, lon_max = lon + 1) ],
city := i.city,
on = .(lat >= lat_min, lat <= lat_max, lon >= lon_min, lon <= lon_max ) ][]
# event lat lon city
# 1: X1 24.4 101.0 A
# 2: Y1 42.5 100.4 <NA>
# 3: B1 23.3 -78.2 <NA>
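To make the caveat about a fixed 1-degree window concrete, here is a minimal base R sketch of how much ground one degree of longitude covers at different latitudes. It assumes a spherical Earth with a radius of 6371 km, and the helper name km_per_lon_degree is made up for this illustration.

```r
# Approximate ground distance (km) covered by one degree of longitude
# at a given latitude. Assumption: spherical Earth, radius 6371 km.
km_per_lon_degree <- function(lat_deg, radius_km = 6371) {
  (pi / 180) * radius_km * cos(lat_deg * pi / 180)
}

# Latitudes of the equator, cities A and B from the example, and 60 degrees north
lats <- c(0, 23.4, 43.5, 60)
data.frame(lat = lats, km_per_degree = round(km_per_lon_degree(lats), 1))
```

Under these assumptions a degree of longitude is roughly 111 km at the equator and about half that at 60 degrees latitude, so a lon +/- 1 window is considerably narrower for city B than for city A.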
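Approach 2: based on absolute distance
If you want to set a maximum distance between an event and a city, you need a spatial solution like this: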
#maximum distance between event and city (in metres)
max_dist = 180000
library( sf )
#create simple (point) features of events and cities
cities.sf <- st_as_sf( cities, coords = c("lon", "lat"), crs = 4326 )
events.sf <- st_as_sf( events, coords = c("lon", "lat"), crs = 4326 )
#spatial join
st_join( events.sf, cities.sf, join = st_is_within_distance, dist = max_dist )
# Simple feature collection with 3 features and 2 fields
# geometry type: POINT
# dimension: XY
# bbox: xmin: -78.2 ymin: 23.3 xmax: 101 ymax: 42.5
# CRS: EPSG:4326
# event city geometry
# 1 X1 A POINT (101 24.4)
# 2 Y1 <NA> POINT (100.4 42.5)
# 3 B1 <NA> POINT (-78.2 23.3)
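If installing sf is not an option, the same idea can be sketched in base R with the haversine formula. This is only an illustration under stated assumptions: a spherical Earth of radius 6371 km, the helper name haversine_m is made up here, and plain data frames stand in for the data.tables. Unlike the spatial join above, it keeps only the single nearest city per event.

```r
# Great-circle distance in metres via the haversine formula.
# Assumption: spherical Earth, radius 6,371,000 m.
haversine_m <- function(lat1, lon1, lat2, lon2, radius_m = 6371000) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  2 * radius_m * asin(pmin(1, sqrt(a)))
}

cities <- data.frame(city = c("A", "B", "C"),
                     lat = c(23.4, 43.5, 21.3),
                     lon = c(100, 98.4, -78.2),
                     stringsAsFactors = FALSE)
events <- data.frame(event = c("X1", "Y1", "B1"),
                     lat = c(24.4, 42.5, 23.3),
                     lon = c(101, 100.4, -78.2),
                     stringsAsFactors = FALSE)

# Maximum distance between event and city (in metres)
max_dist <- 180000

# Distance matrix: one row per event, one column per city
d <- outer(seq_len(nrow(events)), seq_len(nrow(cities)),
           function(i, j) haversine_m(events$lat[i], events$lon[i],
                                      cities$lat[j], cities$lon[j]))

# Index of the nearest city for each event, and the distance to it
nearest <- apply(d, 1, which.min)
nearest_dist <- d[cbind(seq_len(nrow(events)), nearest)]

# Keep the city name only when the nearest city is within max_dist
events$city <- ifelse(nearest_dist <= max_dist,
                      cities$city[nearest], NA_character_)
events
```

The full distance matrix grows as events times cities, so this is only reasonable for small inputs; for larger data the sf join above is the better fit.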