How to get a country name from geocode in R
I am using the ggmap library to get country names from a list of addresses, but it does not work as expected (at least not as described here).
r <- ggmap::geocode(c('harvard university', 'the vatican'), output = 'more')
dplyr::glimpse(r)
Result:
Rows: 2
Columns: 9
$ lon <dbl> -71.117, 12.453
$ lat <dbl> 42.377, 41.903
$ type <chr> "establishment", "country"
$ loctype <chr> "geometric_center", "approximate"
$ address <chr> "cambridge, ma, usa", "00120, vatican city"
$ north <dbl> 42.378, 41.907
$ south <dbl> 42.376, 41.900
$ east <dbl> -71.115, 12.458
$ west <dbl> -71.118, 12.446
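How can I get the country name in this case?
Update:
If I pass output = 'all', it returns a long nested list that contains the country name inside address_components. What is the most efficient way to retrieve it?
Code: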
r <- ggmap::geocode(c('harvard university', 'the vatican'), output = 'all')
dplyr::glimpse(r)
Result:
> str(r )
tibble [1 × 2] (S3: tbl_df/tbl/data.frame)
$ results:List of 1
..$ :List of 5
.. ..$ address_components:List of 3
.. .. ..$ :List of 3
.. .. .. ..$ long_name : chr "New York"
.. .. .. ..$ short_name: chr "New York"
.. .. .. ..$ types :List of 2
.. .. .. .. ..$ : chr "locality"
.. .. .. .. ..$ : chr "political"
.. .. ..$ :List of 3
.. .. .. ..$ long_name : chr "New York"
.. .. .. ..$ short_name: chr "NY"
.. .. .. ..$ types :List of 2
.. .. .. .. ..$ : chr "administrative_area_level_1"
.. .. .. .. ..$ : chr "political"
.. .. ..$ :List of 3
.. .. .. ..$ long_name : chr "United States"
.. .. .. ..$ short_name: chr "US"
.. .. .. ..$ types :List of 2
.. .. .. .. ..$ : chr "country"
.. .. .. .. ..$ : chr "political"
.. ..$ formatted_address : chr "New York, NY, USA"
.. ..$ geometry :List of 4
.. .. ..$ bounds :List of 2
.. .. .. ..$ northeast:List of 2
.. .. .. .. ..$ lat: num 40.9
.. .. .. .. ..$ lng: num -73.7
.. .. .. ..$ southwest:List of 2
.. .. .. .. ..$ lat: num 40.5
.. .. .. .. ..$ lng: num -74.3
.. .. ..$ location :List of 2
.. .. .. ..$ lat: num 40.7
.. .. .. ..$ lng: num -74
.. .. ..$ location_type: chr "APPROXIMATE"
.. .. ..$ viewport :List of 2
.. .. .. ..$ northeast:List of 2
.. .. .. .. ..$ lat: num 40.9
.. .. .. .. ..$ lng: num -73.7
.. .. .. ..$ southwest:List of 2
.. .. .. .. ..$ lat: num 40.5
.. .. .. .. ..$ lng: num -74.3
.. ..$ place_id : chr "ChIJOwg_06VPwokRYv534QaPC8g"
.. ..$ types :List of 2
.. .. ..$ : chr "locality"
.. .. ..$ : chr "political"
$ status : chr "OK"
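OK, this is a really bad approach, but I came up with a very ugly way that works:
library(dplyr)  # provides %>% and as_tibble()

# api_key must hold a valid Google Maps API key
loc2country <- function(loc) {
  # Forward geocode: place name -> longitude/latitude
  latlon <-
    ggmap::geocode(location = c(loc),
                   output = "latlon",
                   source = "google") %>% as_tibble()
  # Reverse geocode: coordinates -> country name
  revgeo::revgeo(
    longitude = latlon$lon,
    latitude = latlon$lat,
    provider = 'google',
    API = api_key,
    output = 'hash',
    item = 'country'
  )$country
}
loc2country('Vienna')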
returns:
[1] "Austria"
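I think the output = 'more' option is easier to extract from. The last item of the address is (probably) always the country.
library(dplyr)  # provides %>% and mutate()
r <- ggmap::geocode(c('harvard university', 'the vatican'), output = 'more') %>%
  mutate(
    country = stringr::str_extract(address, "(?<=, )[^,]*$") # everything after last comma
  )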
# # A tibble: 2 x 10
# lon lat type loctype address north south east west country
# <dbl> <dbl> <chr> <chr> <chr> <dbl> <dbl> <dbl> <dbl> <chr>
# 1 -71.1 42.4 establishment geometric_center cambridge, ma, usa 42.4 42.4 -71.1 -71.1 usa
# 2 12.5 41.9 country approximate 00120, vatican city 41.9 41.9 12.5 12.4 vatican city
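One caveat with the pattern above: the lookbehind requires a preceding ", ", so an address with no comma at all would give NA. Below is a minimal base R sketch of the same "last piece" idea that also handles that case. The third address ("austria") is a made-up value added for illustration, not real geocode output, and last_piece is a hypothetical helper name.

```r
# Sample `address` values: the first two are copied from the output above,
# the third is a hypothetical address that contains no comma
address <- c("cambridge, ma, usa", "00120, vatican city", "austria")

# Split one address on commas, trim whitespace, and keep the last piece
last_piece <- function(x) {
  parts <- trimws(strsplit(x, ",", fixed = TRUE)[[1]])
  parts[length(parts)]
}

country <- vapply(address, last_piece, character(1), USE.NAMES = FALSE)
country
# [1] "usa"          "vatican city" "austria"
```

This still relies on the assumption that the country comes last in the formatted address, so it is only as reliable as that assumption.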
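Alternatively, you can use purrr::pluck to extract 'country' from the deeply nested list, though you may need to do some work to find the right index within 'address_components', since it appears to vary between results.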
library(ggmap)
library(purrr)
r<-ggmap::geocode(c('harvard university', 'the vatican'), output = 'all')
# base r
r[[1]][[1]][[1]]$address_components[[3]]$long_name # USA
r[[2]][[1]][[1]]$address_components[[1]]$long_name # Vatican City
# purrr
purrr::pluck(r, 1, 1, 1, "address_components", 3, "long_name")
purrr::pluck(r, 2, 1, 1, "address_components", 1, "long_name")
# Result:
# usa
# vatican city
Edit: here is a more reliable way to extract the country name.
sapply(r, function(x) {
  # Address components of the first (best) match
  components <- x$results[[1]]$address_components
  # Return the long name of the first component tagged 'country'
  for (component in components) {
    if ('country' %in% unlist(component$types)) return(component$long_name)
  }
  NA_character_ # no country component found
})
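To try the lookup-by-type idea without calling the API, here is a small sketch that runs on a hand-built list. The mock object only imitates the shape of the output = 'all' result shown in the question (one element per location, each with results and status). Its contents, the component() helper, and the get_country() name are all assumptions made for illustration.

```r
# Hypothetical helper that builds one address component
component <- function(long, short, types) {
  list(long_name = long, short_name = short, types = as.list(types))
}

# Hand-built stand-in for geocode(..., output = "all"); no API call is made
mock <- list(
  list(
    results = list(list(address_components = list(
      component("Cambridge", "Cambridge", c("locality", "political")),
      component("Massachusetts", "MA", c("administrative_area_level_1", "political")),
      component("United States", "US", c("country", "political"))
    ))),
    status = "OK"
  ),
  list(
    results = list(list(address_components = list(
      component("Vatican City", "VA", c("country", "political"))
    ))),
    status = "OK"
  ),
  # Hypothetical failed lookup with no results
  list(results = list(), status = "ZERO_RESULTS")
)

# Return the long name of the component tagged "country", or NA if absent
get_country <- function(x) {
  if (length(x$results) == 0) return(NA_character_)
  components <- x$results[[1]]$address_components
  is_country <- vapply(
    components,
    function(cmp) "country" %in% unlist(cmp$types),
    logical(1)
  )
  if (!any(is_country)) return(NA_character_)
  components[[which(is_country)[1]]]$long_name
}

countries <- vapply(mock, get_country, character(1))
countries
# [1] "United States" "Vatican City"  NA
```

Using vapply with character(1) keeps the result a plain character vector even when a lookup fails, whereas sapply would fall back to a list if any element came back NULL.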