How to get the country name from R geocode

I am using the ggmap library to get country names from a list of addresses.

But it does not work as expected (at least not as described here):

library(dplyr)  # for glimpse()
r <- ggmap::geocode(c('harvard university', 'the vatican'), output = 'more')
glimpse(r)

Result:

Rows: 2
Columns: 9
$ lon     <dbl> -71.117, 12.453
$ lat     <dbl> 42.377, 41.903
$ type    <chr> "establishment", "country"
$ loctype <chr> "geometric_center", "approximate"
$ address <chr> "cambridge, ma, usa", "00120, vatican city"
$ north   <dbl> 42.378, 41.907
$ south   <dbl> 42.376, 41.900
$ east    <dbl> -71.115, 12.458
$ west    <dbl> -71.118, 12.446

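How can I get the country name in this case?

Update:

If I pass output = 'all', it returns a long nested list that has the country name inside address_components. What is the most efficient way to retrieve it?

Code: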

r<-ggmap::geocode(c('harvard university', 'the vatican'), output = 'all')
glimpse(r)

Result:

> str(r)
tibble [1 × 2] (S3: tbl_df/tbl/data.frame)
 $ results:List of 1
  ..$ :List of 5
  .. ..$ address_components:List of 3
  .. .. ..$ :List of 3
  .. .. .. ..$ long_name : chr "New York"
  .. .. .. ..$ short_name: chr "New York"
  .. .. .. ..$ types     :List of 2
  .. .. .. .. ..$ : chr "locality"
  .. .. .. .. ..$ : chr "political"
  .. .. ..$ :List of 3
  .. .. .. ..$ long_name : chr "New York"
  .. .. .. ..$ short_name: chr "NY"
  .. .. .. ..$ types     :List of 2
  .. .. .. .. ..$ : chr "administrative_area_level_1"
  .. .. .. .. ..$ : chr "political"
  .. .. ..$ :List of 3
  .. .. .. ..$ long_name : chr "United States"
  .. .. .. ..$ short_name: chr "US"
  .. .. .. ..$ types     :List of 2
  .. .. .. .. ..$ : chr "country"
  .. .. .. .. ..$ : chr "political"
  .. ..$ formatted_address : chr "New York, NY, USA"
  .. ..$ geometry          :List of 4
  .. .. ..$ bounds       :List of 2
  .. .. .. ..$ northeast:List of 2
  .. .. .. .. ..$ lat: num 40.9
  .. .. .. .. ..$ lng: num -73.7
  .. .. .. ..$ southwest:List of 2
  .. .. .. .. ..$ lat: num 40.5
  .. .. .. .. ..$ lng: num -74.3
  .. .. ..$ location     :List of 2
  .. .. .. ..$ lat: num 40.7
  .. .. .. ..$ lng: num -74
  .. .. ..$ location_type: chr "APPROXIMATE"
  .. .. ..$ viewport     :List of 2
  .. .. .. ..$ northeast:List of 2
  .. .. .. .. ..$ lat: num 40.9
  .. .. .. .. ..$ lng: num -73.7
  .. .. .. ..$ southwest:List of 2
  .. .. .. .. ..$ lat: num 40.5
  .. .. .. .. ..$ lng: num -74.3
  .. ..$ place_id          : chr "ChIJOwg_06VPwokRYv534QaPC8g"
  .. ..$ types             :List of 2
  .. .. ..$ : chr "locality"
  .. .. ..$ : chr "political"
 $ status : chr "OK"

OK, this is a really bad approach, but I came up with a very ugly way that does work:


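library(dplyr)  # provides %>% and as_tibble()

# Geocode the location to lon/lat, then reverse-geocode those
# coordinates and keep only the country.
# api_key must be defined beforehand and hold a Google API key.
loc2country <- function(loc) {
  latlon <-
    ggmap::geocode(location = c(loc),
                   output = "latlon",
                   source = "google") %>% as_tibble()
  revgeo::revgeo(
    longitude = latlon$lon,
    latitude = latlon$lat,
    provider = 'google',
    API = api_key,
    output = 'hash',
    item = 'country'
  )$country
}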

loc2country('Vienna')

returns:

[1] "Austria"

I think the output = 'more' result is easier to extract from. The last item of the address is (probably) always the country.

r<-ggmap::geocode(c('harvard university', 'the vatican'), output = 'more') %>% 
  mutate(
    country = stringr::str_extract(address, "(?<=, )[^,]*$") # everything after last comma
  )


# # A tibble: 2 x 10
#    lon   lat type          loctype          address             north south  east  west country     
#   <dbl> <dbl> <chr>         <chr>            <chr>               <dbl> <dbl> <dbl> <dbl> <chr>       
# 1 -71.1  42.4 establishment geometric_center cambridge, ma, usa   42.4  42.4 -71.1 -71.1 usa         
# 2  12.5  41.9 country       approximate      00120, vatican city  41.9  41.9  12.5  12.4 vatican city
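The same "take everything after the last comma" idea can be sketched in base R on plain strings, which makes it easy to try without an API key. The first two addresses are copied from the output above; "antarctica" is a made-up input with no comma, and last_address_part() is a hypothetical helper name, not part of ggmap. Note that this only returns whatever the address happens to end with (here the abbreviation "usa"), so treat it as a rough heuristic.

```r
# First two addresses copied from the geocode(..., output = 'more') result;
# the third is an invented example with no comma in it.
address <- c("cambridge, ma, usa", "00120, vatican city", "antarctica")

# last_address_part() is an illustrative helper, not a ggmap function.
last_address_part <- function(x) {
  # Greedy "^.*," consumes everything up to and including the last comma
  out <- trimws(sub("^.*,", "", x))
  # Without a comma there is nothing to split on, so report NA
  out[!grepl(",", x, fixed = TRUE)] <- NA_character_
  out
}

last_address_part(address)
# [1] "usa"          "vatican city" NA
```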

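Alternatively, you can use purrr::pluck to extract the country from the deeply nested list, though you may need to do some work to find the right index within 'address_components', since it appears to vary.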

library(ggmap)
library(purrr)

r<-ggmap::geocode(c('harvard university', 'the vatican'), output = 'all')

# base r
r[[1]][[1]][[1]]$address_components[[3]]$long_name  # USA
r[[2]][[1]][[1]]$address_components[[1]]$long_name  # Vatican City

# purrr
purrr::pluck(r, 1, 1, 1, "address_components", 3, "long_name")
purrr::pluck(r, 2, 1, 1, "address_components", 1, "long_name")

# Result: 
# usa
# vatican city

Edit: here is a more robust way to extract the country name.

sapply(r, function(x) {
  # Address components of the first (best) match
  components <- x$results[[1]]$address_components

  # Return the long name of the first component tagged 'country'
  for (component in components) {
    if ('country' %in% unlist(component$types)) return(component$long_name)
  }
  NA_character_  # no 'country' component found
})
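For comparison, a type-based lookup can also be written with purrr. The sketch below runs against a hand-built list that mimics the address_components structure from the str() output in the question, so it needs no API key. Both mock_result and get_country() are hypothetical names introduced for illustration, and a real geocode(..., output = 'all') result is only assumed to have this shape.

```r
library(purrr)

# Hypothetical stand-in for one geocoding result, hand-built to mirror
# the str() output shown in the question.
mock_result <- list(
  results = list(list(
    address_components = list(
      list(long_name = "New York", short_name = "New York",
           types = list("locality", "political")),
      list(long_name = "New York", short_name = "NY",
           types = list("administrative_area_level_1", "political")),
      list(long_name = "United States", short_name = "US",
           types = list("country", "political"))
    )
  )),
  status = "OK"
)

# get_country() is an illustrative helper, not part of ggmap or purrr.
get_country <- function(x) {
  components <- pluck(x, "results", 1, "address_components")
  # First component whose types include "country" (NULL if there is none)
  hit <- detect(components, ~ "country" %in% unlist(.x$types))
  # Fall back to NA when no such component exists
  pluck(hit, "long_name", .default = NA_character_)
}

get_country(mock_result)
# [1] "United States"
```

Because the component is found by its type rather than by position, it does not matter whether the country is the first or the third entry. For a list of several results shaped like mock_result, something like map_chr(results, get_country) should give one country per element.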