Extract subnational division from string and convert it into country name in R
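I have a series of strings that contain only province/subnational division names, and I want to convert them into a vector of country names in R. Extracting country names with the countrycode package is relatively easy, but I don't see a way to use that package to convert province names to countries.
For example: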
provinces <- c("The governor of Florida", "The Premier of Ontario", "Jalisco has a province-wide policy")
I'd like a way to convert the provinces vector into something like c("United States of America", "Canada", "Mexico").
From the comments above, I realized that you can use a custom dictionary with countrycode, which lets you incorporate subnational data.
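To illustrate what a dictionary-based lookup does, here is a minimal base R sketch of the idea. It is not how countrycode is implemented. The `region_dict` table and the `region_to_country` helper are hypothetical names made up for this example, and the three rows are hand-typed rather than taken from any package.

```r
# Hypothetical hand-made dictionary: one row per subnational region.
region_dict <- data.frame(
  region  = c("florida", "ontario", "jalisco"),
  country = c("United States of America", "Canada", "Mexico"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: return the country of the first dictionary region
# found in each string, or NA when no region matches.
region_to_country <- function(x, dict) {
  vapply(x, function(s) {
    # Case-insensitive, whole-word match of every region name against s
    hit <- vapply(dict$region, function(r) {
      grepl(paste0("\\b", r, "\\b"), s, ignore.case = TRUE, perl = TRUE)
    }, logical(1))
    if (any(hit)) dict$country[which(hit)[1]] else NA_character_
  }, character(1), USE.NAMES = FALSE)
}

provinces <- c("The governor of Florida", "The Premier of Ontario",
               "Jalisco has a province-wide policy")
countries <- region_to_country(provinces, region_dict)
```

The sketch treats each region name as a regular expression, so a name containing regex metacharacters would need escaping first. When several regions match one string, it keeps only the first match in dictionary order.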
Edit:
Here is a fully reproducible example, since the previous example did not work completely:
require(countrycode)
require(choroplethrAdmin1)
# example data
provinces <- c("The governor of Florida", "Tim Stevenson leads Oxfordshire", "Gobierno del Estado de Hidalgo")
# remove punctuation
provinces <- gsub("[[:punct:]\n]", "", provinces)
# load administrative division dictionary
data(admin1.regions)
# remove duplicate region names (countrycode function only accepts unique names)
admin1.regions <- admin1.regions[!duplicated(admin1.regions$region),]
# convert provinces to country
provinces_to_country <- countrycode(provinces, "region", "country", custom_dict = admin1.regions, origin_regex = TRUE)
Old, non-reproducible example:
require(countrycode)
require(choroplethrAdmin1)
# example data
provinces <- c("The governor of Florida", "The Premier of Ontario", "Jalisco has a province-wide policy")
# remove punctuation
provinces <- gsub("[[:punct:]\n]", "", provinces)
# load administrative division dictionary
data(admin1.regions)
# remove duplicate region names (countrycode function only accepts unique names)
admin1.regions <- admin1.regions[!duplicated(admin1.regions$region),]
# convert provinces to country
provinces_to_country <- countrycode(provinces, "region", "country", custom_dict = admin1.regions, origin_regex = TRUE)