R: Data Quality Check: Zip Code matching the City
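Can anyone help me implement an idea in R?
I would like R to take an input file, for example a list of companies and their addresses, and check whether each company's zip code fits its city. I have a list of all cities and zip codes for a given country. How can I use that list in an if statement?
Has anyone programmed something like this before?
Thanks for your help!
Sandra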
Here is just a simple example of what can be done. Fuzzy matching on the city names would probably work better, though.
# City codes (all city codes can be found at https://www.allareacodes.com/)
my_city_codes <- data.frame(code = c(201:206),
                            cities = c("Jersey City, NJ", "District of Columbia", "Bridgeport, CT",
                                       "Manitoba", "Birmingham, AL", "Seattle, WA"),
                            stringsAsFactors = FALSE)
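# Function for checking whether a city/city-code pair matches the registry
adress_checker <- function(adress, citycodes) {
  # Look up the registered city for this code (use the argument, not the global)
  real_city <- citycodes$cities[citycodes$code == adress$code]
  # An unknown code gives a zero-length result, which would make if() fail
  if (length(real_city) == 0) {
    return("Unknown code")
  }
  # Checking if cities are the same
  if (real_city[1] == adress$city) {
    return("Correct city")
  } else {
    return("Incorrect city")
  }
}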
# Addresses to check
right_city <- data.frame(code = 205, city = c("Birmingham, AL"), stringsAsFactors = FALSE)
wrong_city <- data.frame(code = 205, city = c("Las Vegas"), stringsAsFactors = FALSE)
# Testing function
adress_checker(right_city, my_city_codes)
[1] "Correct city"
adress_checker(wrong_city, my_city_codes)
[1] "Incorrect city"
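To check a whole input file at once and allow for small typos in the city names, something along these lines might work. This is only a sketch under some assumptions: `check_addresses`, `registry`, `companies` and the `max_dist` threshold are made-up names and values for illustration, and the column names follow the toy example above. It uses `match()` for the lookup and `adist()` (edit distance, from the `utils` package that ships with R) for the fuzzy part.

```r
# Registry of codes and cities (same toy data as in the example above)
registry <- data.frame(
  code = 201:206,
  cities = c("Jersey City, NJ", "District of Columbia", "Bridgeport, CT",
             "Manitoba", "Birmingham, AL", "Seattle, WA"),
  stringsAsFactors = FALSE
)

# Hypothetical input: one row per company, as it might come from read.csv()
companies <- data.frame(
  company = c("Acme", "Globex", "Initech", "Umbrella"),
  code = c(205, 205, 206, 999),
  city = c("Birmingham, AL", "Las Vegas", "Seatle, WA", "Nowhere"),
  stringsAsFactors = FALSE
)

# check_addresses() is a hypothetical helper, not part of any package.
# max_dist is an assumed tolerance: the number of single-character edits
# that still counts as a "close match".
check_addresses <- function(addresses, citycodes, max_dist = 2) {
  # match() returns the registry row for each code, or NA for unknown codes
  idx <- match(addresses$code, citycodes$code)
  expected <- citycodes$cities[idx]

  # Pairwise edit distance between the given and the expected city name
  dist <- mapply(
    function(given, wanted) {
      if (is.na(wanted)) NA_real_ else adist(given, wanted, ignore.case = TRUE)[1, 1]
    },
    addresses$city, expected,
    USE.NAMES = FALSE
  )

  # Classify every row in one vectorised step instead of a loop of if statements
  status <- ifelse(is.na(expected), "unknown code",
            ifelse(dist == 0, "match",
            ifelse(dist <= max_dist, "close match", "mismatch")))

  addresses$expected_city <- expected
  addresses$status <- status
  addresses
}

result <- check_addresses(companies, registry)
print(result)
```

The rows flagged as "close match" are the ones worth reviewing by hand. The threshold of 2 is arbitrary and would need tuning on real data, since short city names can be within two edits of each other.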