Map zip codes to their respective city and state in R?

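I have a data frame of zip codes, and I would like to map each zip code to its city and state. So far I have experimented with the zipcode package, but I am not sure whether it can solve this particular problem.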

Here is a sample of the data I have now:

str(all_key$zip)
chr [1:406] "43031" "24517" "43224" "43832" "53022" "60185" "84104" "43081" 
"85226" "85193" "54656" "43215" "94533" "95826" "64804" "49548" "54467" 

The expected output adds city and state columns to every row of the data frame, matching each row's zip code:

 head(all_key)
     zip city  state
1   43031 city1 state1
2   24517 city2 state2
3   43224 city3 state3
4   43832 city4 state4
5   53022 city5 state5
6   60185 city6 state6

Thanks in advance for your help.

Answer updated

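The zipcode package seems to have gone away, so this answer has been updated to show how to add latitude and longitude from an external file. The new answer is at the bottom.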


You can get the data from the zipcode package and then merge it with your own data to look up the city and state.

zip = c("43031", "24517", "43224", "43832", "53022", 
 "60185", "84104", "43081", "85226", "85193", "54656", 
 "43215", "94533", "95826", "64804", "49548", "54467")
ZC = data.frame(zip)

library(zipcode)
data(zipcode)
merge(ZC, zipcode)
     zip           city state latitude  longitude
1  24517      Altavista    VA 37.12754  -79.27409
2  43031      Johnstown    OH 40.15198  -82.66944
3  43081    Westerville    OH 40.10951  -82.91606
4  43215       Columbus    OH 39.96513  -83.00431
5  43224       Columbus    OH 40.03991  -82.96772
6  43832  Newcomerstown    OH 40.27738  -81.59662
7  49548   Grand Rapids    MI 42.86823  -85.66391
8  53022     Germantown    WI 43.21916  -88.12043
9  54467         Plover    WI 44.45228  -89.54399
10 54656         Sparta    WI 43.96977  -90.80796
11 60185   West Chicago    IL 41.89198  -88.20502
12 64804         Joplin    MO 37.04716  -94.51124
13 84104 Salt Lake City    UT 40.75063 -111.94077
14 85193    Casa Grande    AZ 32.86000 -111.83000
15 85226       Chandler    AZ 33.31221 -111.93177
16 94533      Fairfield    CA 38.26958 -122.03701
17 95826     Sacramento    CA 38.55010 -121.37492

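If you need to keep the rows in their original order (merge sorts the result by zip), you can set the row names of the zipcode data to the zip codes and use them to select the rows and columns you want.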

rownames(zipcode) = zipcode$zip
zipcode[zip, 1:3]
        zip           city state
43031 43031      Johnstown    OH
24517 24517      Altavista    VA
43224 43224       Columbus    OH
43832 43832  Newcomerstown    OH
53022 53022     Germantown    WI
60185 60185   West Chicago    IL
84104 84104 Salt Lake City    UT
43081 43081    Westerville    OH
85226 85226       Chandler    AZ
85193 85193    Casa Grande    AZ
54656 54656         Sparta    WI
43215 43215       Columbus    OH
94533 94533      Fairfield    CA
95826 95826     Sacramento    CA
64804 64804         Joplin    MO
49548 49548   Grand Rapids    MI
54467 54467         Plover    WI
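The same order-preserving lookup can be sketched with match(), which also lets you add the new columns directly to the original data frame. This is only a minimal illustration: `lookup` is a small stand-in table built from a few rows of the output above, and "99999" is a made-up code included to show what happens when there is no match.

```r
## Toy lookup table standing in for the full zip code data
## (the three rows are copied from the merge output above)
lookup <- data.frame(
  zip   = c("24517", "43031", "43224"),
  city  = c("Altavista", "Johnstown", "Columbus"),
  state = c("VA", "OH", "OH"),
  stringsAsFactors = FALSE
)

## Data to annotate, in its original order; "99999" is a made-up code with no match
all_key <- data.frame(zip = c("43031", "24517", "43224", "99999"),
                      stringsAsFactors = FALSE)

## match() gives, for each zip in all_key, its row position in lookup (NA if absent)
idx <- match(all_key$zip, lookup$zip)

## Index the lookup columns by position; unmatched zips get NA
all_key$city  <- lookup$city[idx]
all_key$state <- lookup$state[idx]
all_key
```

Because the result is built by indexing, the rows of all_key stay in their original order and zip codes without a match are kept with NA rather than dropped.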


Updated answer

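Since the zipcode package has gone away, this shows how to add the latitude and longitude information from a downloaded data set. The file I am using exists today, but the method should work with other files. See GIS StackExchange for some leads on where to download data.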

## Original Data to match
zip = c("43031", "24517", "43224", "43832", "53022", 
 "60185", "84104", "43081", "85226", "85193", "54656", 
 "43215", "94533", "95826", "64804", "49548", "54467")
ZC = data.frame(zip)

## Download source file, unzip and extract into table
ZipCodeSourceFile = "http://download.geonames.org/export/zip/US.zip"
temp <- tempfile()
download.file(ZipCodeSourceFile , temp)
ZipCodes <- read.table(unz(temp, "US.txt"), sep="\t")
unlink(temp)
names(ZipCodes) = c("CountryCode", "zip", "PlaceName", 
"AdminName1", "AdminCode1", "AdminName2", "AdminCode2", 
"AdminName3", "AdminCode3", "latitude", "longitude", "accuracy")

## merge extra info onto original data
ZC_Info = merge(ZC, ZipCodes[,c(2:6,10:11)])
head(ZC_Info)
    zip     PlaceName AdminName1 AdminCode1 AdminName2 latitude longitude
1 24517     Altavista   Virginia         VA   Campbell  37.1222  -79.2911
2 43031     Johnstown       Ohio         OH    Licking  40.1445  -82.6973
3 43081   Westerville       Ohio         OH   Franklin  40.1146  -82.9105
4 43215      Columbus       Ohio         OH   Franklin  39.9671  -83.0044
5 43224      Columbus       Ohio         OH   Franklin  40.0425  -82.9689
6 43832 Newcomerstown       Ohio         OH Tuscarawas  40.2739  -81.5940
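One thing to watch with this approach: by default read.table converts the zip column to a number, which drops leading zeros, and its default quote handling can be confused by place names that contain an apostrophe. The sketch below shows one way to guard against both. It reads a small temporary file with the same 12-column layout instead of the real download, and both rows in it are made up for illustration.

```r
## Write a tiny tab-separated file with the same 12-column layout.
## Both rows are made-up illustrative values, not real GeoNames records.
tmp <- tempfile(fileext = ".txt")
writeLines(c(
  "US\t00123\tSampletown\tExample State\tEX\tSample County\t001\t\t\t40.1\t-73.2\t4",
  "US\t63999\tO'Example\tExample State\tEX\tOther County\t003\t\t\t38.5\t-90.7\t4"
), tmp)

## quote = ""  : treat apostrophes and double quotes as ordinary characters
## colClasses  : keep the first two columns as text so "00123" keeps its zeros;
##               NA lets R guess the type of the remaining ten columns
ZipCodes <- read.table(tmp, sep = "\t", quote = "",
                       colClasses = c("character", "character", rep(NA, 10)),
                       stringsAsFactors = FALSE)
unlink(tmp)

names(ZipCodes) <- c("CountryCode", "zip", "PlaceName",
  "AdminName1", "AdminCode1", "AdminName2", "AdminCode2",
  "AdminName3", "AdminCode3", "latitude", "longitude", "accuracy")

ZipCodes[, c("zip", "PlaceName", "latitude", "longitude")]
```

With the zip column kept as character, it has the same type as the zip column in ZC, so the merge compares text with text.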

You can still use the "zipcode" package by downloading it from the CRAN archive: https://cran.r-project.org/src/contrib/Archive/zipcode/

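Once you have downloaded the tar.gz file to your computer, you can install it from the Packages pane of the RStudio GUI. After clicking "Install", change the "Install from" option to "Package Archive File" and point it to the downloaded tar.gz file.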
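If you prefer the console to the GUI, the same installation can probably be done with install.packages(). The sketch below wraps it in a small helper; `install_from_tarball` is a hypothetical name, and the file name in the commented example is a placeholder for whichever version you downloaded from the archive.

```r
## Hypothetical helper: install a source package from a downloaded tar.gz file
install_from_tarball <- function(path) {
  ## repos = NULL tells install.packages() that `path` is a package file,
  ## not the name of a package to look up in a repository
  install.packages(path, repos = NULL, type = "source")
}

## Example call (not run); replace the placeholder with your actual file
## install_from_tarball("~/Downloads/zipcode_x.y.tar.gz")
```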

Install and use the usa package, also described here, which includes the tibble from the archived zipcode package (zips and lats/longs).

library(usa)
zcs <- usa::zipcodes
head(zcs)

# A tibble: 6 x 5
  zip   city       state   lat  long
  <chr> <chr>      <chr> <dbl> <dbl>
1 00210 Portsmouth NH     43.0 -71.0
2 00211 Portsmouth NH     43.0 -71.0
3 00212 Portsmouth NH     43.0 -71.0
4 00213 Portsmouth NH     43.0 -71.0
5 00214 Portsmouth NH     43.0 -71.0
6 00215 Portsmouth NH     43.0 -71.0

You can use the data frame that ships with the R package zipcodeR.

To add city and state to your data frame, select the variables you want from the data frame provided by zipcodeR (called zip_code_db), then join the result to your data frame:

library(dplyr)
library(zipcodeR)

zip_code_db_selected =
  zip_code_db %>% 
  select(zipcode, major_city, state)

all_key_with_city_st = 
  left_join(all_key, zip_code_db_selected, by = c("zip" = "zipcode"))
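A left join keeps zip codes that have no match and fills their city and state with NA, so it can be worth checking for them afterwards. The sketch below illustrates the check without any package dependencies: base R's merge(..., all.x = TRUE) stands in for left_join, `lookup` is a two-row stand-in for zip_code_db, and "00000" is a made-up code with no match.

```r
## Two-row stand-in for zip_code_db (rows taken from the outputs shown earlier)
lookup <- data.frame(
  zipcode    = c("43031", "24517"),
  major_city = c("Johnstown", "Altavista"),
  state      = c("OH", "VA"),
  stringsAsFactors = FALSE
)

## Data to annotate; "00000" is a made-up code that is not in the lookup
all_key <- data.frame(zip = c("43031", "00000", "24517"),
                      stringsAsFactors = FALSE)

## all.x = TRUE keeps every row of all_key, like a left join
## (note that merge() sorts the result by zip)
joined <- merge(all_key, lookup, by.x = "zip", by.y = "zipcode", all.x = TRUE)

## Zip codes that found no city are the ones to investigate
unmatched <- joined$zip[is.na(joined$major_city)]
unmatched
```

The same check on the dplyr result would be a filter on is.na(major_city).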