R - Loop through df of IDs and Zipcodes to find next closest store (lat/longitude), return list of Store ID, and next closest Store

I have used the zipcode package to look up the latitude and longitude for the zip codes of a bunch of stores.

I am hoping to find a way to loop through the list and, for each of the 5000 stores, find the next closest store based on long/lat.

I currently have this data frame (values removed for this post):

'data.frame':   1206 obs. of  6 variables:
 $ zip      : Factor w/ 1182 levels "86645","43225",..: 1 2 3 4 5 6 7 8 9 10 ...
 $ ID       : int  
 $ city     : chr  
 $ state    : chr  
 $ latitude : num  
 $ longitude: num  

Here is one solution I could come up with:

library(data.table)
library(zipcode)
library(geosphere)
data(zipcode)
set.seed(151)
n <- 100
storeData <- data.table(storeID=sample(1:100000,n,replace = FALSE),zip=sample(zipcode$zip,n,replace = TRUE))
zipcode <- data.table(zipcode,key = "zip")
# Keep only stores whose zip code resolved to both a latitude and a longitude.
storeData <- zipcode[storeData,on="zip"][!is.na(latitude) & !is.na(longitude)]
storeData
# zip             city state latitude  longitude storeID
# 1: 22408   Fredericksburg    VA 38.23602  -77.46111   47945
# 2: 44515       Youngstown    OH 41.09901  -80.74545   86541
# 3: 48112       Belleville    MI 42.23993  -83.15082   77807
# 4: 80154        Englewood    CO 39.73875 -104.40835   53862
# 5: 73766       Pond Creek    OK 36.66271  -97.83063   44166
# 6: 32321          Bristol    FL 30.36007  -84.97668   61377
# 7: 49442         Muskegon    MI 43.23262  -86.19550   45492
# 8: 04537         Boothbay    ME 43.90781  -69.64608   82087
storeDistances <- distm(storeData[,.(longitude,latitude)],storeData[,.(longitude,latitude)])
colnames(storeDistances) <- rownames(storeDistances) <- storeData[,storeID]
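# Mask the diagonal so a store is never returned as its own neighbour.
diag(storeDistances) <- Inf
# ID of the number-th nearest other store per row; order() returns one index even when distances tie.
getClosest <- function(number=1){
  apply(storeDistances,1,function(x) colnames(storeDistances)[order(x)[number]])
}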
storeData[,firstClosest:=getClosest(1)]
storeData[,secondClosest:=getClosest(2)]
storeData[,thirdClosest:=getClosest(3)]
storeData
# zip             city state latitude  longitude storeID firstClosest secondClosest
# 1: 22408   Fredericksburg    VA 38.23602  -77.46111   47945        70091         41024
# 2: 44515       Youngstown    OH 41.09901  -80.74545   86541        10806         78898
# 3: 48112       Belleville    MI 42.23993  -83.15082   77807        25906         94780
# 4: 80154        Englewood    CO 39.73875 -104.40835   53862        22347         91392
# 5: 73766       Pond Creek    OK 36.66271  -97.83063   44166         4816         90090
# 6: 32321          Bristol    FL 30.36007  -84.97668   61377         8187          1937
# 7: 49442         Muskegon    MI 43.23262  -86.19550   45492        95486         97241
# 8: 04537         Boothbay    ME 43.90781  -69.64608   82087        46720          7013
# 
# thirdClosest
# 1:        57562
# 2:        71232
# 3:        86541
# 4:        97986
# 5:          146
# 6:         8113
# 7:         6400
# 8:        10872

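storeDistances is the matrix of distances between every pair of stores. The getClosest function returns, for each store, the ID of its nth closest store.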
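
To see the distance-matrix idea on a case small enough to check by hand, here is a minimal sketch in base R. It is not the code above: the four stores, their IDs, and the helper names haversine and kth_closest are made up for illustration, and a hand-written haversine formula (spherical Earth, radius 6371 km) stands in for geosphere::distm so the snippet runs without extra packages.

```r
# Hypothetical toy data: four stores with made-up IDs and approximate city coordinates
# (New York, Philadelphia, Chicago, Los Angeles).
stores <- data.frame(
  storeID   = c(101, 102, 103, 104),
  latitude  = c(40.7128, 39.9526, 41.8781, 34.0522),
  longitude = c(-74.0060, -75.1652, -87.6298, -118.2437)
)

# Haversine great-circle distance in metres between points given in degrees.
# Assumes a spherical Earth of radius r.
haversine <- function(lon1, lat1, lon2, lat2, r = 6371000) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  2 * r * asin(pmin(1, sqrt(a)))
}

# Full pairwise distance matrix (n x n), built with vectorised index pairs.
n <- nrow(stores)
d <- outer(seq_len(n), seq_len(n), function(i, j) {
  haversine(stores$longitude[i], stores$latitude[i],
            stores$longitude[j], stores$latitude[j])
})
diag(d) <- Inf  # a store never counts as its own neighbour

# ID of the k-th nearest other store for every row of the distance matrix.
# order() returns exactly one index per rank, so ties cannot produce a list.
kth_closest <- function(d, ids, k = 1) {
  ids[apply(d, 1, function(x) order(x)[k])]
}

stores$firstClosest  <- kth_closest(d, stores$storeID, 1)
stores$secondClosest <- kth_closest(d, stores$storeID, 2)
print(stores)
```

The full matrix grows with the square of the number of stores, so for 5000 stores it holds 25 million distances (roughly 200 MB as doubles). That is usually manageable, but it is worth keeping in mind before scaling the same approach much further.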