Pandas: Find nearest lat-long pair

I have two data frames:

df1

  Address        lat         lon
 store_12  30.375745  -87.679788
store_132  33.382099 -111.964918
store_134  32.374632 -111.100671
 store_31  34.215678 -119.065539
 store_23  33.126252 -117.321188

df2

   loc      lat       lon
geo123  59.5119 -139.6711
geo134  66.9161 -151.5089
geo154  65.3700 -146.5900
geo112  64.7408 -156.8756
geo342  62.9575 -155.6103
geo543  66.9500 -150.6700

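I have written my code using the haversine formula, but it computes the distance between every possible pair, whereas I only need the pair with the minimum distance.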

import pandas as pd
from math import cos, asin, sqrt

d1 = {'Address':['store_12', 'store_132', 'store_134', 'store_31' ,'store_23'], 'lat':[30.3757446, 33.3820989, 32.3746316, 34.2156779,33.1262516], 'lon':[-87.6797877,-111.964918, -111.1006705, -119.0655388, -117.3211879]}

d2 = {'loc':['geo123', 'geo134', 'geo154', 'geo112' ,'geo342','geo543'], 'lat':[59.5119, 66.9161, 65.37, 64.7408,62.9575,66.95], 'lon':[-139.6711,-151.5089, -146.59,-156.8756, -155.6103, -150.67]}

df1 = pd.DataFrame(d1)
df2 = pd.DataFrame(d2)

def distance(lat1, lon1, lat2, lon2):
    # Haversine great-circle distance in km between two points given in degrees
    p = 0.017453292519943295  # pi / 180, converts degrees to radians
    hav = 0.5 - cos((lat2-lat1)*p)/2 + cos(lat1*p)*cos(lat2*p) * (1-cos((lon2-lon1)*p)) / 2
    return 12742 * asin(sqrt(hav))  # 12742 km = 2 * Earth's mean radius (6371 km)


# Collect one record per (store, location) pair.
# DataFrame.append was removed in pandas 2.0, so build the frame from a list instead.
rows = []
for _, row1 in df1.iterrows():
    for _, row2 in df2.iterrows():
        dist = distance(row1['lat'], row1['lon'], row2['lat'], row2['lon'])
        rows.append({"Store": row1['Address'], "Location": row2['loc'], "Distance": dist})

new_df = pd.DataFrame(rows, columns=["Store", "Location", "Distance"])
print(new_df)

How do I get, for each location in df1, the closest location in df2 (the pair with the minimum distance)?

You can do it like this:

new_df.loc[new_df.groupby('Store').Distance.idxmin()]

You first group by Store and get the index of the minimum Distance within each group, then select the corresponding rows of the data frame.
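To see the group-then-idxmin step in isolation, here is a minimal sketch on a made-up frame. The stores, locations and distances below are hypothetical and serve only to show which rows get selected.

```python
import pandas as pd

# Hypothetical distances, used only to illustrate the selection step
toy = pd.DataFrame({
    "Store": ["a", "a", "b", "b"],
    "Location": ["x", "y", "x", "y"],
    "Distance": [5.0, 2.0, 1.0, 3.0],
})

# Row label of the smallest Distance within each Store group
idx = toy.groupby("Store")["Distance"].idxmin()

# Look those labels up in the original frame to recover the full rows
nearest = toy.loc[idx]
print(nearest)
```

Because idxmin returns row labels rather than values, the other columns (here Location) come along when the labels are passed to .loc.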


One way, using pandas.merge with how="cross" (available since pandas 1.2):

new_df = df1.merge(df2, how="cross")
new_df["distance"] = new_df.filter(like="_").apply(lambda x: distance(*x), axis=1)
new_df.nsmallest(1, columns="distance")

Output:

     Address      lat_x       lon_x     loc    lat_y     lon_y     distance
18  store_31  34.215678 -119.065539  geo123  59.5119 -139.6711  3189.647959
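Note that nsmallest(1) returns only the single closest pair overall. If the goal is the nearest df2 location for each store, one possible alternative is to compute the whole distance matrix with NumPy broadcasting and take the row-wise argmin. This is only a sketch and not part of the answers above: haversine_matrix and radius_km are names introduced here for illustration, and the radius of 6371 km is assumed to match the 12742 km diameter used in the question.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({
    "Address": ["store_12", "store_132", "store_134", "store_31", "store_23"],
    "lat": [30.3757446, 33.3820989, 32.3746316, 34.2156779, 33.1262516],
    "lon": [-87.6797877, -111.964918, -111.1006705, -119.0655388, -117.3211879],
})
df2 = pd.DataFrame({
    "loc": ["geo123", "geo134", "geo154", "geo112", "geo342", "geo543"],
    "lat": [59.5119, 66.9161, 65.37, 64.7408, 62.9575, 66.95],
    "lon": [-139.6711, -151.5089, -146.59, -156.8756, -155.6103, -150.67],
})

def haversine_matrix(lat1, lon1, lat2, lon2, radius_km=6371.0):
    """Return an (n, m) array of great-circle distances in km (hypothetical helper)."""
    lat1, lon1, lat2, lon2 = (
        np.radians(np.asarray(a, dtype=float)) for a in (lat1, lon1, lat2, lon2)
    )
    # Broadcast the n points of the first set against the m points of the second
    dlat = lat2[None, :] - lat1[:, None]
    dlon = lon2[None, :] - lon1[:, None]
    hav = (np.sin(dlat / 2) ** 2
           + np.cos(lat1)[:, None] * np.cos(lat2)[None, :] * np.sin(dlon / 2) ** 2)
    return 2 * radius_km * np.arcsin(np.sqrt(hav))

dist = haversine_matrix(df1["lat"], df1["lon"], df2["lat"], df2["lon"])

# For every store (row), the column index of the closest df2 location
nearest_idx = dist.argmin(axis=1)

result = pd.DataFrame({
    "Store": df1["Address"].to_numpy(),
    "Location": df2["loc"].to_numpy()[nearest_idx],
    "Distance": dist[np.arange(len(df1)), nearest_idx],
})
print(result)
```

This still evaluates every pair, so memory grows with len(df1) * len(df2), but it avoids Python-level loops and a row-wise apply.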