How to compute haversine distances on pandas DataFrames using the haversine library
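Here is how I use the haversine library to calculate the distance between two points:
import haversine as hs
# haversine expects each point as a (lat, lon) tuple in degrees
hs.haversine((-1.94091666666667, 106.11333888888888), (5.204783, 96.698661))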
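And here is how to calculate haversine distances using sklearn:
from sklearn.metrics.pairwise import haversine_distances
import numpy as np
import pandas as pd
# haversine_distances expects [lat, lon] in radians
radian_1 = np.radians(df1[['lat','lon']])
radian_2 = np.radians(df2[['lat','lon']])
# multiply by the Earth's radius (6371 km) to get kilometres
D = pd.DataFrame(haversine_distances(radian_1,radian_2)*6371,index=df1.index, columns=df2.index)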
What I need is to do the same thing, but using the haversine library instead of sklearn.metrics.pairwise.
Here is my dataset df1
index lon lat
0 0 107.071969 -6.347778
1 1 110.431361 -7.773489
2 2 111.978469 -8.065442
and dataset df2
index lon lat
5 5 112.340919 -7.520442
6 6 107.179119 -6.291131
7 7 106.807442 -6.437383
Here is the expected output
5 6 7
0 596.019968 13.413123 30.882602
1 212.317223 394.942014 426.564799
2 72.573637 565.020998 598.409848
Following the documentation and example: sklearn.metrics.haversine
result = haversine_distances(np.radians(df_1[["lat","lon"]]), np.radians(df_2[["lat", "lon"]])) * 6371000/1000
result_df = pd.DataFrame(result, index = df_1["index"], columns=df_2["index"])
index           5           6           7
index
0      596.019968   13.413123   30.882602
1      212.317223  394.942014  426.564799
2       72.573637  565.020998  598.409848
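You first need to convert the latitudes and longitudes to radians. haversine_distances then returns distances on the unit sphere, so you need to multiply the result by the Earth's radius (about 6371 km) to get the distances in kilometres.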
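The two steps described above (degrees to radians, then central angle times Earth radius) can be sketched without sklearn. This is only an illustration: `pairwise_haversine_km` is a hypothetical helper written for this example, the data frames are rebuilt from the question's data, and the radius of 6371 km is the value used in the question.

```python
import numpy as np
import pandas as pd

EARTH_RADIUS_KM = 6371.0  # radius used in the question (assumption: kilometres wanted)

# Data sets from the question, rebuilt here so the example is self-contained.
df1 = pd.DataFrame(
    {"lon": [107.071969, 110.431361, 111.978469],
     "lat": [-6.347778, -7.773489, -8.065442]},
    index=[0, 1, 2],
)
df2 = pd.DataFrame(
    {"lon": [112.340919, 107.179119, 106.807442],
     "lat": [-7.520442, -6.291131, -6.437383]},
    index=[5, 6, 7],
)


def pairwise_haversine_km(a, b, radius_km=EARTH_RADIUS_KM):
    """Hypothetical helper: distances between every row of a and every row of b."""
    # Step 1: degrees -> radians. a becomes a column and b a row,
    # so NumPy broadcasting produces a len(a) x len(b) grid.
    lat1 = np.radians(a["lat"].to_numpy())[:, None]
    lon1 = np.radians(a["lon"].to_numpy())[:, None]
    lat2 = np.radians(b["lat"].to_numpy())[None, :]
    lon2 = np.radians(b["lon"].to_numpy())[None, :]
    # Step 2: the haversine formula yields the central angle in radians.
    h = (np.sin((lat2 - lat1) / 2) ** 2
         + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2) ** 2)
    angle = 2 * np.arcsin(np.sqrt(h))
    # Step 3: central angle times radius gives the distance.
    return pd.DataFrame(angle * radius_km, index=a.index, columns=b.index)


distances = pairwise_haversine_km(df1, df2)
print(distances.round(6))
```

The row and column labels come straight from `df1.index` and `df2.index`, so the result is laid out like the expected output in the question.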
You can use itertools.product to build every pair of points and then apply haversine to each pair, like this:
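import haversine as hs
import pandas as pd
import numpy as np
import itertools
# hs.haversine expects (lat, lon) tuples, so select the columns in that order
pts1 = df1[['lat', 'lon']].values
pts2 = df2[['lat', 'lon']].values
res = []
for a, b in itertools.product(pts1, pts2):
    res.append(hs.haversine(a, b))
# one row per point in df1, one column per point in df2
df = pd.DataFrame(np.asarray(res).reshape(len(pts1), len(pts2)), index=df1.index, columns=df2.index)
print(df)
Output as originally posted. It was computed with the points in (lon, lat) order, which is why it differs from the expected output:
0 1 2
0 587.500555 12.058061 29.557005
1 212.580742 365.487782 405.718803
2 46.333180 537.684789 578.072579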
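The matrix printed above does not match the expected output in the question: for df1 row 0 and df2 row 6 it shows 12.058061 rather than 13.413123. Argument order appears to explain the gap, since the haversine library reads each point as (lat, lon) while the data set columns are ordered lon, lat. The sketch below checks this for that one pair. `haversine_km` is a stand-in written for this example, not the library function, and the mean Earth radius of 6371.0088 km is an assumption.

```python
import math


def haversine_km(p, q, radius_km=6371.0088):
    """Stand-in for hs.haversine: p and q are read as (lat, lon) in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*p, *q))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * radius_km * math.asin(math.sqrt(h))


# Row 0 of df1 and row 6 of df2, written as (lon, lat) like the data set columns.
a_lon_lat = (107.071969, -6.347778)
b_lon_lat = (107.179119, -6.291131)

# Passing the tuples unchanged makes the function read longitudes as latitudes.
swapped = haversine_km(a_lon_lat, b_lon_lat)
# Reversing each tuple gives the (lat, lon) order the function expects.
correct = haversine_km(a_lon_lat[::-1], b_lon_lat[::-1])

print(f"(lon, lat) order: {swapped:.2f} km")
print(f"(lat, lon) order: {correct:.2f} km")
```

The first value comes out close to the 12.06 km in the matrix above and the second close to the 13.41 km in the expected output, so selecting the columns as `[['lat', 'lon']]` before calling haversine should bring the two approaches into agreement.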