How to create a column in Pandas with distance from coordinates using GeoPy
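I have this df:
   latitude_1  longitude_1
0  -25.294871   -56.992654
1  -24.946374   -57.384543
2  -24.835273   -53.825342
3  -24.153553   -54.363844
And the following coordinates:
coords_2 = (-25.236632, -56.835262)
I want to create a third column in the df that shows the distance between each row and coords_2.
If I try this without DataFrames, it works (the coordinates here are arbitrary):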
import geopy.distance
coords_1 = (52.2296756, 21.0122287)
coords_2 = (43.263845, -42.2637377)
print(geopy.distance.distance(coords_1, coords_2).km)
Output:
4691.07078306837
So I want to apply the same logic to the DataFrame.
Thanks
If you want to compare the coordinates in your df against an external coordinate tuple, try this:
import pandas as pd
import geopy.distance
df = pd.DataFrame(data={'latitude_1': [-25.294871, -24.946374], 'longitude_1': [-56.992654, -57.384543]})
coords_2 = (-25.236632, -56.835262)
# Build a (latitude, longitude) tuple per row and measure the distance to coords_2 in km
df['distance'] = df.apply(lambda x: geopy.distance.distance((x.latitude_1, x.longitude_1), coords_2).km, axis=1)
   latitude_1  longitude_1   distance
0  -25.294871   -56.992654  17.116773
1  -24.946374   -57.384543  64.062048
Or with to_numpy():
def distance(l1, l2, coords_2):
    # l1 holds latitudes, l2 holds longitudes; returns one distance in km per pair
    return [geopy.distance.distance((lat, lng), coords_2).km for lat, lng in zip(l1, l2)]
df['distance'] = distance(df["latitude_1"].to_numpy(), df["longitude_1"].to_numpy(), coords_2)
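If the per-row geopy call turns out to be slow on a large frame, one possible alternative is a vectorized approximation in NumPy. Note that this is not what geopy.distance.distance computes: the sketch below swaps in the haversine formula on a sphere, with a mean Earth radius of 6371.0088 km as an assumption, so the results differ slightly from geopy's geodesic values. The function name haversine_km and the column name distance_haversine are made up for illustration.

```python
import numpy as np
import pandas as pd

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius; an assumption of this sketch


def haversine_km(lat, lon, point):
    """Great-circle distance in km from lat/lon arrays (degrees) to one (lat, lon) point."""
    lat1 = np.radians(np.asarray(lat, dtype=float))
    lon1 = np.radians(np.asarray(lon, dtype=float))
    lat2, lon2 = np.radians(point[0]), np.radians(point[1])
    # Haversine formula, evaluated for all rows at once
    a = (np.sin((lat2 - lat1) / 2) ** 2
         + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * np.arcsin(np.sqrt(a))


df = pd.DataFrame({'latitude_1': [-25.294871, -24.946374],
                   'longitude_1': [-56.992654, -57.384543]})
coords_2 = (-25.236632, -56.835262)
df['distance_haversine'] = haversine_km(df['latitude_1'], df['longitude_1'], coords_2)
print(df)
```

For the two rows above the values should land close to the geopy results, but not match them digit for digit, because a sphere is only an approximation of the ellipsoid geopy uses by default.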
You can also do this without geopandas:
import pandas as pd
import geopy
import geopy.distance
def distance_from_custom_point(row):
    # geopy.Point takes latitude first, then longitude
    start = geopy.Point(-25.236632, -56.835262)
    end = geopy.Point(row['latitude'], row['longitude'])
    row['dist'] = geopy.distance.distance(start, end).km
    return row
df = pd.DataFrame({'latitude': [-25.294871, -24.946374, -24.835273, -24.153553],
                   'longitude': [-56.992654, -57.384543, -53.825342, -54.363844]})
df = df.apply(distance_from_custom_point, axis=1)
print(df)
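You can simply apply a lambda function to the existing DataFrame:
# x.tolist() is [latitude_1, longitude_1], so df must contain only those two columns, in that order
df['distance'] = df.apply(lambda x: geopy.distance.distance(x.tolist(), coords_2).km,
                          axis=1)
You should get:
   latitude_1  longitude_1    distance
0  -25.294871   -56.992654   17.116773
1  -24.946374   -57.384543   64.062048
2  -24.835273   -53.825342  306.992429
3  -24.153553   -54.363844  277.379381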