Computing haversine over two lists
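I'm currently trying to compute the route distance for (lat/long) coordinates that I have in a Geopandas data frame. I'm fairly new to the package, but basically I have several points that make up a route, and all I want to do is find the total real distance of that route. I can do this with two fixed points, for which I owe thanks to user @steve clark: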
from math import radians, sin, cos, asin, sqrt

# Start (latitude, longitude in decimal degrees)
lat1 = 41.592181
lon1 = -87.638856
# End
lat2 = 41.877575
lon2 = -86.754688

def haversine(lat1, lon1, lat2, lon2):
    """
    Calculate the great circle distance between two points
    on the earth (specified in decimal degrees)
    """
    # convert decimal degrees to radians
    lon1, lat1, lon2, lat2 = map(radians, [lon1, lat1, lon2, lat2])
    # haversine formula
    dlon = lon2 - lon1
    dlat = lat2 - lat1
    a = sin(dlat/2)**2 + cos(lat1) * cos(lat2) * sin(dlon/2)**2
    c = 2 * asin(sqrt(a))
    r = 6371 # Radius of earth in kilometers
    return c * r

print('Distance from beginning to end of route in km: ', round(haversine(lat1, lon1, lat2, lon2), 2), '\n')
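I'm stuck on two things. I'm currently looking around to see whether I can compute the distance from Geopandas Point() objects, but honestly the examples I've found are either unrelated to my problem or beyond my understanding (for now).
I can pull the latitude and longitude columns from my gpd into lists, but I can't work out how to apply the formula to them in a loop: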
LatList = geo_data['latitude'].tolist()
LonList = geo_data['longitude'].tolist()
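I tried appending the result of each iteration to a new list and summing the distances, but I ended up with a list containing the same value appended 2,850 times. Any help or guidance is appreciated!
Edit: as requested, here is the code that fails: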
distance = []
for i, j in zip(LatList, LonList):
    dlat = i - i+1
    dlon = j - j+1
    a = sin(dlat/2)**2 + cos(lat1) * cos(lat2) * sin(dlon/2)**2
    c = 2 * asin(sqrt(a))
    r = 6371 # Radius of earth in kilometers
    distance.append(round((c * r), 2))
print(distance)
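You need to adjust how i, i+1, j, and j+1 are defined, otherwise the loop will not do what you want. As written, i and j are coordinate values rather than positions in the list, and i - i+1 is evaluated as (i - i) + 1, which is always 1, so every iteration appends the same distance.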
from math import radians, sin, cos, asin, sqrt

distance = []
LatLonList = list(zip(LatList, LonList))
# stop at len(LatLonList) - 1: the last point has no successor, so indexing n+1 there would fail
for n in range(len(LatLonList) - 1):
    # consecutive points, converted from decimal degrees to radians
    lat1, lon1 = map(radians, LatLonList[n])
    lat2, lon2 = map(radians, LatLonList[n+1])
    dlat = lat2 - lat1 # this replaces i - i+1
    dlon = lon2 - lon1 # this replaces j - j+1
    a = sin(dlat/2)**2 + cos(lat1) * cos(lat2) * sin(dlon/2)**2
    c = 2 * asin(sqrt(a))
    r = 6371 # Radius of earth in kilometers
    distance.append(round((c * r), 2))
print(distance)
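One possible way to package the loop above so that it returns the route total directly. This is a minimal sketch, assuming the route is an ordered list of (lat, lon) tuples in decimal degrees; the names haversine_km and route_length_km and the three-point route are made up for illustration, and 6371 km is the same Earth radius used in the snippets above.

```python
from math import radians, sin, cos, asin, sqrt

EARTH_RADIUS_KM = 6371  # same Earth radius as in the snippets above


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in decimal degrees."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    a = sin(dlat / 2) ** 2 + cos(lat1) * cos(lat2) * sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def route_length_km(points):
    """Sum the leg distances along an ordered list of (lat, lon) tuples."""
    # zip(points, points[1:]) pairs each point with its successor,
    # so no index arithmetic is needed
    return sum(haversine_km(*p, *q) for p, q in zip(points, points[1:]))


# hypothetical three-point route, for illustration only
route = [(0.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
total = route_length_km(route)
```

With the lists from the question, the call would presumably be route_length_km(list(zip(LatList, LonList))).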
Taking the data from the geopandas reference as an example:
import pandas as pd
import geopandas
from shapely.geometry import Point
df = pd.DataFrame(
    {'City': ['Buenos Aires', 'Brasilia', 'Santiago', 'Bogota', 'Caracas'],
     'Country': ['Argentina', 'Brazil', 'Chile', 'Colombia', 'Venezuela'],
     'Latitude': [-34.58, -15.78, -33.45, 4.60, 10.48],
     'Longitude': [-58.66, -47.91, -70.66, -74.08, -66.86]})
df['Coordinates'] = list(zip(df.Longitude, df.Latitude))
df['Coordinates'] = df['Coordinates'].apply(Point)
gdf = geopandas.GeoDataFrame(df, geometry='Coordinates')
The distance, taking two points as input, can be written as:
from math import radians, sin, cos, asin, sqrt

def haversine(point1, point2):
    """
    Calculate the great circle distance between two points
    on the earth (specified in decimal degrees)
    """
    # shapely Points built from (Longitude, Latitude): x is longitude, y is latitude
    lon1, lat1 = point1.x, point1.y
    lon2, lat2 = point2.x, point2.y
    # convert decimal degrees to radians
    lon1, lat1, lon2, lat2 = map(radians, [lon1, lat1, lon2, lat2])
    # haversine formula
    dlon = lon2 - lon1
    dlat = lat2 - lat1
    a = sin(dlat/2)**2 + cos(lat1) * cos(lat2) * sin(dlon/2)**2
    c = 2 * asin(sqrt(a))
    r = 6371 # Radius of earth in kilometers
    return c * r
and applied to every pair of points using pandas.DataFrame.apply:
gdf['Coordinates'].apply(lambda x: gdf['Coordinates'].apply(lambda y: haversine(x, y)))
Edit: to compute only half of the matrix:
gdf[['Coordinates']].apply(lambda x: gdf.loc[:x.name, 'Coordinates'].apply(lambda y: haversine(x['Coordinates'], y)), axis=1)
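The "half of the matrix" idea does not depend on pandas. As a plain-Python sketch, assuming (lat, lon) tuples copied from the example table above, itertools.combinations yields each unordered pair exactly once. The names haversine_km, cities, and pairwise_km are illustrative only.

```python
from itertools import combinations
from math import radians, sin, cos, asin, sqrt


def haversine_km(p, q, r=6371):
    """Great-circle distance in km between two (lat, lon) tuples in decimal degrees."""
    lat1, lon1, lat2, lon2 = map(radians, (*p, *q))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * r * asin(sqrt(a))


# (lat, lon) for the five cities in the example data frame
cities = {
    'Buenos Aires': (-34.58, -58.66),
    'Brasilia': (-15.78, -47.91),
    'Santiago': (-33.45, -70.66),
    'Bogota': (4.60, -74.08),
    'Caracas': (10.48, -66.86),
}

# combinations() visits each unordered pair once: n * (n - 1) / 2 entries
pairwise_km = {
    (a, b): haversine_km(cities[a], cities[b])
    for a, b in combinations(cities, 2)
}
```

Unlike the apply version, this skips the diagonal, where the distance from a point to itself is zero anyway.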
I found this code:
import math

def calculate_distance(positions):
    # positions: ordered list of (lat, lng) pairs in decimal degrees
    results = []
    for i in range(1, len(positions)):
        loc1 = positions[i - 1]
        loc2 = positions[i]
        lat1 = loc1[0]
        lng1 = loc1[1]
        lat2 = loc2[0]
        lng2 = loc2[1]
        degreesToRadians = (math.pi / 180)
        latrad1 = lat1 * degreesToRadians
        latrad2 = lat2 * degreesToRadians
        dlat = (lat2 - lat1) * degreesToRadians
        dlng = (lng2 - lng1) * degreesToRadians
        a = math.sin(dlat / 2) * math.sin(dlat / 2) + math.cos(latrad1) * \
            math.cos(latrad2) * math.sin(dlng / 2) * math.sin(dlng / 2)
        c = 2 * math.atan2(math.sqrt(a), math.sqrt(1 - a))
        r = 6371000 # Radius of earth in meters
        results.append(r * c)
    return (sum(results) / 1000) # Converting from m to km
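For reference, the snippets in this thread all evaluate the same formula, written out here with symbols chosen for this note rather than taken from the thread: latitude \varphi and longitude \lambda in radians, Earth radius r, and distance d.

```latex
\begin{aligned}
a &= \sin^2\!\left(\frac{\varphi_2 - \varphi_1}{2}\right)
     + \cos\varphi_1 \cos\varphi_2 \,\sin^2\!\left(\frac{\lambda_2 - \lambda_1}{2}\right) \\
c &= 2\arcsin\sqrt{a} \;=\; 2\operatorname{atan2}\!\left(\sqrt{a},\ \sqrt{1-a}\right) \\
d &= r\,c
\end{aligned}
```

The asin and atan2 forms agree for 0 <= a <= 1, so the code above should give the same distances as the asin-based versions earlier in the thread, apart from the meters-to-kilometers conversion.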
There is also a package for computing the haversine distance:
https://pypi.org/project/haversine/
It comes with a numpy-based version that speeds up the computation over lists.
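To show why a NumPy version is faster on lists, here is a hand-written sketch of the vectorization idea. It is not the haversine package's own API; the function name haversine_np and the three-point route are assumptions made for illustration.

```python
import numpy as np


def haversine_np(lat1, lon1, lat2, lon2, r=6371.0):
    """Element-wise great-circle distance in km; inputs are arrays in decimal degrees."""
    lat1, lon1, lat2, lon2 = (np.radians(np.asarray(x, dtype=float))
                              for x in (lat1, lon1, lat2, lon2))
    a = (np.sin((lat2 - lat1) / 2) ** 2
         + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2) ** 2)
    return 2 * r * np.arcsin(np.sqrt(a))


# hypothetical route, for illustration only
lat = np.array([0.0, 0.0, 1.0])
lon = np.array([0.0, 1.0, 1.0])

# [:-1] selects the start point of every leg, [1:] the end point
legs = haversine_np(lat[:-1], lon[:-1], lat[1:], lon[1:])
total_km = legs.sum()
```

All legs are computed in one array operation instead of a Python-level loop, which is where the speed-up on long routes comes from.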