Getting distance from longitude and latitude using Haversine's distance formula
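I am working in a pandas DataFrame and I am trying to get the distance between the longitude/latitude points of each identifier.
Here is the DataFrame currently:
  Identifier  num_pts  latitude  longitude
0   AL011851        3      28.0      -94.8
1   AL011851        3      28.0      -95.4
2   AL011851        3      28.1      -96.0
3   AL021851        2      22.2      -97.6
4   AL021851        2      12.0      -60.0
I know I have to use Haversine's distance formula, but I am not sure how to apply it to my data.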
import numpy as np

def haversine(lon1, lat1, lon2, lat2, earth_radius=6367):
    """
    Calculate the great circle distance between two points
    on the earth (specified in decimal degrees)
    All args must be of equal length.
    """
    lon1, lat1, lon2, lat2 = map(np.radians, [lon1, lat1, lon2, lat2])
    dlon = lon2 - lon1
    dlat = lat2 - lat1
    a = np.sin(dlat/2.0)**2 + np.cos(lat1) * np.cos(lat2) * np.sin(dlon/2.0)**2
    c = 2 * np.arcsin(np.sqrt(a))
    km = earth_radius * c
    return km
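For reference, the function above appears to follow the standard haversine formula. The notation below is an assumption added for illustration: φ is latitude and λ is longitude (both in radians), and R stands for `earth_radius` (6367 km by default in the function).

```latex
\begin{aligned}
a &= \sin^2\!\left(\frac{\varphi_2 - \varphi_1}{2}\right)
     + \cos\varphi_1 \,\cos\varphi_2 \,\sin^2\!\left(\frac{\lambda_2 - \lambda_1}{2}\right) \\
d &= 2R \,\arcsin\!\left(\sqrt{a}\right)
\end{aligned}
```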
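This should be my final result, which I calculated on paper using only the latitude and longitude:
  Identifier  num_pts  latitude  longitude   distance
0   AL011851        3      28.0      -94.8        NaN
1   AL011851        3      28.0      -95.4  58.870532
2   AL011851        3      28.1      -96.0  58.870532
3   AL021851        2      22.2      -97.6
4   AL021851        2      12.0      -60.0
EDIT: I need to calculate the distance between consecutive points, such as 0 and 1, then 1 and 2, and the calculation has to be grouped by identifier so that points from different identifiers are never paired. When a new identifier such as AL021851 appears, the calculation resets and only uses the points within that identifier.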
from io import StringIO

import numpy as np
import pandas as pd

# Example data (fixed-width text; the backslash after the opening quotes
# avoids a leading blank line, so the first line is the header)
df = pd.read_fwf(StringIO("""\
Identifier  num_pts  latitude  longitude
AL011851          3      28.0      -94.8
AL011851          3      28.0      -95.4
AL011851          3      28.1      -96.0
AL021851          2      22.2      -97.6
AL021851          2      12.0      -60.0
"""))

# Provided function
def haversine(lon1, lat1, lon2, lat2, earth_radius=6367):
    """
    Calculate the great circle distance between two points
    on the earth (specified in decimal degrees)
    All args must be of equal length.
    """
    lon1, lat1, lon2, lat2 = map(np.radians, [lon1, lat1, lon2, lat2])
    dlon = lon2 - lon1
    dlat = lat2 - lat1
    a = np.sin(dlat/2.0)**2 + np.cos(lat1) * np.cos(lat2) * np.sin(dlon/2.0)**2
    c = 2 * np.arcsin(np.sqrt(a))
    km = earth_radius * c
    return km

# Use pandas shift to place prior values on each row, within a grouped dataframe.
# The first row of each Identifier has no prior point, so it gets NaN.
dfg = df.groupby("Identifier")
df['p_latitude'] = dfg['latitude'].shift(1)
df['p_longitude'] = dfg['longitude'].shift(1)

# Assign to a new column - use pandas dataframe apply to invoke for each row.
# Columns are accessed by label rather than by position.
df['distance'] = df[['p_latitude', 'p_longitude', 'latitude', 'longitude']].apply(
    lambda x: haversine(x['p_longitude'], x['p_latitude'], x['longitude'], x['latitude']),
    axis=1,
)
print(df)
#   Identifier  num_pts  latitude  longitude  p_latitude  p_longitude     distance
# 0   AL011851        3      28.0      -94.8         NaN          NaN          NaN
# 1   AL011851        3      28.0      -95.4        28.0        -94.8    58.870532
# 2   AL011851        3      28.1      -96.0        28.0        -95.4    59.883283
# 3   AL021851        2      22.2      -97.6         NaN          NaN          NaN
# 4   AL021851        2      12.0      -60.0        22.2        -97.6  4138.535287
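Since the provided function is built on NumPy and its docstring says all arguments must be of equal length, it may also be possible to skip the row-wise apply and pass whole columns at once. The following is a minimal sketch under that assumption, not a replacement for the approach above; the variable name `prev` is introduced here for illustration, and the DataFrame is rebuilt inline so the snippet runs on its own.

```python
import numpy as np
import pandas as pd

def haversine(lon1, lat1, lon2, lat2, earth_radius=6367):
    """Great circle distance in km between points given in decimal degrees."""
    lon1, lat1, lon2, lat2 = map(np.radians, [lon1, lat1, lon2, lat2])
    dlon = lon2 - lon1
    dlat = lat2 - lat1
    a = np.sin(dlat / 2.0) ** 2 + np.cos(lat1) * np.cos(lat2) * np.sin(dlon / 2.0) ** 2
    return earth_radius * 2 * np.arcsin(np.sqrt(a))

# Same example data as in the question, built directly as a DataFrame
df = pd.DataFrame({
    "Identifier": ["AL011851", "AL011851", "AL011851", "AL021851", "AL021851"],
    "num_pts": [3, 3, 3, 2, 2],
    "latitude": [28.0, 28.0, 28.1, 22.2, 12.0],
    "longitude": [-94.8, -95.4, -96.0, -97.6, -60.0],
})

# Previous point within each Identifier; the first row of each group is NaN
prev = df.groupby("Identifier")[["latitude", "longitude"]].shift(1)

# Pass whole columns: NumPy evaluates the formula element-wise, and NaN
# inputs propagate to NaN distances for the first point of each group
df["distance"] = haversine(
    prev["longitude"], prev["latitude"], df["longitude"], df["latitude"]
)
print(df)
```

This avoids the helper columns `p_latitude` and `p_longitude` and tends to matter more as the number of rows grows, because the formula is evaluated once over arrays instead of once per row.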