How to group by trip ID and find the straight-line distance traveled?
I have the following data:
Trip Start_Lat Start_Long End_lat End_Long Starting_point Ending_point
Trip_1 56.5624 -85.56845 58.568 45.568 A B
Trip_1 58.568 45.568 -200.568 -290.568 B C
Trip_1 -200.568 -290.568 56.5624 -85.56845 C D
Trip_2 56.5624 -85.56845 -85.56845 -200.568 A B
Trip_2 -85.56845 -200.568 -150.568 -190.568 B C
I want to find the circuity:
Circuity = Total Distance Travelled(Trip A+B+C+D) - Straight line (Trip A to D)
-----------------------------------------------------------------------
Total Distance Traveled (Trip A+B+C+D)
I tried the following code:
df['Distance']= df['flight_distance'] = df.apply(lambda x: great_circle((x['start_lat'], x['start_long']), (x['end_lat'], x['end_long'])).km, axis = 1)
df['Total_Distance'] = ((df.groupby('Trip')['distance'].shift(2) +['distance'].shift(1) + df['distance']).abs())
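Could you help me calculate the straight-line distance and the circuity?
UPDATE:
You may want to convert your values to numeric dtypes first: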
df[['Start_Lat','Start_Long','End_lat','End_Long']] = \
df[['Start_Lat','Start_Long','End_lat','End_Long']].apply(pd.to_numeric, errors='coerce')
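To show what `errors='coerce'` does here, below is a minimal sketch on a made-up two-column frame (the `sample` data and the `bad_rows` name are assumptions for illustration, not part of the question). Values that cannot be parsed become NaN instead of raising, so it is worth checking which rows were affected before computing distances:

```python
import pandas as pd

# Hypothetical sample: coordinates stored as strings, one of them unparseable.
sample = pd.DataFrame({
    "Start_Lat": ["56.5624", "58.568", "n/a"],
    "Start_Long": ["-85.56845", "45.568", "10.0"],
})

cols = ["Start_Lat", "Start_Long"]

# errors="coerce" turns values that cannot be parsed into NaN.
converted = sample[cols].apply(pd.to_numeric, errors="coerce")

# Boolean mask of rows where at least one coordinate failed to convert.
bad_rows = converted.isna().any(axis=1)

print(converted.dtypes)
print(bad_rows)
```

Rows flagged in `bad_rows` would produce NaN distances later, so you may prefer to inspect or drop them first.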
IIUC you can do it this way:
import numpy as np
import pandas as pd

# vectorized haversine function
def haversine(lat1, lon1, lat2, lon2, to_radians=True, earth_radius=6371):
    """
    slightly modified version: of

    Calculate the great circle distance between two points
    on the earth (specified in decimal degrees or in radians)

    All (lat, lon) coordinates must have numeric dtypes and be of equal length.
    """
    if to_radians:
        lat1, lon1, lat2, lon2 = np.radians([lat1, lon1, lat2, lon2])

    a = np.sin((lat2-lat1)/2.0)**2 + \
        np.cos(lat1) * np.cos(lat2) * np.sin((lon2-lon1)/2.0)**2

    return earth_radius * 2 * np.arcsin(np.sqrt(a))

def f(df):
    # straight line: start of the first leg to the end of the last leg
    # (columns are selected by name so the result does not depend on column order)
    straight = haversine(df['Start_Lat'].iloc[0], df['Start_Long'].iloc[0],
                         df['End_lat'].iloc[-1], df['End_Long'].iloc[-1])
    # total distance: sum of the distances of all legs in the trip
    total = haversine(df['Start_Lat'], df['Start_Long'],
                      df['End_lat'], df['End_Long']).sum()
    return 1 - straight / total

df.groupby('Trip').apply(f)
Result:
In [120]: df.groupby('Trip').apply(f)
Out[120]:
Trip
Trip_1 1.000000
Trip_2 0.499825
dtype: float64
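If you also want the straight-line distance and the total distance as separate columns, not only the ratio, here is one possible sketch. It assumes the same column names as in the question; `haversine_km`, `trip_summary`, and the example coordinates are hypothetical names and data chosen for illustration (the coordinates are picked so the expected ratios are easy to reason about). It also assumes the rows of each trip are already in travel order.

```python
import numpy as np
import pandas as pd

def haversine_km(lat1, lon1, lat2, lon2, earth_radius=6371.0):
    """Great-circle distance in km; inputs are decimal degrees (scalars or arrays)."""
    lat1, lon1, lat2, lon2 = (
        np.radians(np.asarray(v, dtype=float)) for v in (lat1, lon1, lat2, lon2)
    )
    a = (np.sin((lat2 - lat1) / 2.0) ** 2
         + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2.0) ** 2)
    return earth_radius * 2 * np.arcsin(np.sqrt(a))

def trip_summary(trips):
    """Per-trip total distance, straight-line distance, and circuity."""
    # Distance of every individual leg.
    legs = trips.assign(
        leg_km=haversine_km(trips["Start_Lat"], trips["Start_Long"],
                            trips["End_lat"], trips["End_Long"])
    )
    # Total distance plus the first start point and last end point of each trip.
    summary = legs.groupby("Trip", sort=False).agg(
        total_km=("leg_km", "sum"),
        start_lat=("Start_Lat", "first"),
        start_long=("Start_Long", "first"),
        end_lat=("End_lat", "last"),
        end_long=("End_Long", "last"),
    )
    summary["straight_km"] = haversine_km(
        summary["start_lat"], summary["start_long"],
        summary["end_lat"], summary["end_long"],
    )
    summary["circuity"] = (
        (summary["total_km"] - summary["straight_km"]) / summary["total_km"]
    )
    return summary[["total_km", "straight_km", "circuity"]]

# Hypothetical example data (not the data from the question).
example = pd.DataFrame({
    "Trip":       ["round_trip", "round_trip", "direct", "direct", "detour", "detour"],
    "Start_Lat":  [0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
    "Start_Long": [0.0, 90.0, 0.0, 10.0, 0.0, 90.0],
    "End_lat":    [0.0, 0.0, 0.0, 0.0, 0.0, 90.0],
    "End_Long":   [90.0, 0.0, 10.0, 20.0, 90.0, 90.0],
})

summary = trip_summary(example)
print(summary)
```

In this example the trip that returns to its starting point has a circuity of about 1, the trip that follows a single great circle has a circuity of about 0, and the trip that goes a quarter of the way around the equator and then to the pole has a circuity of about 0.5. Note that `first` and `last` in `agg` skip missing values, so rows with NaN coordinates should be handled beforehand.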