How to calculate the distance between the latitudes and longitudes of two stations in a pandas dataframe
I have a dataframe containing station information, including latitude and longitude, as shown below:
start_lat start_lng end_lat end_lng
41.877726 -87.654787 41.888716 -87.644448
41.930000 -87.700000 41.910000 -87.700000
41.910000 -87.690000 41.930000 -87.700000
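and so on.
I want to create a distance column from this information, where the distance between these start and end points can be in either kilometers or miles.
(Shared below: the error I ran into when I tried to apply an SO answer.)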
from math import sin, cos, sqrt, atan2
dlon = data.end_lng - data.start_lng
dlat = data.end_lat - data.start_lat
a = ((sin(dlat/2))**2 + cos(lat1) * cos(lat2) * (sin(dlon/2))**2)
c = 2 * atan2(sqrt(a), sqrt(1-a))
data['distance'] = R * c
TypeError Traceback (most recent call last)
<ipython-input-8-a8f8b698a81b> in <module>()
2 dlon = data.end_lng - data.start_lng
3 dlat = data.end_lat - data.start_lat
----> 4 a = ((sin(dlat/2))**2 + cos(lat1) * cos(lat2) * (sin(dlon/2))**2).apply(lambda x: float(x))
5 c = 2 * atan2(sqrt(a), sqrt(1-a))
6 data['distance'] = R * c
/usr/local/lib/python3.7/dist-packages/pandas/core/series.py in wrapper(self)
127 if len(self) == 1:
128 return converter(self.iloc[0])
--> 129 raise TypeError(f"cannot convert the series to {converter}")
130
131 wrapper.__name__ = f"__{converter.__name__}__"
TypeError: cannot convert the series to <class 'float'>
How can I fix this?
You need to do the calculation row by row. One way is to use iterrows (I have not verified the distance formula itself):
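from math import sin, cos, sqrt, atan2, radians

def get_distance(row, R=6371):  # R: mean Earth radius in km
    # iterrows() yields (index, Series) tuples, so row[1] is the row data
    # math trig functions expect radians; the coordinates are in degrees
    lat1, lat2 = radians(row[1]['start_lat']), radians(row[1]['end_lat'])
    dlon = radians(row[1]['end_lng'] - row[1]['start_lng'])
    dlat = lat2 - lat1
    a = sin(dlat / 2)**2 + cos(lat1) * cos(lat2) * sin(dlon / 2)**2
    c = 2 * atan2(sqrt(a), sqrt(1 - a))
    return R * c

data['distance'] = [get_distance(row) for row in data.iterrows()]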
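The TypeError in the question comes from passing a whole Series to the `math` functions, which accept only scalars. As an alternative to looping over rows, here is a minimal vectorized sketch using NumPy, whose functions work element-wise on Series. The helper name `haversine`, its `unit` parameter, and the constants are my own assumptions for illustration, not something from the question. It also assumes the coordinates are decimal degrees and treats the Earth as a sphere of radius 6371 km, so the results are approximate.

```python
import numpy as np
import pandas as pd

EARTH_RADIUS_KM = 6371.0  # mean Earth radius (spherical approximation)
KM_PER_MILE = 1.609344

def haversine(lat1, lon1, lat2, lon2, unit="km"):
    """Great-circle distance between points given in decimal degrees.

    Hypothetical helper: works on scalars or on whole pandas Series.
    """
    # NumPy trig functions expect radians, so convert from degrees first
    lat1, lon1, lat2, lon2 = map(np.radians, (lat1, lon1, lat2, lon2))
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    a = np.sin(dlat / 2) ** 2 + np.cos(lat1) * np.cos(lat2) * np.sin(dlon / 2) ** 2
    km = 2 * EARTH_RADIUS_KM * np.arcsin(np.sqrt(a))
    return km / KM_PER_MILE if unit == "miles" else km

# Sample rows copied from the question
data = pd.DataFrame({
    "start_lat": [41.877726, 41.930000, 41.910000],
    "start_lng": [-87.654787, -87.700000, -87.690000],
    "end_lat": [41.888716, 41.910000, 41.930000],
    "end_lng": [-87.644448, -87.700000, -87.700000],
})

data["distance_km"] = haversine(
    data.start_lat, data.start_lng, data.end_lat, data.end_lng
)
data["distance_miles"] = haversine(
    data.start_lat, data.start_lng, data.end_lat, data.end_lng, unit="miles"
)
print(data[["distance_km", "distance_miles"]])
```

Because the whole column is computed in one call, this usually runs much faster than iterrows on large dataframes.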