Pandas function .apply() not passing arguments "ValueError: Point coordinates must be finite. (nan, nan, 0.0) has been passed as coordinates."
Python Master
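I'm trying to speed up my code using the pandas .apply() function.
However, I've run into a problem that I don't know how to solve.
The main goal of the script is to iterate over a DataFrame and compute the distance between two points on a map. To do this, I'm using the geopy library and wrote this function: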
def distance_2points(lat1, long1, lat2, long2):
    coord1 = (lat1, long1)
    coord2 = (lat2, long2)
    results = distance.distance(coord1, coord2).km
    return results
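The function works fine when I test it on its own, but when I use it with .apply() I get:
ValueError: Point coordinates must be finite. (nan, nan, 0.0) has been passed as coordinates.
Full code: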
from geopy import distance
import pandas as pd
from datetime import datetime
import time
startTime = datetime.now()
print(datetime.now() - startTime)
lat1 = 40.067982
long1 = -75.056641
def distance_2points(lat1, long1, lat2, long2):
    coord1 = (lat1, long1)
    coord2 = (lat2, long2)
    results = distance.distance(coord1, coord2).km
    return results
df = pd.read_csv('data.csv')
df['distance'] = df.apply(lambda row: distance_2points(lat1, long1, lat2=row['lat'], long2=row['long'] ), axis=1)
print(datetime.now() - startTime)
Can anyone explain what the problem is?
Sample data:
https://docs.google.com/spreadsheets/d/11sahfFQcv_PcODUvFxe6ziY_TeBjDkfLCpf2baqEKck/edit?usp=sharing
Try this:
from geopy import distance
import pandas as pd
from datetime import datetime
import time
startTime = datetime.now()
print(datetime.now() - startTime)
lat1 = 40.067982
long1 = -75.056641
def distance_2points(row):
    coord1 = (lat1, long1)
    coord2 = (row['lat'], row['long'])
    results = distance.distance(coord1, coord2).km
    return results
df = pd.read_csv('data.csv')
df['distance'] = df.apply(lambda row: distance_2points(row), axis=1)
print(datetime.now() - startTime)
In fact, you can simplify this further by passing the named function directly to apply instead of wrapping it in a lambda:
df['distance'] = df.apply(distance_2points, axis=1)
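One more thing worth checking: the error message reports nan for both latitude and longitude, which suggests that at least one row in data.csv has missing coordinates. If so, the arguments are being passed correctly and the function simply receives NaN for those rows. Below is a minimal sketch under that assumption. The sample DataFrame is made up to stand in for data.csv, and haversine_km is a hypothetical pure-Python stand-in for geopy's distance.distance(...).km so the snippet runs without geopy; its results will differ slightly from geopy's geodesic distance.

```python
import math

import pandas as pd

# Reference point taken from the question.
LAT1 = 40.067982
LONG1 = -75.056641


def haversine_km(lat1, long1, lat2, long2):
    """Great-circle distance in km (stand-in for geopy's distance.distance(...).km)."""
    radius_km = 6371.0088  # mean Earth radius
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(long2 - long1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * radius_km * math.asin(math.sqrt(a))


def distance_or_nan(row):
    """Return the distance for a row, or NaN when a coordinate is missing."""
    if pd.isna(row["lat"]) or pd.isna(row["long"]):
        return float("nan")
    return haversine_km(LAT1, LONG1, row["lat"], row["long"])


# Made-up sample data standing in for data.csv; the second row has no coordinates.
df = pd.DataFrame({
    "lat": [40.067982, None, 41.0],
    "long": [-75.056641, None, -75.0],
})

# Show the rows that would have triggered the ValueError.
print(df[df[["lat", "long"]].isna().any(axis=1)])

df["distance"] = df.apply(distance_or_nan, axis=1)
print(df)
```

To keep using geopy, replace the haversine_km call with distance.distance((LAT1, LONG1), (row["lat"], row["long"])).km. If rows without coordinates are not needed at all, calling df.dropna(subset=["lat", "long"]) before .apply() is a simpler alternative.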