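Why do I have an error when the global name has been defined?
I am trying to use Python to calculate the distance and velocity between time-ordered coordinates, following the steps from another answer. At the end of the code I get an error saying that a global name is not defined, even though it clearly has been defined.
This is a sample of my data: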
   ID         timestamp      latitude    longitude
0  72  20/01/2015 09:47  -6.646405565  71.35696828
1  72  20/01/2015 15:47  -6.642237759  71.36032005
2  72  20/01/2015 21:47  -6.639229675  71.36914769
3  73  21/01/2015 03:47  -6.648699053  71.37865551
4  73  21/01/2015 09:47   -6.65574147  71.37957366
5  74  21/01/2015 15:47  -6.660118996  71.37990588
6  74  21/01/2015 21:47  -6.666138734  71.38266541
So far I have been able to run the code below:
import pandas as pd
df = pd.read_csv(filename)
df['timestamp'] = pd.to_datetime(df['timestamp'], format='%d/%m/%Y %H:%M')
from math import sin, cos, sqrt, atan2, radians
def getDistanceFromLatLonInKm(lat1,lon1,lat2,lon2):
    R = 6371 # Radius of the earth in km
    dLat = radians(lat2-lat1)
    dLon = radians(lon2-lon1)
    rLat1 = radians(lat1)
    rLat2 = radians(lat2)
    a = sin(dLat/2) * sin(dLat/2) + cos(rLat1) * cos(rLat2) * sin(dLon/2) * sin(dLon/2)
    c = 2 * atan2(sqrt(a), sqrt(1-a))
    d = R * c # Distance in km
    return d
def calc_velocity(dist_km, time_start, time_end):
    """Return 0 if time_start == time_end, avoid dividing by 0"""
    return dist_km / (time_end - time_start).seconds if time_end > time_start else 0
# First sort by ID and timestamp:
df = df.sort_values(by=['ID', 'timestamp'])
# Group the sorted dataframe by ID, and grab the initial value for lat, lon, and time.
df['lat0'] = df.groupby('ID')['latitude'].transform(lambda x: x.iat[0])
df['lon0'] = df.groupby('ID')['longitude'].transform(lambda x: x.iat[0])
df['t0'] = df.groupby('ID')['timestamp'].transform(lambda x: x.iat[0])
# create a new column for distance
df['dist_km'] = df.apply(
    lambda row: getDistanceFromLatLonInKm(
        lat1=row['latitude'],
        lon1=row['longitude'],
        lat2=row['lat0'],
        lon2=row['lon0']
    ),
    axis=1
)
At this point I get an error implying that 'getDistanceFromLatLonInKm' is not defined, even though it has been. Below are the traceback and the error:
Traceback (most recent call last):
  File "<pyshell#36>", line 9, in <module>
    axis=1
  File "C:\Python27\ArcGIS10.6\lib\site-packages\pandas\core\frame.py", line 4061, in apply
    return self._apply_standard(f, axis, reduce=reduce)
  File "C:\Python27\ArcGIS10.6\lib\site-packages\pandas\core\frame.py", line 4157, in _apply_standard
    results[i] = func(v)
  File "<pyshell#36>", line 3, in <lambda>
    lambda row: getDistanceFromLatLonInKm(
NameError: ("global name 'getDistanceFromLatLonInKm' is not defined", u'occurred at index 0')
What is wrong with this code?
If you need background on the different ways to run Python code, have a look at this link: https://realpython.com/run-python-scripts/
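As for the error itself: a lambda looks up global names when it is called, not when it is written, so the NameError means the function name was not bound in the session at the moment apply ran. One possibility, which I cannot confirm from the traceback alone, is that the def never executed successfully in the interactive shell (for instance after a restart, or if a multi-line paste ran only part of the block). The small sketch below reproduces that behaviour without pandas; apply_to_rows and describe are made-up names used only for illustration.

```python
def apply_to_rows(rows, func):
    """Call func on every row, similar in spirit to DataFrame.apply(axis=1)."""
    return [func(row) for row in rows]


rows = [{"latitude": -6.646405565, "longitude": 71.35696828}]

# Creating the lambda does not run its body, so no name lookup happens yet.
format_row = lambda row: describe(row["latitude"], row["longitude"])

first_error = None
try:
    # The lookup of describe happens here, and the name is still missing.
    apply_to_rows(rows, format_row)
except NameError as exc:
    first_error = str(exc)
    print("NameError: " + first_error)


def describe(lat, lon):
    """Hypothetical stand-in for getDistanceFromLatLonInKm."""
    return "lat=%.3f lon=%.3f" % (lat, lon)


# The very same lambda works once the name exists in the global namespace.
result = apply_to_rows(rows, format_row)
print(result)
```

Running everything from a single file, as described next, guarantees that every def has executed before the apply call is reached.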
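Copy and paste the code below into a file and save it as lat_long.py. Change only the CSV file name 'lat_long.csv' to match your system. Then, in a shell or command prompt, run the command:
    python lat_long.py
The Python interpreter will run the contents of lat_long.py and print the result (if any).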
import pandas as pd
from math import sin, cos, sqrt, atan2, radians
filename = 'lat_long.csv'
df = pd.read_csv(filename)
df['timestamp'] = pd.to_datetime(df['timestamp'], format='%d/%m/%Y %H:%M')
def getDistanceFromLatLonInKm(lat1,lon1,lat2,lon2):
    R = 6371 # Radius of the earth in km
    dLat = radians(lat2-lat1)
    dLon = radians(lon2-lon1)
    rLat1 = radians(lat1)
    rLat2 = radians(lat2)
    a = sin(dLat/2) * sin(dLat/2) + cos(rLat1) * cos(rLat2) * sin(dLon/2) * sin(dLon/2)
    c = 2 * atan2(sqrt(a), sqrt(1-a))
    d = R * c # Distance in km
    return d
def calc_velocity(dist_km, time_start, time_end):
    """Return 0 if time_start == time_end, avoid dividing by 0"""
    return dist_km / (time_end - time_start).seconds if time_end > time_start else 0
# First sort by ID and timestamp:
df = df.sort_values(by=['ID', 'timestamp'])
# Group the sorted dataframe by ID, and grab the initial value for lat, lon, and time.
df['lat0'] = df.groupby('ID')['latitude'].transform(lambda x: x.iat[0])
df['lon0'] = df.groupby('ID')['longitude'].transform(lambda x: x.iat[0])
df['t0'] = df.groupby('ID')['timestamp'].transform(lambda x: x.iat[0])
# create a new column for distance
df['dist_km'] = df.apply(
    lambda row: getDistanceFromLatLonInKm(
        lat1=row['latitude'],
        lon1=row['longitude'],
        lat2=row['lat0'],
        lon2=row['lon0']
    ),
    axis=1
)
print(df)
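The script stops at the dist_km column, while the stated goal also includes velocity. As a rough sketch of that remaining step, the snippet below applies the same haversine formula and a speed calculation to the first two sample rows of ID 72, using only the standard library so it runs without the CSV file. Two things are my assumptions rather than part of the original code: the helper name calc_velocity_kmh and the choice of km/h as the unit. It also uses total_seconds() rather than .seconds, because .seconds holds only the seconds component of a timedelta and silently drops whole days.

```python
from datetime import datetime, timedelta
from math import sin, cos, sqrt, atan2, radians


def getDistanceFromLatLonInKm(lat1, lon1, lat2, lon2):
    """Haversine great-circle distance in km (same formula as the script)."""
    R = 6371  # Radius of the earth in km
    dLat = radians(lat2 - lat1)
    dLon = radians(lon2 - lon1)
    a = (sin(dLat / 2) ** 2
         + cos(radians(lat1)) * cos(radians(lat2)) * sin(dLon / 2) ** 2)
    return R * 2 * atan2(sqrt(a), sqrt(1 - a))


def calc_velocity_kmh(dist_km, time_start, time_end):
    """Hypothetical helper: speed in km/h, or 0.0 if no time has elapsed."""
    elapsed_s = (time_end - time_start).total_seconds()
    return dist_km / (elapsed_s / 3600.0) if elapsed_s > 0 else 0.0


# Why total_seconds(): .seconds ignores the days part of a timedelta.
gap = timedelta(days=1, hours=6)
seconds_attr = gap.seconds       # only the 6-hour remainder, in seconds
total = gap.total_seconds()      # the full 30 hours, in seconds

# First two sample rows for ID 72; the first fix is the reference point.
fmt = '%d/%m/%Y %H:%M'
t0 = datetime.strptime('20/01/2015 09:47', fmt)
t1 = datetime.strptime('20/01/2015 15:47', fmt)
dist_km = getDistanceFromLatLonInKm(-6.642237759, 71.36032005,
                                    -6.646405565, 71.35696828)
speed_kmh = calc_velocity_kmh(dist_km, t0, t1)

print(seconds_attr)
print(total)
print(dist_km)
print(speed_kmh)
```

Subtracting two pandas timestamps gives a Timedelta that also offers total_seconds(), so the same change should carry over to calc_velocity in the script if any ID spans more than one day.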