Python, Pandas: tz_localize AmbiguousTimeError: Cannot infer dst time with non DST dates
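I'm trying to import some time series data and convert it to UTC so that I can merge it with another dataset. The data appears to be 24-hour data with no daylight saving time (DST) adjustment. This post gives a similar answer, but they simply drop the offending row. I need to shift it so that it can be merged with my other data.
When I run my code: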
import pandas as pd

df = pd.read_csv('http://rredc.nrel.gov/solar/old_data/nsrdb/1991-2010/data/hourly/{}/{}_{}_solar.csv'.format(723898,723898,1998), usecols=["YYYY-MM-DD", "HH:MM (LST)","Meas Glo (Wh/m^2)","Meas Dir (Wh/m^2)","Meas Dif (Wh/m^2)"])

def clean_time(obj):
    # Subtract one hour from the label (e.g. "24:00" -> "23:00") and zero-pad it
    hour = int(obj[0:-3])
    hour = str(hour - 1)
    if len(str(hour)) == 2:
        return hour+":00"
    else:
        return "0" + hour + ":00"

df['HH:MM (LST)'] = df['HH:MM (LST)'].apply(clean_time)
df['DateTime'] = df['YYYY-MM-DD'] + " " + df['HH:MM (LST)']
df = df.set_index(pd.DatetimeIndex(df['DateTime']))
df.drop(["YYYY-MM-DD", "HH:MM (LST)",'DateTime'],axis=1,inplace=True)
df.index = df.index.tz_localize('US/Pacific', ambiguous='infer')
I get:
pytz.exceptions.AmbiguousTimeError: Cannot infer dst time from 1998-10-25 01:00:00 as there are no repeated times
If I leave ambiguous='raise' (the default), it gives me:
pytz.exceptions.NonExistentTimeError: 1998-04-05 02:00:00
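So I'm stuck at either the start or the end of daylight saving time.
I need to merge quite a few of these datasets (multiple stations over multiple years), so I'd rather not hand-code shifts for specific times, but I'm still a novice and can't quite work out my next step here.
Thanks for your help!
Minimal reproduction scenario: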
from datetime import datetime, timedelta
import pandas as pd
df = pd.DataFrame([[datetime(2019, 10, 27, 0) + timedelta(hours=i), i] for i in range(24)], columns=['dt', 'i']).set_index('dt')
df.index.tz_localize('Europe/Amsterdam', ambiguous='infer')
pytz.exceptions.AmbiguousTimeError: Cannot infer dst time from 2019-10-27 02:00:00 as there are no repeated times
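The wording of the error hints at how ambiguous='infer' works: it relies on the repeated wall-clock hour actually appearing twice, in order, to tell the DST occurrence from the standard-time one. The sketch below uses hand-written illustrative timestamps (an assumption, not taken from the data above) to show that an index containing the repeated hour should localize without error.

```python
import pandas as pd

# Illustrative timestamps (an assumption, not from the original data):
# on 2019-10-27 in Europe/Amsterdam the 02:00 hour occurs twice.
naive = pd.DatetimeIndex([
    "2019-10-27 01:00:00",
    "2019-10-27 02:00:00",  # first occurrence, still daylight saving time
    "2019-10-27 02:00:00",  # second occurrence, back on standard time
    "2019-10-27 03:00:00",
])

# With the repeated hour present, 'infer' can order the two occurrences.
localized = naive.tz_localize("Europe/Amsterdam", ambiguous="infer")

# UTC offset of each row, in hours
utc_offset_hours = [ts.utcoffset().total_seconds() / 3600 for ts in localized]
print(localized)
print(utc_offset_hours)
```

Data recorded without any DST adjustment never contains that repeated hour, which is why 'infer' has nothing to work with in the scenario above.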
Solution: manually specify which datetime objects must be treated as standard time (non-DST) and which as daylight saving time (DST). See the documentation.
from datetime import datetime, timedelta
import numpy as np
import pandas as pd
df = pd.DataFrame([[datetime(2019, 10, 27, 0) + timedelta(hours=i), i] for i in range(24)], columns=['dt', 'i']).set_index('dt')
infer_dst = np.array([False] * df.shape[0])  # all False -> every row is treated as non-DST (standard time); True marks a row as DST. The array must align positionally with df.index
df.index.tz_localize('Europe/Amsterdam', ambiguous=infer_dst) # no error
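Note that ambiguous only covers the repeated autumn hour; the skipped spring hour (the NonExistentTimeError in the question) is governed by the separate nonexistent parameter of tz_localize. Since the question's column is labelled "HH:MM (LST)" and the data seems to carry no DST adjustment, another option may be to avoid a DST-aware zone altogether and attach a fixed UTC offset before converting to UTC. The following is a minimal sketch under that assumption: it takes the station to be on Pacific Standard Time (UTC-8) all year, and the names PST and lst_to_utc are hypothetical helpers, not part of pandas.

```python
from datetime import timedelta, timezone

import pandas as pd

# Assumption: the timestamps are local standard time, i.e. UTC-8 all year round.
PST = timezone(timedelta(hours=-8))


def lst_to_utc(index, fixed_tz=PST):
    """Attach a fixed UTC offset to a naive DatetimeIndex, then convert to UTC."""
    return index.tz_localize(fixed_tz).tz_convert("UTC")


# The two timestamps that raised errors under 'US/Pacific' in the question.
naive = pd.DatetimeIndex([
    "1998-04-05 02:00:00",  # skipped hour in US/Pacific (spring forward)
    "1998-10-25 01:00:00",  # repeated hour in US/Pacific (fall back)
])

utc_index = lst_to_utc(naive)
print(utc_index)
```

Because a fixed offset has no transitions, no timestamp is ambiguous or nonexistent, so the same call can be applied to every station and year without special cases. If the data actually does follow DST, this approach would shift the summer rows by an hour, so the assumption is worth checking against the data provider's notes.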