How to correctly generate a list of UTC timestamps, by hour, between two datetimes in Python?

I'm new to Python. After a few days of research and trial and error, I found a decent solution for creating a list of timestamps, one per hour, between two dates.

Example:

from datetime import datetime, timedelta

timestamp_format = '%Y-%m-%dT%H:%M:%S%z'

earliest_ts_str = '2020-10-01T00:00:00Z'
earliest_ts_obj = datetime.strptime(earliest_ts_str, timestamp_format)

latest_ts_str = '2020-10-02T00:00:00Z'
latest_ts_obj = datetime.strptime(latest_ts_str, timestamp_format)

num_days = latest_ts_obj - earliest_ts_obj
num_hours = int(round(num_days.total_seconds() / 3600,0))

ts_raw = []
for ts in range(num_hours):
    ts_raw.append(latest_ts_obj - timedelta(hours = ts + 1))

dates_formatted = [d.strftime('%Y-%m-%dT%H:%M:%SZ') for d in ts_raw]

# Need timestamps in ascending order
dates_formatted.reverse()

dates_formatted

This results in:

['2020-10-01T00:00:00Z',
 '2020-10-01T01:00:00Z',
 '2020-10-01T02:00:00Z',
 '2020-10-01T03:00:00Z',
 '2020-10-01T04:00:00Z',
 '2020-10-01T05:00:00Z',
 '2020-10-01T06:00:00Z',
 '2020-10-01T07:00:00Z',
 '2020-10-01T08:00:00Z',
 '2020-10-01T09:00:00Z',
 '2020-10-01T10:00:00Z',
 '2020-10-01T11:00:00Z',
 '2020-10-01T12:00:00Z',
 '2020-10-01T13:00:00Z',
 '2020-10-01T14:00:00Z',
 '2020-10-01T15:00:00Z',
 '2020-10-01T16:00:00Z',
 '2020-10-01T17:00:00Z',
 '2020-10-01T18:00:00Z',
 '2020-10-01T19:00:00Z',
 '2020-10-01T20:00:00Z',
 '2020-10-01T21:00:00Z',
 '2020-10-01T22:00:00Z',
 '2020-10-01T23:00:00Z']

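Problem:

If earliest_ts_str includes minutes (for example, a time ending in :45), the generated timestamps do not carry the minutes through and still fall on the hour.

Results: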

['2020-10-01T20:00:00Z',
 '2020-10-01T21:00:00Z',
 '2020-10-01T22:00:00Z',
 '2020-10-01T23:00:00Z']

I need it to be:

['2020-10-01T20:45:00Z',
 '2020-10-01T21:45:00Z',
 '2020-10-01T22:45:00Z',
 '2020-10-01T23:45:00Z']

It feels like the problem is in the num_days and num_hours calculation, but I can't see how to fix it.

Any ideas?

Change the num_hours calculation to:

num_hours = num_days.days*24 + num_days.seconds//3600

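The problem is that the days attribute of num_days only holds whole days, so if the interval isn't a multiple of 24 hours you get the floor value (for your example, 0). To compute the number of hours, you therefore need to use both the days and the seconds.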
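To make the relationship between these attributes concrete, here is a small sketch. The 19:45 start time is an assumed value chosen for illustration; `days`, `seconds`, and `total_seconds()` are standard `timedelta` members.

```python
from datetime import datetime, timezone

# Assumed example interval: 4 hours 15 minutes (start time is illustrative).
start = datetime(2020, 10, 1, 19, 45, tzinfo=timezone.utc)
end = datetime(2020, 10, 2, 0, 0, tzinfo=timezone.utc)
delta = end - start

whole_days = delta.days            # whole days only
leftover_seconds = delta.seconds   # seconds left after removing whole days

# Two ways to get the number of whole hours in the interval.
hours_from_parts = delta.days * 24 + delta.seconds // 3600
hours_from_total = int(delta.total_seconds() // 3600)

print(whole_days, leftover_seconds, hours_from_parts, hours_from_total)
# 0 15300 4 4
```

Both expressions give the same whole-hour count here. Either way the 15 leftover minutes are dropped from the count, so the count by itself cannot carry a minute offset into the generated timestamps; that depends on which datetime the hours are added to or subtracted from.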

Also, you can build the list directly in the correct order; I'm not sure whether you did it in reverse for a reason.

ts_raw.append(earliest_ts_obj + timedelta(hours=ts + 1))

Alternatively, a while loop can step from the earliest timestamp to the latest one hour at a time:

from datetime import datetime, timedelta

timestamp_format = '%Y-%m-%dT%H:%M:%S%z'

earliest_ts_str = '2020-10-01T00:00:00Z'
ts_obj = datetime.strptime(earliest_ts_str, timestamp_format)

latest_ts_str = '2020-10-02T00:00:00Z'
latest_ts_obj = datetime.strptime(latest_ts_str, timestamp_format)

ts_raw = []
while ts_obj <= latest_ts_obj:
    ts_raw.append(ts_obj)
    ts_obj += timedelta(hours=1)

dates_formatted = [d.strftime('%Y-%m-%dT%H:%M:%SZ') for d in ts_raw]
print(dates_formatted)
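The same stepping idea can be wrapped in a reusable generator. This is only a sketch: `hourly_timestamps` and its `include_end` flag are hypothetical names, the 20:45 start is an assumed value picked to reproduce the list the question asks for, and parsing a literal `Z` with `%z` assumes Python 3.7 or later.

```python
from datetime import datetime, timedelta


def hourly_timestamps(start, end, include_end=True):
    """Yield datetimes from start up to end in one-hour steps."""
    step = timedelta(hours=1)
    current = start
    # Stepping forward from start keeps its minutes and seconds intact.
    while current < end or (include_end and current == end):
        yield current
        current += step


fmt_in = '%Y-%m-%dT%H:%M:%S%z'
fmt_out = '%Y-%m-%dT%H:%M:%SZ'  # hard-coded Z: assumes the inputs are UTC

start = datetime.strptime('2020-10-01T20:45:00Z', fmt_in)
end = datetime.strptime('2020-10-02T00:00:00Z', fmt_in)

stamps = [d.strftime(fmt_out) for d in hourly_timestamps(start, end)]
print(stamps)
# ['2020-10-01T20:45:00Z', '2020-10-01T21:45:00Z', '2020-10-01T22:45:00Z', '2020-10-01T23:45:00Z']
```

Because the loop counts forward from the start instead of backward from the end, no hour count is needed and the minute offset of the first timestamp is preserved automatically.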

Edit:

Here is an example using Maya:

import maya

earliest_ts_str = '2020-10-01T00:00:00Z'
latest_ts_str = '2020-10-02T00:00:00Z'
start = maya.MayaDT.from_iso8601(earliest_ts_str)
end = maya.MayaDT.from_iso8601(latest_ts_str)

# end is not included, so we add 1 second
my_range = maya.intervals(start=start, end=end.add(seconds=1), interval=60*60)
dates_formatted = [d.iso8601() for d in my_range]
print(dates_formatted)

Both produce this output:

['2020-10-01T00:00:00Z',
 '2020-10-01T01:00:00Z',
 ... some left out ...
 '2020-10-01T23:00:00Z',
 '2020-10-02T00:00:00Z']

If you don't mind using a third-party package, take a look at pandas.date_range:

import pandas as pd

earliest, latest = '2020-10-01T15:45:00Z', '2020-10-02T00:00:00Z'

dti = pd.date_range(earliest, latest, freq='h')  # hourly frequency; the uppercase 'H' alias is deprecated since pandas 2.2
l = dti.strftime('%Y-%m-%dT%H:%M:%SZ').to_list()
print(l)
# ['2020-10-01T15:45:00Z', '2020-10-01T16:45:00Z', '2020-10-01T17:45:00Z', '2020-10-01T18:45:00Z', '2020-10-01T19:45:00Z', '2020-10-01T20:45:00Z', '2020-10-01T21:45:00Z', '2020-10-01T22:45:00Z', '2020-10-01T23:45:00Z']