Create a date_time range at minute intervals within fixed hours on certain days using Pandas
I have a dataframe with 900 rows, and I need to create a date column that covers each day from 2021-01-01 to 2021-01-15, with times from 10:00:00 to 12:00:00 at 2-minute intervals.
The expected result looks like this: 2021-01-01 10:02:00, 2021-01-01 10:04:00, ..., 2021-01-01 12:00:00, 2021-01-02 10:02:00, ..., 2021-01-02 12:00:00, ..., 2021-01-15 12:00:00.
My attempt:
df['date'] = pd.date_range(datetime(2021, 1, 1, hour=10, minute=2), periods=900, freq='2min')
Output:
DatetimeIndex(['2021-01-01 10:02:00', '2021-01-01 10:04:00',
'2021-01-01 10:06:00', '2021-01-01 10:08:00',
'2021-01-01 10:10:00', '2021-01-01 10:12:00',
'2021-01-01 10:14:00', '2021-01-01 10:16:00',
'2021-01-01 10:18:00', '2021-01-01 10:20:00',
...
'2021-01-02 15:42:00', '2021-01-02 15:44:00',
'2021-01-02 15:46:00', '2021-01-02 15:48:00',
'2021-01-02 15:50:00', '2021-01-02 15:52:00',
'2021-01-02 15:54:00', '2021-01-02 15:56:00',
'2021-01-02 15:58:00', '2021-01-02 16:00:00'],
dtype='datetime64[ns]', length=900, freq='2T')
Obviously this is not what I want. How can I fix it? Thanks.
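from datetime import datetime
import pandas as pd

# Same 900-step range as in the question, plus a dummy value column
df = pd.DataFrame({'Date': pd.date_range(datetime(2021, 1, 1, hour=10, minute=2), periods=900, freq='2min'),
                   'Val': [i for i in range(900)]})
df = df.set_index('Date')
# Keep only rows whose time of day falls between 10:00 and 12:00
result = df.between_time('10:00:00', '12:00:00')
result.info()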
<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 121 entries, 2021-01-01 10:02:00 to 2021-01-02 12:00:00
Data columns (total 1 columns):
 #   Column  Non-Null Count  Dtype
---  ------  --------------  -----
 0   Val     121 non-null    int64
dtypes: int64(1)
memory usage: 1.9 KB
Instead of specifying the periods= parameter, you can set the start= and end= parameters in pd.date_range() and then use .between_time(), as follows:
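from datetime import datetime
import pandas as pd
# Every 2-minute step from 10:00 on the first day to 12:00 on the last day
df = pd.DataFrame({'date': pd.date_range(start=datetime(2021, 1, 1, hour=10), end=datetime(2021, 1, 15, hour=12), freq='2min')})
df = df.set_index('date')
# Keep 10:02 through 12:00 on each day (both endpoints are included)
date_range = df.between_time('10:02:00', '12:00:00')
print(date_range)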
date
2021-01-01 10:02:00
2021-01-01 10:04:00
2021-01-01 10:06:00
2021-01-01 10:08:00
...
2021-01-15 11:52:00
2021-01-15 11:54:00
2021-01-15 11:56:00
2021-01-15 11:58:00
2021-01-15 12:00:00
900 rows × 0 columns
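If the goal is to attach these timestamps to an existing 900-row dataframe, one possible alternative is to build the per-day windows directly rather than generating the full range and filtering it. The sketch below is only an illustration, not part of the answer above: the names days, offsets, and stamps and the placeholder val column are assumptions.

```python
import pandas as pd

# One midnight timestamp per day, 2021-01-01 through 2021-01-15
days = pd.date_range("2021-01-01", "2021-01-15", freq="D")

# Offsets from midnight: 10:02, 10:04, ..., 12:00 (60 steps per day)
offsets = pd.timedelta_range(start="10:02:00", end="12:00:00", freq="2min")

# Combine every day with every offset, keeping days in order
stamps = pd.DatetimeIndex([day + off for day in days for off in offsets])

# Hypothetical 900-row dataframe standing in for the asker's data
df = pd.DataFrame({"val": range(900)})
df["date"] = stamps  # assigned by position, since the lengths match

print(df.head())
print(df.tail())
```

This avoids creating the roughly 10,000 intermediate timestamps that a full start-to-end range produces, though for data of this size either approach should be fine.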