Generate 15-minute intervals from text time in Pandas
I read in an Excel file that contains a column of time values, as shown below:
Time
1 3:00a
2
3 3:30a
4
5 4:00a
6
7 4:30a
8
9 5:00a
10
11 5:30a
12
13 6:00a
14
15 6:30a
16
17 7:00a
18
19 7:30a
20
21 8:00a
22
23 8:30a
24
25 9:00a
26
27 9:30a
28
29 10:00a
30
31 10:30a
32
33 11:00a
34
35 11:30a
36
37 12:00p
38
39 12:30p
40
41 1:00p
42
43 1:30p
44
45 2:00p
46
47 2:30p
48
49 3:00p
50
51 3:30p
52
53 4:00p
54
55 4:30p
56
57 5:00p
58
59 5:30p
60
61 6:00p
62
63 6:30p
64
65 7:00p
66
67 7:30p
68
69 8:00p
70
71 8:30p
72
73 9:00p
74
75 9:30p
76
77 10:00p
78
79 10:30p
80
81 11:00p
82
83 11:30p
84
85 12:00a
86
87 12:30a
88
89 1:00a
90
91 1:30a
92
93 2:00a
94
95 2:30a
Additional note: I can also read the date from the file name, e.g. 012622. It is a string in MMDDYY format.
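For illustration, here is a minimal sketch of pulling that date out of a file name. The file name `schedule_012622.xlsx` and the helper `date_from_filename` are hypothetical; the real naming scheme is not given here, so the pattern may need adjusting.

```python
import re
from datetime import datetime


def date_from_filename(filename: str) -> datetime:
    """Return the first standalone 6-digit MMDDYY run in the name as a datetime."""
    # (?<!\d) and (?!\d) ensure we match exactly six digits, not part of a longer number.
    match = re.search(r"(?<!\d)(\d{6})(?!\d)", filename)
    if match is None:
        raise ValueError(f"no MMDDYY date found in {filename!r}")
    return datetime.strptime(match.group(1), "%m%d%y")


# Hypothetical file name, used only for this example.
print(date_from_filename("schedule_012622.xlsx").date())  # 2022-01-26
```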
I want to accomplish two things:
Convert the column to pd.datetime format
Fill the "blanks" at 15-minute intervals, producing a column that looks like this:
| 3:00:00 |
| 3:15:00 |
| 3:30:00 |
| 3:45:00 |
| 4:00:00 |
| 4:15:00 |
| 4:30:00 |
| 4:45:00 |
| 5:00:00 |
etc...
I first tried to do this by referencing a related post, but:
n = pd.read_clipboard()
new_n = pd.to_timedelta(n+':00')
Result:
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
Input In [42], in <module>
----> 1 new_n = pd.to_timedelta(n.astype('str')+':00')
File filepath\_venv\lib\site-packages\pandas\core\tools\timedeltas.py:134, in to_timedelta(arg, unit, errors)
132 return _convert_listlike(arg, unit=unit, errors=errors)
133 elif getattr(arg, "ndim", 1) > 1:
--> 134 raise TypeError(
135 "arg must be a string, timedelta, list, tuple, 1-d array, or Series"
136 )
138 if isinstance(arg, str) and unit is not None:
139 raise ValueError("unit must not be specified if the input is/contains a str")
TypeError: arg must be a string, timedelta, list, tuple, 1-d array, or Series
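I did find another post that looks like a useful reference for converting the column to a quarter-hour format; however, the column needs to be datetime first.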
IIUC, you only need to recreate the starting point. With freq (15 minutes) and periods (the length of the DataFrame), you can create a DatetimeIndex:
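import pandas as pd

date = '012622'  # extract the date from the filename here
# '3:00a' + 'm' gives '3:00am', which %I:%M%p can parse
start = pd.to_datetime(f"{date} {df['Time'].iloc[0]}m", format='%m%d%y %I:%M%p')
df['Time'] = pd.date_range(start, freq='15min', periods=len(df))  # '15T' is deprecated in recent pandas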
Output:
>>> df
Time
1 2022-01-26 03:00:00
2 2022-01-26 03:15:00
3 2022-01-26 03:30:00
4 2022-01-26 03:45:00
5 2022-01-26 04:00:00
.. ...
91 2022-01-27 01:30:00
92 2022-01-27 01:45:00
93 2022-01-27 02:00:00
94 2022-01-27 02:15:00
95 2022-01-27 02:30:00
[95 rows x 1 columns]
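To try the approach above without the Excel file, here is a minimal self-contained sketch. The five-row frame is a made-up stand-in for the real column, and the sketch assumes the rows really are exactly 15 minutes apart with no gaps, since `pd.date_range` rebuilds the whole column from the first value and ignores the rest.

```python
import pandas as pd

# Made-up stand-in for the Excel column: every other row is blank.
df = pd.DataFrame({"Time": ["3:00a", None, "3:30a", None, "4:00a"]})

date = "012622"  # MMDDYY string, e.g. taken from the file name

# Appending "m" turns "3:00a" into "3:00am" so that %p can parse it.
start = pd.to_datetime(f"{date} {df['Time'].iloc[0]}m", format="%m%d%y %I:%M%p")

# One timestamp per row, 15 minutes apart, starting at the first value.
df["Time"] = pd.date_range(start, freq="15min", periods=len(df))
print(df)
```

If the source data could contain missing or irregular rows, it may be worth checking the rebuilt column against the non-blank original values before overwriting them.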