Generate 15 minute intervals from text time in Pandas

I read in an Excel file that contains a column of time values, as shown below:

    Time
1   3:00a
2   
3   3:30a
4   
5   4:00a
6   
7   4:30a
8   
9   5:00a
10  
11  5:30a
12  
13  6:00a
14  
15  6:30a
16  
17  7:00a
18  
19  7:30a
20  
21  8:00a
22  
23  8:30a
24  
25  9:00a
26  
27  9:30a
28  
29  10:00a
30  
31  10:30a
32  
33  11:00a
34  
35  11:30a
36  
37  12:00p
38  
39  12:30p
40  
41  1:00p
42  
43  1:30p
44  
45  2:00p
46  
47  2:30p
48  
49  3:00p
50  
51  3:30p
52  
53  4:00p
54  
55  4:30p
56  
57  5:00p
58  
59  5:30p
60  
61  6:00p
62  
63  6:30p
64  
65  7:00p
66  
67  7:30p
68  
69  8:00p
70  
71  8:30p
72  
73  9:00p
74  
75  9:30p
76  
77  10:00p
78  
79  10:30p
80  
81  11:00p
82  
83  11:30p
84  
85  12:00a
86  
87  12:30a
88  
89  1:00a
90  
91  1:30a
92  
93  2:00a
94  
95  2:30a

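Additional clarification:

I can also read the date from the filename, e.g. 012622. It is a string in MMDDYY format.

I want to accomplish two things:

  1. Convert the column to a pandas datetime format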

  2. Fill in the "blanks" at 15-minute intervals, producing a column that looks like

    3:00:00
    3:15:00
    3:30:00
    3:45:00
    4:00:00
    4:15:00
    4:30:00
    4:45:00
    5:00:00

etc.

I first tried to do this by following a reference I had found, but:

n = pd.read_clipboard()
new_n = pd.to_timedelta(n+':00')

Result:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Input In [42], in <module>
----> 1 new_n = pd.to_timedelta(n.astype('str')+':00')

File filepath\_venv\lib\site-packages\pandas\core\tools\timedeltas.py:134, in to_timedelta(arg, unit, errors)
    132     return _convert_listlike(arg, unit=unit, errors=errors)
    133 elif getattr(arg, "ndim", 1) > 1:
--> 134     raise TypeError(
    135         "arg must be a string, timedelta, list, tuple, 1-d array, or Series"
    136     )
    138 if isinstance(arg, str) and unit is not None:
    139     raise ValueError("unit must not be specified if the input is/contains a str")

TypeError: arg must be a string, timedelta, list, tuple, 1-d array, or Series

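I did find another reference that seems helpful for converting a column to quarter-hour intervals; however, the column needs to be datetime first.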
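A note on the TypeError above: it comes from the `ndim > 1` check visible in the traceback. `pd.read_clipboard()` returns a DataFrame, which is 2-D, whereas `pd.to_timedelta` expects something 1-D such as a single column. The sketch below reproduces this with a small hypothetical frame (the three sample rows are stand-ins for the real data). Selecting the column only gets past that particular check; strings like `3:00a` are clock times with an am/pm marker, so parsing them as datetimes with an explicit format is probably the better fit.

```python
import pandas as pd

# Hypothetical stand-in for the clipboard/Excel data
n = pd.DataFrame({"Time": ["3:00a", "3:30a", "4:00a"]})

# A DataFrame is 2-D, so to_timedelta rejects it before parsing anything
try:
    pd.to_timedelta(n + ":00")
    raised = False
except TypeError:
    raised = True

# A single column is 1-D; append "m" so "%p" can match "am"/"pm"
parsed = pd.to_datetime(n["Time"] + "m", format="%I:%M%p")
print(raised, parsed.dt.time.tolist())
```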

IIUC, you only need to recreate the starting point. Using freq (15 minutes) and periods (the length of the dataframe), you can create a DatetimeIndex:

import pandas as pd

date = '012622'  # extract the date from filename here
# Append 'm' so '3:00a' becomes '3:00am', which %p can parse
start = pd.to_datetime(f"{date} {df['Time'].iloc[0]}m", format='%m%d%y %I:%M%p')
df['Time'] = pd.date_range(start, freq='15min', periods=len(df))

Output:

>>> df
                  Time
1  2022-01-26 03:00:00
2  2022-01-26 03:15:00
3  2022-01-26 03:30:00
4  2022-01-26 03:45:00
5  2022-01-26 04:00:00
..                 ...
91 2022-01-27 01:30:00
92 2022-01-27 01:45:00
93 2022-01-27 02:00:00
94 2022-01-27 02:15:00
95 2022-01-27 02:30:00

[95 rows x 1 columns]
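To try the approach above without the Excel file, here is a minimal self-contained sketch. The filename `schedule_012622.xlsx`, the regex used to pull out the date, and the helper name `rebuild_times` are assumptions made for illustration; only the MMDDYY date string and the `Time` column come from the question. The frequency is written as `15min`, since newer pandas releases deprecate the older `T` alias for minutes.

```python
import re

import pandas as pd


def rebuild_times(df: pd.DataFrame, date: str) -> pd.DataFrame:
    """Replace df['Time'] with 15-minute timestamps starting at its first value.

    `date` is an MMDDYY string; the first entry of df['Time'] looks like '3:00a'.
    """
    out = df.copy()
    # "3:00a" + "m" -> "3:00am", which "%I:%M%p" can parse
    start = pd.to_datetime(f"{date} {out['Time'].iloc[0]}m", format="%m%d%y %I:%M%p")
    # One timestamp per row, so the blank rows become the quarter-hour marks
    out["Time"] = pd.date_range(start, freq="15min", periods=len(out))
    return out


# Hypothetical filename; the question only says the date is embedded as MMDDYY
filename = "schedule_012622.xlsx"
date = re.search(r"(\d{6})", filename).group(1)

# Small stand-in for the spreadsheet: labelled half hours with blanks between
df = pd.DataFrame({"Time": ["3:00a", None, "3:30a", None, "4:00a"]})
result = rebuild_times(df, date)
print(result)
```

Note that this only reads the first time and ignores the rest, so it relies on the rows being evenly spaced with no missing intervals. If the sheet skipped a quarter hour, the rows after the gap would be mislabelled without any error.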