How to convert Day.Hours[24hr]:MM:SS into seconds using Python and a pandas DataFrame?

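I want to convert a time series in a DataFrame to elapsed time. The format is Day.Hours[24hr]:MM:SS, which is not the typical DD:HH:MM:SS, and I have hit a wall. I've added a few rows of the time series below. It starts at 0 days 0 hrs 0 mins 1.1 secs and ends at 1 day 17 hrs 40 mins 1.1 secs. The output I want is a DataFrame column containing 1.1, 10001.1, 20001.1, ...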

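Any ideas would be great (the sampling rate is sometimes inexact, so I want to work from the time_data column itself so that any variation is picked up).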

Thanks, all!

import pandas as pd

time_data = pd.DataFrame({'Time':[
'0.00:00:01.1',
'0.02:46:41.1',
'0.05:33:21.1',
'0.08:20:01.1',
'0.11:06:41.1',
'0.13:53:21.1',
'0.16:40:01.1',
'0.19:26:41.1',
'0.22:13:21.1',
'1.01:00:01.1',
'1.03:46:41.1',
'1.06:33:21.1',
'1.09:20:01.1',
'1.12:06:41.1',
'1.14:53:21.1',
'1.17:40:01.1']})
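# Get the days part (everything before the first '.')
time_data['days'] = time_data['Time'].str.split('.').str[0].astype('int')
# Get the time part; split('.') also cuts off the fractional seconds,
# so this column holds whole seconds only (today's date is filled in)
time_data['time'] = pd.to_datetime(time_data['Time'].str.split('.').str[1])
# Intermediate step: add 24 hours per day ('h', since 'H' is deprecated in recent pandas)
time_data['elapsed1'] = time_data['time'] + time_data['days'] * pd.Timedelta(24, 'h')
# Final step: seconds elapsed since the first row
time_data['elapsed2'] = (time_data['elapsed1'] - time_data['elapsed1'].iloc[0]).dt.total_seconds()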

This gives the following DataFrame:

            Time  days                time            elapsed1  elapsed2
0   0.00:00:01.1     0 2022-02-06 00:00:01 2022-02-06 00:00:01       0.0
1   0.02:46:41.1     0 2022-02-06 02:46:41 2022-02-06 02:46:41   10000.0
2   0.05:33:21.1     0 2022-02-06 05:33:21 2022-02-06 05:33:21   20000.0
3   0.08:20:01.1     0 2022-02-06 08:20:01 2022-02-06 08:20:01   30000.0
4   0.11:06:41.1     0 2022-02-06 11:06:41 2022-02-06 11:06:41   40000.0
5   0.13:53:21.1     0 2022-02-06 13:53:21 2022-02-06 13:53:21   50000.0
6   0.16:40:01.1     0 2022-02-06 16:40:01 2022-02-06 16:40:01   60000.0
7   0.19:26:41.1     0 2022-02-06 19:26:41 2022-02-06 19:26:41   70000.0
8   0.22:13:21.1     0 2022-02-06 22:13:21 2022-02-06 22:13:21   80000.0
9   1.01:00:01.1     1 2022-02-06 01:00:01 2022-02-07 01:00:01   90000.0
10  1.03:46:41.1     1 2022-02-06 03:46:41 2022-02-07 03:46:41  100000.0
11  1.06:33:21.1     1 2022-02-06 06:33:21 2022-02-07 06:33:21  110000.0
12  1.09:20:01.1     1 2022-02-06 09:20:01 2022-02-07 09:20:01  120000.0
13  1.12:06:41.1     1 2022-02-06 12:06:41 2022-02-07 12:06:41  130000.0
14  1.14:53:21.1     1 2022-02-06 14:53:21 2022-02-07 14:53:21  140000.0
15  1.17:40:01.1     1 2022-02-06 17:40:01 2022-02-07 17:40:01  150000.0

It does not start at 1.1 and so on, as you wanted, but adding an extra offset to the elapsed2 column should be a simple fix.
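One way to get that offset without a separate correction step is to keep the fractional seconds when splitting and to treat both parts as durations rather than timestamps. The following is only a sketch of that idea: the shortened sample data and the names parts, days, time_of_day and elapsed are illustrative and not part of the code above.

```python
import pandas as pd

# Shortened sample data, for illustration only
time_data = pd.DataFrame({'Time': ['0.00:00:01.1', '0.02:46:41.1', '1.01:00:01.1']})

# Split only on the first '.', so the fractional seconds stay with the time part
parts = time_data['Time'].str.split('.', n=1, expand=True)

# Treat both parts as durations instead of timestamps
days = pd.to_timedelta(parts[0].astype('int'), unit='D')
time_of_day = pd.to_timedelta(parts[1])

# Seconds counted from 0.00:00:00, so the first row keeps its 1.1 s
time_data['elapsed'] = (days + time_of_day).dt.total_seconds()
```

With this sample, the elapsed column should come out as roughly 1.1, 10001.1 and 90001.1 (subject to floating-point rounding).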

Just convert your values to the string format that Timedelta understands:

timedelta = df['Time'].str.replace(r'\.', ' days ', n=1, regex=True)

df['Seconds'] = pd.to_timedelta(timedelta).dt.total_seconds()

Output:

>>> df
            Time   Seconds
0   0.00:00:01.1       1.1
1   0.02:46:41.1   10001.1
2   0.05:33:21.1   20001.1
3   0.08:20:01.1   30001.1
4   0.11:06:41.1   40001.1
5   0.13:53:21.1   50001.1
6   0.16:40:01.1   60001.1
7   0.19:26:41.1   70001.1
8   0.22:13:21.1   80001.1
9   1.01:00:01.1   90001.1
10  1.03:46:41.1  100001.1
11  1.06:33:21.1  110001.1
12  1.09:20:01.1  120001.1
13  1.12:06:41.1  130001.1
14  1.14:53:21.1  140001.1
15  1.17:40:01.1  150001.1

A closer look at the intermediate timedelta strings:

>>> timedelta
0     0 days 00:00:01.1
1     0 days 02:46:41.1
2     0 days 05:33:21.1
3     0 days 08:20:01.1
4     0 days 11:06:41.1
5     0 days 13:53:21.1
6     0 days 16:40:01.1
7     0 days 19:26:41.1
8     0 days 22:13:21.1
9     1 days 01:00:01.1
10    1 days 03:46:41.1
11    1 days 06:33:21.1
12    1 days 09:20:01.1
13    1 days 12:06:41.1
14    1 days 14:53:21.1
15    1 days 17:40:01.1
Name: Time, dtype: object
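If some rows might not follow the Day.HH:MM:SS pattern, the conversion would raise an error by default. A possible variation, assuming such rows should become NaN rather than stop the run, is to pass errors='coerce' to pd.to_timedelta. The 'n/a' entry and the names as_text and bad_rows below are made up for illustration.

```python
import pandas as pd

# Illustrative data: the middle row is deliberately malformed
df = pd.DataFrame({'Time': ['0.00:00:01.1', 'n/a', '1.01:00:01.1']})

# Literal replacement of the first '.' only (no regex needed)
as_text = df['Time'].str.replace('.', ' days ', n=1, regex=False)

# Unparseable strings become NaT, which total_seconds() turns into NaN
seconds = pd.to_timedelta(as_text, errors='coerce').dt.total_seconds()
df['Seconds'] = seconds

# Rows that could not be parsed, for inspection
bad_rows = df[df['Seconds'].isna()]
```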

To compute the difference between consecutive rows, use:

>>> pd.to_timedelta(timedelta).diff().dt.total_seconds()
0         NaN
1     10000.0
2     10000.0
3     10000.0
4     10000.0
5     10000.0
6     10000.0
7     10000.0
8     10000.0
9     10000.0
10    10000.0
11    10000.0
12    10000.0
13    10000.0
14    10000.0
15    10000.0
Name: Time, dtype: float64
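Since the question mentions that the sampling rate is sometimes inexact, the same diff() result could be used to flag rows whose interval deviates from the usual one. This is a rough sketch under assumptions: the nominal interval is taken to be the median step, the 0.5 s tolerance is an arbitrary illustrative value, and the third sample below is deliberately shifted by 4 s.

```python
import pandas as pd

# Illustrative data: the third timestamp is 4 s late
df = pd.DataFrame({'Time': ['0.00:00:01.1', '0.02:46:41.1',
                            '0.05:33:25.1', '0.08:20:01.1']})

as_text = df['Time'].str.replace('.', ' days ', n=1, regex=False)
seconds = pd.to_timedelta(as_text).dt.total_seconds()

# Interval to the previous row; the first row has no predecessor (NaN)
step = seconds.diff()

# Assumption: the median step represents the intended sampling interval
nominal = step.median()
tolerance = 0.5  # seconds, hypothetical threshold

# NaN compares as False, so the first row is never flagged
df['irregular'] = (step - nominal).abs() > tolerance
```

Here the late sample and the one after it are both flagged, because the shifted timestamp lengthens one interval and shortens the next.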