How to convert Day.Hours[24hr]:MM:SS into seconds using Python and a pandas DataFrame?
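I want to convert a time series in a DataFrame into elapsed time.
The format is Day.Hours[24hr]:MM:SS. This is not the typical DD:HH:MM:SS, and I've hit a roadblock. I've added a few rows of the time series below. It starts at 0 days, 0 hours, 0 minutes, 1.1 seconds and ends at 1 day, 17 hours, 40 minutes, 1.1 seconds. The output I want is a DataFrame column containing 1.1, 10001.1, 20001.1, ...
Any ideas would be great (the sampling rate is sometimes inexact, so I want to work from the time_data column to allow for any variation).
Thanks, all!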
import pandas as pd
time_data = pd.DataFrame({'Time':[
'0.00:00:01.1',
'0.02:46:41.1',
'0.05:33:21.1',
'0.08:20:01.1',
'0.11:06:41.1',
'0.13:53:21.1',
'0.16:40:01.1',
'0.19:26:41.1',
'0.22:13:21.1',
'1.01:00:01.1',
'1.03:46:41.1',
'1.06:33:21.1',
'1.09:20:01.1',
'1.12:06:41.1',
'1.14:53:21.1',
'1.17:40:01.1']})
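# Get the days part
time_data['days'] = time_data['Time'].str.split('.').str[0].astype('int')
# Get the time part (split('.') also cuts off the fractional seconds, so this is
# HH:MM:SS only; to_datetime attaches the current date)
time_data['time'] = pd.to_datetime(time_data['Time'].str.split('.').str[1])
# Intermediate step: shift each timestamp forward by its day count
time_data['elapsed1'] = time_data['time'] + time_data['days']*pd.Timedelta(24, 'h')
# Final step: seconds elapsed since the first row
time_data['elapsed2'] = (time_data['elapsed1'] - time_data['elapsed1'].iloc[0]).dt.total_seconds()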
This gives the following DataFrame:
            Time  days                time            elapsed1  elapsed2
0   0.00:00:01.1     0 2022-02-06 00:00:01 2022-02-06 00:00:01       0.0
1   0.02:46:41.1     0 2022-02-06 02:46:41 2022-02-06 02:46:41   10000.0
2   0.05:33:21.1     0 2022-02-06 05:33:21 2022-02-06 05:33:21   20000.0
3   0.08:20:01.1     0 2022-02-06 08:20:01 2022-02-06 08:20:01   30000.0
4   0.11:06:41.1     0 2022-02-06 11:06:41 2022-02-06 11:06:41   40000.0
5   0.13:53:21.1     0 2022-02-06 13:53:21 2022-02-06 13:53:21   50000.0
6   0.16:40:01.1     0 2022-02-06 16:40:01 2022-02-06 16:40:01   60000.0
7   0.19:26:41.1     0 2022-02-06 19:26:41 2022-02-06 19:26:41   70000.0
8   0.22:13:21.1     0 2022-02-06 22:13:21 2022-02-06 22:13:21   80000.0
9   1.01:00:01.1     1 2022-02-06 01:00:01 2022-02-07 01:00:01   90000.0
10  1.03:46:41.1     1 2022-02-06 03:46:41 2022-02-07 03:46:41  100000.0
11  1.06:33:21.1     1 2022-02-06 06:33:21 2022-02-07 06:33:21  110000.0
12  1.09:20:01.1     1 2022-02-06 09:20:01 2022-02-07 09:20:01  120000.0
13  1.12:06:41.1     1 2022-02-06 12:06:41 2022-02-07 12:06:41  130000.0
14  1.14:53:21.1     1 2022-02-06 14:53:21 2022-02-07 14:53:21  140000.0
15  1.17:40:01.1     1 2022-02-06 17:40:01 2022-02-07 17:40:01  150000.0
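This starts at 0.0 rather than 1.1 as you wanted, because the first row is subtracted from every row, but adding that offset back to the elapsed2 column should be a simple fix.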
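If the fractional seconds matter, one possible variation on the split-based approach is to split only on the first dot and parse the remainder as a duration instead of a datetime. This is a minimal sketch, assuming every value follows the Day.HH:MM:SS[.fraction] pattern from the question; the names parts, days, clock and elapsed are made up for illustration, and only three of the sample rows are used.

```python
import pandas as pd

# Three of the sample rows from the question
time_data = pd.DataFrame({'Time': ['0.00:00:01.1', '0.02:46:41.1', '1.17:40:01.1']})

# Split on the first dot only, so the fractional seconds stay with the time part
parts = time_data['Time'].str.split('.', n=1, expand=True)

days = pd.to_timedelta(parts[0].astype(int), unit='D')  # whole days
clock = pd.to_timedelta(parts[1])                        # 'HH:MM:SS.f' parsed as a duration

# Total elapsed seconds, with no dependence on the current date
time_data['elapsed'] = (days + clock).dt.total_seconds()
print(time_data)
```

Because nothing is subtracted from the first row, the column should start at 1.1 without any extra offset.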
Just convert your values to a format that Timedelta can parse:
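# df is the DataFrame holding the 'Time' column (time_data in the question)
# Replace only the first dot, so '1.17:40:01.1' becomes '1 days 17:40:01.1'
timedelta = df['Time'].str.replace(r'\.', ' days ', n=1, regex=True)
df['Seconds'] = pd.to_timedelta(timedelta).dt.total_seconds()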
Output:
>>> df
            Time   Seconds
0   0.00:00:01.1       1.1
1   0.02:46:41.1   10001.1
2   0.05:33:21.1   20001.1
3   0.08:20:01.1   30001.1
4   0.11:06:41.1   40001.1
5   0.13:53:21.1   50001.1
6   0.16:40:01.1   60001.1
7   0.19:26:41.1   70001.1
8   0.22:13:21.1   80001.1
9   1.01:00:01.1   90001.1
10  1.03:46:41.1  100001.1
11  1.06:33:21.1  110001.1
12  1.09:20:01.1  120001.1
13  1.12:06:41.1  130001.1
14  1.14:53:21.1  140001.1
15  1.17:40:01.1  150001.1
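As a sanity check on these numbers, the same arithmetic can be written out by hand in plain Python. This is only an illustrative sketch: to_seconds is a hypothetical helper, not part of pandas, and it assumes the Day.HH:MM:SS[.fraction] layout shown in the question.

```python
def to_seconds(stamp: str) -> float:
    """Convert a 'Day.HH:MM:SS[.f]' string to elapsed seconds (hypothetical helper)."""
    # Only the first dot separates the day count from the clock part
    days, clock = stamp.split('.', 1)
    hours, minutes, seconds = clock.split(':')
    return int(days) * 86400 + int(hours) * 3600 + int(minutes) * 60 + float(seconds)

# Last row of the sample: 1 day + 17 h + 40 min + 1.1 s
last = to_seconds('1.17:40:01.1')
print(last)
```

For the last row this works out to 86400 + 61200 + 2400 + 1.1 seconds, which matches the 150001.1 in the Seconds column.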
A closer look at the intermediate timedelta strings:
>>> timedelta
0     0 days 00:00:01.1
1     0 days 02:46:41.1
2     0 days 05:33:21.1
3     0 days 08:20:01.1
4     0 days 11:06:41.1
5     0 days 13:53:21.1
6     0 days 16:40:01.1
7     0 days 19:26:41.1
8     0 days 22:13:21.1
9     1 days 01:00:01.1
10    1 days 03:46:41.1
11    1 days 06:33:21.1
12    1 days 09:20:01.1
13    1 days 12:06:41.1
14    1 days 14:53:21.1
15    1 days 17:40:01.1
Name: Time, dtype: object
To compute the difference between consecutive rows, use:
>>> pd.to_timedelta(timedelta).diff().dt.total_seconds()
0         NaN
1     10000.0
2     10000.0
3     10000.0
4     10000.0
5     10000.0
6     10000.0
7     10000.0
8     10000.0
9     10000.0
10    10000.0
11    10000.0
12    10000.0
13    10000.0
14    10000.0
15    10000.0
Name: Time, dtype: float64
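Since the question mentions that the sampling rate is sometimes inexact, here is a small sketch of how the same conversion might behave on uneven data. The timestamps below are made up for illustration (including one deliberately malformed entry), and the Step column name is an assumption, not something from the question.

```python
import pandas as pd

# Hypothetical samples: uneven spacing plus one malformed value
df = pd.DataFrame({'Time': ['0.00:00:01.1', '0.02:46:41.6', '0.05:33:20.9',
                            'bad', '1.01:00:01.1']})

# Same idea as above: turn the first dot into ' days ' so pandas can parse it
timedelta = df['Time'].str.replace(r'\.', ' days ', n=1, regex=True)

# errors='coerce' turns unparseable strings into NaT instead of raising
df['Seconds'] = pd.to_timedelta(timedelta, errors='coerce').dt.total_seconds()

# Row-to-row interval; values far from the nominal rate point at irregular sampling
df['Step'] = df['Seconds'].diff()
print(df)
```

Each row is converted on its own, so nothing here relies on a fixed 10000-second interval; the Step column simply reports whatever spacing the data has.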