Convert string duration column to seconds
In a DataFrame, one of the columns holds a duration, given as a string.
index  duration
1      1 hour, 2 minutes, 21 seconds
2      1 hour, 2 minutes, 26 seconds
3      1 hour, 2 minutes, 41 seconds
4      1 hour, 4 minutes, 39 seconds
5      1 hour, 42 seconds
6      6 minutes, 7 seconds
7      9 minutes, 7 seconds
8      9 minutes, 9 seconds
9      9 minutes, 9 seconds
10     9 minutes, 9 seconds
How can I convert this column to a number of seconds?
Use pd.Timedelta to parse each entry:
df['duration'] = df['duration'].apply(pd.Timedelta).dt.total_seconds().astype(int)
Output:
>>> df
duration
0 3741
1 3746
2 3761
3 3879
4 3642
5 367
6 547
7 549
8 549
9 549
Here is one solution:
import pandas as pd

def convert_str_to_seconds(string) -> int:
    # Split e.g. "1 hour, 2 minutes, 21 seconds" into its comma-separated parts.
    parts = string.split(', ')
    total_seconds = 0
    for part in parts:
        # The number is the first space-separated token of each part.
        num_value = int(part.split(' ')[0])
        # Substring checks cover both singular and plural unit names.
        if 'hour' in part:
            total_seconds += num_value * 3600
        elif 'minute' in part:
            total_seconds += num_value * 60
        elif 'second' in part:
            total_seconds += num_value
    return total_seconds

df = pd.DataFrame()
durations = ['1 hour, 2 minutes, 21 seconds',
             '1 hour, 2 minutes, 26 seconds',
             '1 hour, 2 minutes, 41 seconds',
             '9 minutes, 7 seconds',
             '31 seconds']
df['duration'] = durations
print(df)

# Replace each string with its total number of seconds.
df['duration'] = [convert_str_to_seconds(cell) for cell in df['duration']]
print(df)
Output:
duration
0 1 hour, 2 minutes, 21 seconds
1 1 hour, 2 minutes, 26 seconds
2 1 hour, 2 minutes, 41 seconds
3 9 minutes, 7 seconds
4 31 seconds
duration
0 3741
1 3746
2 3761
3 547
4 31
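The same parsing can also be sketched with a regular expression instead of splitting on ', '. This is only an illustrative variant, not part of the answer above: the names UNIT_SECONDS, PART_RE and duration_to_seconds are made up for this example, and it assumes the only units present are hours, minutes and seconds, as in the sample data.

```python
import re

# Seconds per unit; the unit names are assumed from the sample data.
UNIT_SECONDS = {"hour": 3600, "minute": 60, "second": 1}

# Matches "<number> <unit>" with an optional plural "s", e.g. "2 minutes".
PART_RE = re.compile(r"(\d+)\s*(hour|minute|second)s?")

def duration_to_seconds(text: str) -> int:
    """Sum every '<number> <unit>' part found in the string."""
    return sum(int(n) * UNIT_SECONDS[unit] for n, unit in PART_RE.findall(text))

samples = [
    "1 hour, 2 minutes, 21 seconds",
    "1 hour, 42 seconds",
    "6 minutes, 7 seconds",
]
print([duration_to_seconds(s) for s in samples])  # [3741, 3642, 367]
```

On a DataFrame it could be applied with df['duration'].map(duration_to_seconds). Because the regex only looks for number-unit pairs, it does not depend on the exact separator between the parts.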