Convert string duration column to seconds

I have a dataframe in which one column holds a duration, given as a string.

index         duration
1     1 hour, 2 minutes, 21 seconds
2     1 hour, 2 minutes, 26 seconds
3     1 hour, 2 minutes, 41 seconds
4     1 hour, 4 minutes, 39 seconds
5                1 hour, 42 seconds
6              6 minutes, 7 seconds
7              9 minutes, 7 seconds
8              9 minutes, 9 seconds
9              9 minutes, 9 seconds
10             9 minutes, 9 seconds

How can I convert this column to seconds?

Use pd.Timedelta to parse each entry, then take the total number of seconds:

df['duration'] = df['duration'].apply(pd.Timedelta).dt.total_seconds().astype(int)

Output:

>>> df
   duration
0      3741
1      3746
2      3761
3      3879
4      3642
5       367
6       547
7       549
8       549
9       549
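
For anyone who wants to run the one-liner above in isolation, here is a minimal self-contained sketch. It assumes a few of the sample rows from the question are loaded into a DataFrame; how `df` is built, and the variable name `seconds`, are assumptions, since the question does not show them. The result goes into a separate Series so the original strings stay available for comparison.

```python
import pandas as pd

# Sample rows copied from the question; building the frame this way is an assumption.
df = pd.DataFrame({
    "duration": [
        "1 hour, 2 minutes, 21 seconds",
        "1 hour, 42 seconds",
        "6 minutes, 7 seconds",
        "9 minutes, 9 seconds",
    ]
})

# Parse each string into a Timedelta, then convert to whole seconds.
seconds = df["duration"].apply(pd.Timedelta).dt.total_seconds().astype(int)
print(seconds.tolist())
```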

Here is one solution:

import pandas as pd

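def convert_str_to_seconds(string: str) -> int:
    # Split e.g. "1 hour, 2 minutes, 21 seconds" into its comma-separated parts
    parts = string.split(', ')
    total_seconds = 0
    for part in parts:
        # Each part looks like "<number> <unit>", e.g. "2 minutes"
        num_value = int(part.split(' ')[0])
        if 'hour' in part:
            total_seconds += num_value * 3600
        elif 'minute' in part:
            total_seconds += num_value * 60
        elif 'second' in part:
            total_seconds += num_value
    return total_seconds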

df = pd.DataFrame()
durations = ['1 hour, 2 minutes, 21 seconds',
             '1 hour, 2 minutes, 26 seconds',
             '1 hour, 2 minutes, 41 seconds',
             '9 minutes, 7 seconds',
             '31 seconds']

df['duration'] = durations
print(df)
df['duration'] = [convert_str_to_seconds(cell) for cell in df['duration']]
print(df)

Output:

                        duration
0  1 hour, 2 minutes, 21 seconds
1  1 hour, 2 minutes, 26 seconds
2  1 hour, 2 minutes, 41 seconds
3           9 minutes, 7 seconds
4                     31 seconds
   duration
0      3741
1      3746
2      3761
3       547
4        31
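
The same unit-by-unit arithmetic can also be done column-wise instead of row by row. The sketch below is one possible variant, not part of the answer above: it assumes every component is written as a whole number followed by the unit name, and the helper name `duration_to_seconds` is hypothetical. It pulls each unit out with `Series.str.extract` and treats a missing unit as zero.

```python
import pandas as pd

def duration_to_seconds(s: pd.Series) -> pd.Series:
    """Hypothetical helper: sum hours, minutes and seconds found in each string."""
    total = pd.Series(0, index=s.index, dtype="int64")
    for unit, factor in (("hour", 3600), ("minute", 60), ("second", 1)):
        # Capture the digits directly before the unit name; NaN if the unit is absent.
        value = s.str.extract(rf"(\d+)\s*{unit}", expand=False)
        total += pd.to_numeric(value).fillna(0).astype("int64") * factor
    return total

df = pd.DataFrame({
    "duration": [
        "1 hour, 2 minutes, 21 seconds",
        "1 hour, 42 seconds",
        "6 minutes, 7 seconds",
        "31 seconds",
    ]
})
result = duration_to_seconds(df["duration"])
print(result.tolist())
```

This avoids a Python-level loop over the rows, at the cost of scanning the column once per unit.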