Keeping just the hh:mm:ss from a time delta

I have a column of timedeltas with the attributes listed here. I would like the output in my pandas table to go from:

1 day, 13:54:03.0456

to:

13:54:03

How can I drop the days from this output?

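You can use dt.seconds to get the number of whole seconds within the day (the days and the fractional seconds are dropped), then pass that to pd.Timedelta: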

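import pandas as pd  # needed for pd.Timedelta below
from pandas import Series, date_range
from datetime import timedelta

# four 31-day timedeltas; add 5 minutes 3 seconds to the third one
td = Series(date_range('20130101', periods=4)) - Series(date_range('20121201', periods=4))
td[2] += timedelta(minutes=5, seconds=3)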

In [321]: td
Out[321]: 
0   31 days 00:00:00
1   31 days 00:00:00
2   31 days 00:05:03
3   31 days 00:00:00
dtype: timedelta64[ns]

In [322]: td.dt.seconds.apply(lambda x: pd.Timedelta(seconds=x))
Out[322]: 
0   00:00:00
1   00:00:00
2   00:05:03
3   00:00:00
dtype: timedelta64[ns]
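
If the column is large, the per-element apply can probably be replaced by a single pd.to_timedelta call, and if what you actually need is the text hh:mm:ss (the repr may show a "0 days" prefix depending on the pandas version), you can format the seconds yourself. A minimal sketch, assuming no NaT values in the column; format_hms is a hypothetical helper name, not a pandas function:

```python
import pandas as pd

# Sample data: the value from the question plus one from the session above
td = pd.Series(pd.to_timedelta(['1 days 13:54:03.0456', '31 days 00:05:03']))

# Vectorized equivalent of the apply() call: whole seconds within the day
within_day = pd.to_timedelta(td.dt.seconds, unit='s')

def format_hms(series):
    """Hypothetical helper: render the within-day part as 'hh:mm:ss' strings.

    Assumes the series contains no NaT values.
    """
    labels = []
    for total in series.dt.seconds:
        minutes, seconds = divmod(int(total), 60)
        hours, minutes = divmod(minutes, 60)
        labels.append(f"{hours:02d}:{minutes:02d}:{seconds:02d}")
    return pd.Series(labels, index=series.index)

labels = format_hms(td)
print(labels.tolist())  # ['13:54:03', '00:05:03']
```

Note that the formatted column holds plain strings, so it is only suitable for display, not for further timedelta arithmetic.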

You can subtract the whole days from each Timedelta:

import numpy as np
import pandas as pd

df = pd.DataFrame({'A':pd.to_timedelta(np.linspace(0, 10**6, 10), unit='s')})
df.iloc[::3, 0] = pd.NaT
df['B'] = df['A'] - df['A'].values.astype('timedelta64[D]')
# truncate fractional seconds
df['truncated'] = df['B'].values.astype('timedelta64[s]')
# round to nearest second
df['rounded'] = np.asarray(np.round(df['B'].values / np.timedelta64(1, 's')), dtype='timedelta64[s]')
print(df)

yields

                        A               B  truncated  rounded
0                     NaT             NaT        NaT      NaT
1  1 days 06:51:51.111111 06:51:51.111111   06:51:51 06:51:51
2  2 days 13:43:42.222222 13:43:42.222222   13:43:42 13:43:42
3                     NaT             NaT        NaT      NaT
4  5 days 03:27:24.444444 03:27:24.444444   03:27:24 03:27:24
5  6 days 10:19:15.555556 10:19:15.555556   10:19:15 10:19:16
6                     NaT             NaT        NaT      NaT
7  9 days 00:02:57.777778 00:02:57.777778   00:02:57 00:02:58
8 10 days 06:54:48.888889 06:54:48.888889   06:54:48 06:54:49
9                     NaT             NaT        NaT      NaT

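Column A shows the original Timedeltas. Column B shows the result after subtracting the whole days. The truncated and rounded columns show the result after truncating or rounding the fractional seconds.

Calling astype('timedelta64[D]') truncates NumPy timedelta64s to whole days. Similarly, calling astype('timedelta64[s]') truncates NumPy timedelta64s to whole seconds. For more on datetime64/timedelta64 arithmetic, see the NumPy docs.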
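
To see the two casts in isolation, here is a small NumPy-only sketch using the value from the question (1 day, 13:54:03.0456) written out in microseconds. The variable names are just for illustration:

```python
import numpy as np

# 1 day, 13:54:03.0456 expressed as a count of microseconds
arr = np.array([136443045600], dtype='timedelta64[us]')

# Casting to a coarser unit drops everything below that unit
whole_days = arr.astype('timedelta64[D]')            # 1 day

# Subtracting the whole days leaves the within-day remainder
within_day = arr - whole_days                        # 13:54:03.0456 in microseconds

# Casting the remainder to seconds drops the fractional part
whole_seconds = within_day.astype('timedelta64[s]')  # 50043 s == 13:54:03

print(whole_days[0], within_day[0], whole_seconds[0])
```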


Another way to subtract the days is to use:

df['B'] = df['A'] - pd.to_timedelta(df['A'].dt.days, unit='d')

but it turns out to be slower:

In [72]: df = pd.DataFrame({'A':pd.to_timedelta(np.linspace(0, 10**6, 1000), unit='s')})

In [73]: %timeit df['A'] - df['A'].values.astype('timedelta64[D]')
1000 loops, best of 3: 729 µs per loop

In [74]: %timeit df['A'] - pd.to_timedelta(df['A'].dt.days, unit='d')
100 loops, best of 3: 12.6 ms per loop
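
Before choosing on speed alone, it may be worth confirming that both routes give the same answer on your data. A quick check along these lines, using the same small frame as above; the extra cast back to timedelta64[ns] is a precaution I added here and is not part of the original expression:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': pd.to_timedelta(np.linspace(0, 10**6, 10), unit='s')})
df.iloc[::3, 0] = pd.NaT

# NumPy route: truncate to whole days, then cast back to nanoseconds
# (the second cast is an added precaution, not in the original answer)
whole_days = df['A'].values.astype('timedelta64[D]').astype('timedelta64[ns]')
via_numpy = df['A'] - whole_days

# pandas route: rebuild the whole days from the .dt.days accessor
via_pandas = df['A'] - pd.to_timedelta(df['A'].dt.days, unit='D')

# NaT never compares equal, so compare the missing mask and the values separately
same_missing = (via_numpy.isna() == via_pandas.isna()).all()
same_values = (via_numpy.dropna() == via_pandas.dropna()).all()
print(same_missing, same_values)
```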

Another way to round to the nearest second is:

df['rounded'] = pd.to_timedelta(df['B'].dt.total_seconds().round(), unit='s')

but again this is slower:

In [104]: df = pd.DataFrame({'A':pd.to_timedelta(np.linspace(0, 10**6, 1000), unit='s')})

In [105]: df['B'] = df['A'] - df['A'].values.astype('timedelta64[D]')

In [106]: %timeit np.asarray(np.round(df['B'].values / np.timedelta64(1, 's')), dtype='timedelta64[s]')
10000 loops, best of 3: 27.7 µs per loop

In [107]: %timeit pd.to_timedelta(df['B'].dt.total_seconds().round(), unit='s')
100 loops, best of 3: 3.94 ms per loop
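
If readability matters more than raw speed, the .dt accessor on a timedelta Series also offers floor and round, which could stand in for the NumPy casts when truncating or rounding the seconds. This is a different technique from the ones timed above and I have not benchmarked it; a minimal sketch with made-up sample values:

```python
import pandas as pd

# Within-day timedeltas (column B in the example above), including a missing value
b = pd.Series([pd.Timedelta('13:54:03.0456'), pd.Timedelta('00:00:01.6'), pd.NaT])

# Drop the fractional seconds (for non-negative values this matches truncation)
truncated = b.dt.floor('s')

# Round to the nearest whole second
rounded = b.dt.round('s')

print(pd.DataFrame({'B': b, 'truncated': truncated, 'rounded': rounded}))
```

NaT values pass through both calls unchanged, so no special handling is needed for missing rows.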