Keeping just the hh:mm:ss from a time delta
I have a column of timedeltas with the attributes listed here. I would like the output in my pandas table to go from:
1 day, 13:54:03.0456
to:
13:54:03
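How can I drop the days from this output?
You can use dt.seconds to get the number of seconds within the day (it excludes both the whole days and the sub-second part), then pass that to pd.Timedelta: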
import pandas as pd  # needed for pd.Timedelta below
from pandas import Series, date_range
from datetime import timedelta
td = Series(date_range('20130101',periods=4)) - Series(date_range('20121201',periods=4))
td[2] += timedelta(minutes=5,seconds=3)
In [321]: td
Out[321]:
0 31 days 00:00:00
1 31 days 00:00:00
2 31 days 00:05:03
3 31 days 00:00:00
dtype: timedelta64[ns]
In [322]: td.dt.seconds.apply(lambda x: pd.Timedelta(seconds=x))
Out[322]:
0 00:00:00
1 00:00:00
2 00:05:03
3 00:00:00
dtype: timedelta64[ns]
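The same result can probably be obtained without a per-row lambda. The following is a minimal sketch, not part of the original answer, that assumes pd.to_timedelta accepts the integer Series returned by dt.seconds; the name time_only is chosen here purely for illustration.

```python
import pandas as pd

# Rebuild the sample Series from the answer above.
td = pd.Series(pd.date_range("20130101", periods=4)) - pd.Series(
    pd.date_range("20121201", periods=4)
)
td.iloc[2] += pd.Timedelta(minutes=5, seconds=3)

# dt.seconds holds the seconds within the day (0..86399), so converting it
# back to a timedelta drops the "31 days" part in one vectorized call.
time_only = pd.to_timedelta(td.dt.seconds, unit="s")
print(time_only)
```

Because dt.seconds ignores fractional seconds, this variant truncates rather than rounds.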
Alternatively, you can subtract the whole days from each Timedelta:
import numpy as np
import pandas as pd
df = pd.DataFrame({'A':pd.to_timedelta(np.linspace(0, 10**6, 10), unit='s')})
df.iloc[::3, 0] = pd.NaT
df['B'] = df['A'] - df['A'].values.astype('timedelta64[D]')
# truncate fractional seconds
df['truncated'] = df['B'].values.astype('timedelta64[s]')
# round to nearest second
df['rounded'] = np.asarray(np.round(df['B'].values / np.timedelta64(1, 's')), dtype='timedelta64[s]')
print(df)
yields
                         A                B  truncated   rounded
0                      NaT              NaT        NaT       NaT
1   1 days 06:51:51.111111  06:51:51.111111   06:51:51  06:51:51
2   2 days 13:43:42.222222  13:43:42.222222   13:43:42  13:43:42
3                      NaT              NaT        NaT       NaT
4   5 days 03:27:24.444444  03:27:24.444444   03:27:24  03:27:24
5   6 days 10:19:15.555556  10:19:15.555556   10:19:15  10:19:16
6                      NaT              NaT        NaT       NaT
7   9 days 00:02:57.777778  00:02:57.777778   00:02:57  00:02:58
8  10 days 06:54:48.888889  06:54:48.888889   06:54:48  06:54:49
9                      NaT              NaT        NaT       NaT
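Column A shows the original Timedeltas. Column B shows the result after subtracting the whole days. The truncated and rounded columns show the result after truncating or rounding the fractional seconds.
Calling astype('timedelta64[D]') truncates NumPy timedelta64s to whole days. Similarly, calling astype('timedelta64[s]') truncates NumPy timedelta64s to whole seconds. For more on datetime64/timedelta64 arithmetic, see the NumPy docs.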
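To make the two truncation steps concrete, here is a small NumPy-only sketch using the value from the question (1 day, 13:54:03.0456) expressed in microseconds. The variable names are illustrative assumptions, and only a positive duration is shown.

```python
import numpy as np

# 1 day + 13:54:03.0456 = 136443.0456 s, stored as whole microseconds.
arr = np.array([136_443_045_600], dtype="timedelta64[us]")

# Casting to day resolution discards everything below one day.
whole_days = arr.astype("timedelta64[D]")

# Subtracting mixed units promotes the result to the finer unit (us).
remainder = arr - whole_days

# Casting to second resolution discards the fractional seconds.
whole_seconds = remainder.astype("timedelta64[s]")

print(whole_days[0], remainder[0], whole_seconds[0])
```

Here whole_seconds holds 50043 seconds, which is 13:54:03.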
Another way to subtract the days is:
df['B'] = df['A'] - pd.to_timedelta(df['A'].dt.days, unit='d')
but this turns out to be slower:
In [72]: df = pd.DataFrame({'A':pd.to_timedelta(np.linspace(0, 10**6, 1000), unit='s')})
In [73]: %timeit df['A'] - df['A'].values.astype('timedelta64[D]')
1000 loops, best of 3: 729 µs per loop
In [74]: %timeit df['A'] - pd.to_timedelta(df['A'].dt.days, unit='d')
100 loops, best of 3: 12.6 ms per loop
Another way to round to the nearest second is:
df['rounded'] = pd.to_timedelta(df['B'].dt.total_seconds().round(), unit='s')
but again this is slower:
In [104]: df = pd.DataFrame({'A':pd.to_timedelta(np.linspace(0, 10**6, 1000), unit='s')})
In [105]: df['B'] = df['A'] - df['A'].values.astype('timedelta64[D]')
In [106]: %timeit np.asarray(np.round(df['B'].values / np.timedelta64(1, 's')), dtype='timedelta64[s]')
10000 loops, best of 3: 27.7 µs per loop
In [107]: %timeit pd.to_timedelta(df['B'].dt.total_seconds().round(), unit='s')
100 loops, best of 3: 3.94 ms per loop
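If what is ultimately wanted is the literal text 13:54:03 for display rather than a timedelta column, one possible approach is to build strings from dt.components. This is a sketch that goes beyond the answers above: the sample data is made up for illustration, and it assumes the column contains no NaT values.

```python
import pandas as pd

# Hypothetical sample data: the value from the question plus a shorter one.
td = pd.Series(pd.to_timedelta(["1 days 13:54:03.0456", "0 days 00:05:03"]))

# dt.components splits each Timedelta into days, hours, minutes, seconds, ...
parts = td.dt.components

# Zero-padded hh:mm:ss strings; days and fractional seconds are ignored.
hms = [
    f"{h:02d}:{m:02d}:{s:02d}"
    for h, m, s in zip(parts["hours"], parts["minutes"], parts["seconds"])
]
print(hms)
```

The result is a list of plain strings, so it is suited to presentation only; arithmetic should still be done on the timedelta values.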