Cannot add the type DateOffset to a TimedeltaArray

Suppose a DataFrame (df) has three columns.

V1  V2              V3
1   0 days 23:09:00 0 days 23:34:00
1   0 days 23:36:00 1 days 00:03:00
1   1 days 00:06:00 1 days 00:29:00
1   1 days 00:31:00 1 days 00:57:00
2   0 days 22:40:00 0 days 23:04:00
2   0 days 23:09:00 0 days 23:35:00
2   0 days 23:37:00 1 days 00:01:00
2   1 days 00:06:00 1 days 00:30:00
2   1 days 00:33:00 1 days 00:56:00
3   0 days 22:50:00 0 days 23:21:09
3   0 days 23:38:56 1 days 00:09:00
3   1 days 00:12:00 1 days 00:42:09

I used the following code:

df['V4']=(df.groupby('V1')['V3'] - df.groupby('V1')['V2'].shift(1)).astype('timedelta64[m]')

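Essentially, I want to perform the operation separately within each unique value of V1 (V3 minus the previous row's V2, in minutes), and the result should look like this: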

V1  V2              V3              V4
1   0 days 23:09:00 0 days 23:34:00 NaN
1   0 days 23:36:00 1 days 00:03:00 54
1   1 days 00:06:00 1 days 00:29:00 53
1   1 days 00:31:00 1 days 00:57:00 51
2   0 days 22:40:00 0 days 23:04:00 NaN
2   0 days 23:09:00 0 days 23:35:00 55
2   0 days 23:37:00 1 days 00:01:00 52
2   1 days 00:06:00 1 days 00:30:00 53
2   1 days 00:33:00 1 days 00:56:00 50
3   0 days 22:50:00 0 days 23:21:09 NaN
3   0 days 23:38:56 1 days 00:09:00 79
3   1 days 00:12:00 1 days 00:42:09 63

I get this error:

Cannot add/subtract non-tick DateOffset to TimedeltaArray

Data types (sample as a dict):

{'V1': {1: 1, 2: 2, 3: 3}, 'V2': {0: Timedelta('0 days 23:09:00'), 1: Timedelta('0 days 23:36:00')}, 'V3': {0: Timedelta('0 days 23:34:00'), 1: Timedelta('1 days 00:03:00')}, 'V4': {0: 54, 1: 53}}

Try this:

  1. Subtract the previous row's V2 from V3 across all rows
  2. Set the result to NaN wherever V1 changes
df = df.sort_values(["V1", "V2", "V3"])
df["V4"] = (df["V3"]-df["V2"].shift()).dt.seconds//60
df["V4"] = df["V4"].where(df["V1"]==df["V1"].shift())

>>> df
    V1              V2              V3    V4
0    1 0 days 23:09:00 0 days 23:34:00   NaN
1    1 0 days 23:36:00 1 days 00:03:00  54.0
2    1 1 days 00:06:00 1 days 00:29:00  53.0
3    1 1 days 00:31:00 1 days 00:57:00  51.0
4    2 0 days 22:40:00 0 days 23:04:00   NaN
5    2 0 days 23:09:00 0 days 23:35:00  55.0
6    2 0 days 23:37:00 1 days 00:01:00  52.0
7    2 1 days 00:06:00 1 days 00:30:00  53.0
8    2 1 days 00:33:00 1 days 00:56:00  50.0
9    3 0 days 22:50:00 0 days 23:21:09   NaN
10   3 0 days 23:38:56 1 days 00:09:00  79.0
11   3 1 days 00:12:00 1 days 00:42:09  63.0
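
One caveat on `.dt.seconds`: it returns only the seconds component of each Timedelta (0 to 86399) and ignores the days part, so it matches the table above only because every within-group gap is shorter than a day. A minimal sketch, assuming gaps might exceed a day, swaps in `.dt.total_seconds()`. The small frame and the names `diff` and `same_group` are made up for illustration and are not part of the original data:

```python
import pandas as pd

# Hypothetical data: the second row of group "a" is more than one day
# after the previous row's V2.
df = pd.DataFrame({
    "V1": ["a", "a", "b", "b"],
    "V2": pd.to_timedelta(["0 days 23:00:00", "1 days 00:10:00",
                           "0 days 01:00:00", "0 days 02:00:00"]),
    "V3": pd.to_timedelta(["0 days 23:30:00", "2 days 00:00:00",
                           "0 days 01:30:00", "0 days 02:45:00"]),
})

# Same two steps as above: subtract across all rows, then mask group changes.
diff = df["V3"] - df["V2"].shift()
same_group = df["V1"] == df["V1"].shift()

# .dt.seconds drops the days part; .dt.total_seconds() keeps it.
component_only = (diff.dt.seconds // 60).where(same_group)
full_minutes = (diff.dt.total_seconds() // 60).where(same_group)

print(pd.DataFrame({"component_only": component_only,
                    "full_minutes": full_minutes}))
```

For the 25-hour gap in the second row, `component_only` reports 60 minutes while `full_minutes` reports 1500, so `total_seconds()` is probably the safer choice when the data is not known to stay under a day.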

If you want to use groupby:

df["V4"] = df.groupby("V1").apply(lambda x: (x["V3"]-x["V2"].shift()).dt.seconds//60).reset_index(drop=True)

>>> df
    V1              V2              V3    V4
0    1 0 days 23:09:00 0 days 23:34:00   NaN
1    1 0 days 23:36:00 1 days 00:03:00  54.0
2    1 1 days 00:06:00 1 days 00:29:00  53.0
3    1 1 days 00:31:00 1 days 00:57:00  51.0
4    2 0 days 22:40:00 0 days 23:04:00   NaN
5    2 0 days 23:09:00 0 days 23:35:00  55.0
6    2 0 days 23:37:00 1 days 00:01:00  52.0
7    2 1 days 00:06:00 1 days 00:30:00  53.0
8    2 1 days 00:33:00 1 days 00:56:00  50.0
9    3 0 days 22:50:00 0 days 23:21:09   NaN
10   3 0 days 23:38:56 1 days 00:09:00  79.0
11   3 1 days 00:12:00 1 days 00:42:09  63.0
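
The `apply` plus `reset_index(drop=True)` version relies on the frame being sorted by V1 and having a default 0..n-1 index, since the result is assigned back by index. A possible alternative, sketched here on the first five rows of groups 1 and 2, shifts only V2 within each group and subtracts it from the plain V3 column. `prev_v2` is an illustrative name, and `total_seconds()` is swapped in for `.dt.seconds` as an assumption about the intended result:

```python
import pandas as pd

# A subset of the question's data (first rows of groups 1 and 2).
df = pd.DataFrame({
    "V1": [1, 1, 2, 2, 2],
    "V2": pd.to_timedelta(["0 days 23:09:00", "0 days 23:36:00",
                           "0 days 22:40:00", "0 days 23:09:00",
                           "0 days 23:37:00"]),
    "V3": pd.to_timedelta(["0 days 23:34:00", "1 days 00:03:00",
                           "0 days 23:04:00", "0 days 23:35:00",
                           "1 days 00:01:00"]),
})

# Shift V2 inside each V1 group; the first row of every group becomes NaT.
# The result keeps the original index, so no reset_index is needed.
prev_v2 = df.groupby("V1")["V2"].shift()

# Subtract Series from Series, then convert the Timedelta to whole minutes.
df["V4"] = (df["V3"] - prev_v2).dt.total_seconds() // 60

print(df)
```

Because `groupby(...).shift()` returns a Series aligned to the original rows, this form should also work when the frame is unsorted or has a non-default index.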

Input:
df = pd.DataFrame({"V1": [1, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 3],
                   "V2": [pd.Timedelta("0 days 23:09:00"), pd.Timedelta("0 days 23:36:00"), pd.Timedelta("1 days 00:06:00"), pd.Timedelta("1 days 00:31:00"),
                          pd.Timedelta("0 days 22:40:00"), pd.Timedelta("0 days 23:09:00"), pd.Timedelta("0 days 23:37:00"), pd.Timedelta("1 days 00:06:00"),
                          pd.Timedelta("1 days 00:33:00"), pd.Timedelta("0 days 22:50:00"), pd.Timedelta("0 days 23:38:56"), pd.Timedelta("1 days 00:12:00")],
                   "V3":[pd.Timedelta("0 days 23:34:00"), pd.Timedelta("1 days 00:03:00"), pd.Timedelta("1 days 00:29:00"), pd.Timedelta("1 days 00:57:00"),
                         pd.Timedelta("0 days 23:04:00"), pd.Timedelta("0 days 23:35:00"), pd.Timedelta("1 days 00:01:00"), pd.Timedelta("1 days 00:30:00"),
                         pd.Timedelta("1 days 00:56:00"), pd.Timedelta("0 days 23:21:09"), pd.Timedelta("1 days 00:09:00"), pd.Timedelta("1 days 00:42:09")]
                   })

>>> df
    V1              V2              V3
0    1 0 days 23:09:00 0 days 23:34:00
1    1 0 days 23:36:00 1 days 00:03:00
2    1 1 days 00:06:00 1 days 00:29:00
3    1 1 days 00:31:00 1 days 00:57:00
4    2 0 days 22:40:00 0 days 23:04:00
5    2 0 days 23:09:00 0 days 23:35:00
6    2 0 days 23:37:00 1 days 00:01:00
7    2 1 days 00:06:00 1 days 00:30:00
8    2 1 days 00:33:00 1 days 00:56:00
9    3 0 days 22:50:00 0 days 23:21:09
10   3 0 days 23:38:56 1 days 00:09:00
11   3 1 days 00:12:00 1 days 00:42:09
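
As an aside, the same input can likely be built more compactly by passing lists of strings to `pd.to_timedelta` rather than wrapping each value in `pd.Timedelta`. This is only a sketch of an equivalent construction; the list names `v2` and `v3` are illustrative:

```python
import pandas as pd

v2 = ["0 days 23:09:00", "0 days 23:36:00", "1 days 00:06:00", "1 days 00:31:00",
      "0 days 22:40:00", "0 days 23:09:00", "0 days 23:37:00", "1 days 00:06:00",
      "1 days 00:33:00", "0 days 22:50:00", "0 days 23:38:56", "1 days 00:12:00"]
v3 = ["0 days 23:34:00", "1 days 00:03:00", "1 days 00:29:00", "1 days 00:57:00",
      "0 days 23:04:00", "0 days 23:35:00", "1 days 00:01:00", "1 days 00:30:00",
      "1 days 00:56:00", "0 days 23:21:09", "1 days 00:09:00", "1 days 00:42:09"]

# pd.to_timedelta parses the whole list at once into timedelta values.
df = pd.DataFrame({
    "V1": [1, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 3],
    "V2": pd.to_timedelta(v2),
    "V3": pd.to_timedelta(v3),
})

print(df.dtypes)
```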