无法将类型 DateOffset 添加到 TimedeltaArray
cannot add the type DateOffset to a TimedeltaArray
假设数据框 (df) 包含 3 列。
V1 V2 V3
1 0 days 23:09:00 0 days 23:34:00
1 0 days 23:36:00 1 days 00:03:00
1 1 days 00:06:00 1 days 00:29:00
1 1 days 00:31:00 1 days 00:57:00
2 0 days 22:40:00 0 days 23:04:00
2 0 days 23:09:00 0 days 23:35:00
2 0 days 23:37:00 1 days 00:01:00
2 1 days 00:06:00 1 days 00:30:00
2 1 days 00:33:00 1 days 00:56:00
3 0 days 22:50:00 0 days 23:21:09
3 0 days 23:38:56 1 days 00:09:00
3 1 days 00:12:00 1 days 00:42:09
我使用了以下代码:
df['V4']=(df.groupby('V1')['V3'] - df.groupby('V1')['V2'].shift(1)).astype('timedelta64[m]')
本质上,我想对 V1 中的每个唯一值执行操作,结果应如下所示:
V1 V2 V3 V4
1 0 days 23:09:00 0 days 23:34:00 NaN
1 0 days 23:36:00 1 days 00:03:00 54
1 1 days 00:06:00 1 days 00:29:00 53
1 1 days 00:31:00 1 days 00:57:00 51
2 0 days 22:40:00 0 days 23:04:00 NaN
2 0 days 23:09:00 0 days 23:35:00 55
2 0 days 23:37:00 1 days 00:01:00 52
2 1 days 00:06:00 1 days 00:30:00 53
2 1 days 00:33:00 1 days 00:56:00 50
3 0 days 22:50:00 0 days 23:21:09 NaN
3 0 days 23:38:56 1 days 00:09:00 79
3 1 days 00:12:00 1 days 00:42:09 63
收到错误:
Cannot add/subtract non-tick DateOffset to TimedeltaArray
数据类型:
{'V1': {1: 1, 2: 2, 3: 3}, 'V2': {0: Timedelta('0 days 23:09:00'), 1: Timedelta('0 days 23:36:00')}, 'V3': {0: Timedelta('0 days 23:34:00'), 1: Timedelta('1 days 00:03:00')}, 'V4': {0: 54, 1: 53}}
试试这个:
- 对所有行进行减法
- 当V1有变化时,设置为
NaN
df = df.sort_values(["V1", "V2", "V3"])
df["V4"] = (df["V3"]-df["V2"].shift()).dt.seconds//60
df["V4"] = df["V4"].where(df["V1"]==df["V1"].shift())
>>> df
V1 V2 V3 V4
0 1 0 days 23:09:00 0 days 23:34:00 NaN
1 1 0 days 23:36:00 1 days 00:03:00 54.0
2 1 1 days 00:06:00 1 days 00:29:00 53.0
3 1 1 days 00:31:00 1 days 00:57:00 51.0
4 2 0 days 22:40:00 0 days 23:04:00 NaN
5 2 0 days 23:09:00 0 days 23:35:00 55.0
6 2 0 days 23:37:00 1 days 00:01:00 52.0
7 2 1 days 00:06:00 1 days 00:30:00 53.0
8 2 1 days 00:33:00 1 days 00:56:00 50.0
9 3 0 days 22:50:00 0 days 23:21:09 NaN
10 3 0 days 23:38:56 1 days 00:09:00 79.0
11 3 1 days 00:12:00 1 days 00:42:09 63.0
如果你想使用groupby
:
df["V4"] = df.groupby("V1").apply(lambda x: (x["V3"]-x["V2"].shift()).dt.seconds//60).reset_index(drop=True)
>>> df
V1 V2 V3 V4
0 1 0 days 23:09:00 0 days 23:34:00 NaN
1 1 0 days 23:36:00 1 days 00:03:00 54.0
2 1 1 days 00:06:00 1 days 00:29:00 53.0
3 1 1 days 00:31:00 1 days 00:57:00 51.0
4 2 0 days 22:40:00 0 days 23:04:00 NaN
5 2 0 days 23:09:00 0 days 23:35:00 55.0
6 2 0 days 23:37:00 1 days 00:01:00 52.0
7 2 1 days 00:06:00 1 days 00:30:00 53.0
8 2 1 days 00:33:00 1 days 00:56:00 50.0
9 3 0 days 22:50:00 0 days 23:21:09 NaN
10 3 0 days 23:38:56 1 days 00:09:00 79.0
11 3 1 days 00:12:00 1 days 00:42:09 63.0
输入:
df = pd.DataFrame({"V1": [1, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 3],
"V2": [pd.Timedelta("0 days 23:09:00"), pd.Timedelta("0 days 23:36:00"), pd.Timedelta("1 days 00:06:00"), pd.Timedelta("1 days 00:31:00"),
pd.Timedelta("0 days 22:40:00"), pd.Timedelta("0 days 23:09:00"), pd.Timedelta("0 days 23:37:00"), pd.Timedelta("1 days 00:06:00"),
pd.Timedelta("1 days 00:33:00"), pd.Timedelta("0 days 22:50:00"), pd.Timedelta("0 days 23:38:56"), pd.Timedelta("1 days 00:12:00")],
"V3":[pd.Timedelta("0 days 23:34:00"), pd.Timedelta("1 days 00:03:00"), pd.Timedelta("1 days 00:29:00"), pd.Timedelta("1 days 00:57:00"),
pd.Timedelta("0 days 23:04:00"), pd.Timedelta("0 days 23:35:00"), pd.Timedelta("1 days 00:01:00"), pd.Timedelta("1 days 00:30:00"),
pd.Timedelta("1 days 00:56:00"), pd.Timedelta("0 days 23:21:09"), pd.Timedelta("1 days 00:09:00"), pd.Timedelta("1 days 00:42:09")]
})
>>> df
V1 V2 V3
0 1 0 days 23:09:00 0 days 23:34:00
1 1 0 days 23:36:00 1 days 00:03:00
2 1 1 days 00:06:00 1 days 00:29:00
3 1 1 days 00:31:00 1 days 00:57:00
4 2 0 days 22:40:00 0 days 23:04:00
5 2 0 days 23:09:00 0 days 23:35:00
6 2 0 days 23:37:00 1 days 00:01:00
7 2 1 days 00:06:00 1 days 00:30:00
8 2 1 days 00:33:00 1 days 00:56:00
9 3 0 days 22:50:00 0 days 23:21:09
10 3 0 days 23:38:56 1 days 00:09:00
11 3 1 days 00:12:00 1 days 00:42:09
假设数据框 (df) 包含 3 列。
V1 V2 V3
1 0 days 23:09:00 0 days 23:34:00
1 0 days 23:36:00 1 days 00:03:00
1 1 days 00:06:00 1 days 00:29:00
1 1 days 00:31:00 1 days 00:57:00
2 0 days 22:40:00 0 days 23:04:00
2 0 days 23:09:00 0 days 23:35:00
2 0 days 23:37:00 1 days 00:01:00
2 1 days 00:06:00 1 days 00:30:00
2 1 days 00:33:00 1 days 00:56:00
3 0 days 22:50:00 0 days 23:21:09
3 0 days 23:38:56 1 days 00:09:00
3 1 days 00:12:00 1 days 00:42:09
我使用了以下代码:
df['V4']=(df.groupby('V1')['V3'] - df.groupby('V1')['V2'].shift(1)).astype('timedelta64[m]')
本质上,我想对 V1 中的每个唯一值执行操作,结果应如下所示:
V1 V2 V3 V4
1 0 days 23:09:00 0 days 23:34:00 NaN
1 0 days 23:36:00 1 days 00:03:00 54
1 1 days 00:06:00 1 days 00:29:00 53
1 1 days 00:31:00 1 days 00:57:00 51
2 0 days 22:40:00 0 days 23:04:00 NaN
2 0 days 23:09:00 0 days 23:35:00 55
2 0 days 23:37:00 1 days 00:01:00 52
2 1 days 00:06:00 1 days 00:30:00 53
2 1 days 00:33:00 1 days 00:56:00 50
3 0 days 22:50:00 0 days 23:21:09 NaN
3 0 days 23:38:56 1 days 00:09:00 79
3 1 days 00:12:00 1 days 00:42:09 63
收到错误:
Cannot add/subtract non-tick DateOffset to TimedeltaArray
数据类型:
{'V1': {1: 1, 2: 2, 3: 3}, 'V2': {0: Timedelta('0 days 23:09:00'), 1: Timedelta('0 days 23:36:00')}, 'V3': {0: Timedelta('0 days 23:34:00'), 1: Timedelta('1 days 00:03:00')}, 'V4': {0: 54, 1: 53}}
试试这个:
- 对所有行进行减法
- 当V1有变化时,设置为
NaN
df = df.sort_values(["V1", "V2", "V3"])
df["V4"] = (df["V3"]-df["V2"].shift()).dt.seconds//60
df["V4"] = df["V4"].where(df["V1"]==df["V1"].shift())
>>> df
V1 V2 V3 V4
0 1 0 days 23:09:00 0 days 23:34:00 NaN
1 1 0 days 23:36:00 1 days 00:03:00 54.0
2 1 1 days 00:06:00 1 days 00:29:00 53.0
3 1 1 days 00:31:00 1 days 00:57:00 51.0
4 2 0 days 22:40:00 0 days 23:04:00 NaN
5 2 0 days 23:09:00 0 days 23:35:00 55.0
6 2 0 days 23:37:00 1 days 00:01:00 52.0
7 2 1 days 00:06:00 1 days 00:30:00 53.0
8 2 1 days 00:33:00 1 days 00:56:00 50.0
9 3 0 days 22:50:00 0 days 23:21:09 NaN
10 3 0 days 23:38:56 1 days 00:09:00 79.0
11 3 1 days 00:12:00 1 days 00:42:09 63.0
如果你想使用groupby
:
df["V4"] = df.groupby("V1").apply(lambda x: (x["V3"]-x["V2"].shift()).dt.seconds//60).reset_index(drop=True)
>>> df
V1 V2 V3 V4
0 1 0 days 23:09:00 0 days 23:34:00 NaN
1 1 0 days 23:36:00 1 days 00:03:00 54.0
2 1 1 days 00:06:00 1 days 00:29:00 53.0
3 1 1 days 00:31:00 1 days 00:57:00 51.0
4 2 0 days 22:40:00 0 days 23:04:00 NaN
5 2 0 days 23:09:00 0 days 23:35:00 55.0
6 2 0 days 23:37:00 1 days 00:01:00 52.0
7 2 1 days 00:06:00 1 days 00:30:00 53.0
8 2 1 days 00:33:00 1 days 00:56:00 50.0
9 3 0 days 22:50:00 0 days 23:21:09 NaN
10 3 0 days 23:38:56 1 days 00:09:00 79.0
11 3 1 days 00:12:00 1 days 00:42:09 63.0
输入:
df = pd.DataFrame({"V1": [1, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 3],
"V2": [pd.Timedelta("0 days 23:09:00"), pd.Timedelta("0 days 23:36:00"), pd.Timedelta("1 days 00:06:00"), pd.Timedelta("1 days 00:31:00"),
pd.Timedelta("0 days 22:40:00"), pd.Timedelta("0 days 23:09:00"), pd.Timedelta("0 days 23:37:00"), pd.Timedelta("1 days 00:06:00"),
pd.Timedelta("1 days 00:33:00"), pd.Timedelta("0 days 22:50:00"), pd.Timedelta("0 days 23:38:56"), pd.Timedelta("1 days 00:12:00")],
"V3":[pd.Timedelta("0 days 23:34:00"), pd.Timedelta("1 days 00:03:00"), pd.Timedelta("1 days 00:29:00"), pd.Timedelta("1 days 00:57:00"),
pd.Timedelta("0 days 23:04:00"), pd.Timedelta("0 days 23:35:00"), pd.Timedelta("1 days 00:01:00"), pd.Timedelta("1 days 00:30:00"),
pd.Timedelta("1 days 00:56:00"), pd.Timedelta("0 days 23:21:09"), pd.Timedelta("1 days 00:09:00"), pd.Timedelta("1 days 00:42:09")]
})
>>> df
V1 V2 V3
0 1 0 days 23:09:00 0 days 23:34:00
1 1 0 days 23:36:00 1 days 00:03:00
2 1 1 days 00:06:00 1 days 00:29:00
3 1 1 days 00:31:00 1 days 00:57:00
4 2 0 days 22:40:00 0 days 23:04:00
5 2 0 days 23:09:00 0 days 23:35:00
6 2 0 days 23:37:00 1 days 00:01:00
7 2 1 days 00:06:00 1 days 00:30:00
8 2 1 days 00:33:00 1 days 00:56:00
9 3 0 days 22:50:00 0 days 23:21:09
10 3 0 days 23:38:56 1 days 00:09:00
11 3 1 days 00:12:00 1 days 00:42:09