How to merge rows with a specific id with the previous row in pandas?
I have a time series DataFrame with various values:
ID  TimeString           value1  value2  StampDif
0   2021-02-10 17:30:39  0.5     5.2     NaT
1   2021-02-10 17:33:39  0.7     5.5     0 days 00:03:00
2   2021-02-10 17:36:40  0.9     5.5     0 days 00:03:01
3   2021-02-10 17:39:40  0.6     5.4     0 days 00:03:00
4   2021-02-10 17:42:40  0.8     5.0     0 days 00:00:01
...
Now I want to merge every row whose StampDif is 1 second with the previous row, using the mean. I tried:
secdf = df[df["StampDif"] <= pd.Timedelta(1, "sec")]
for idx, row in secdf.iterrows():
    df.iloc[idx-1, dfnanpv.columns != ["TimeString", "StampDif"]] = df.iloc[idx-1:idx+1].mean(axis=0)
But it throws an error: 'Shapes must match', (25,), (2,)
So, for the example above, I would like row 3 to end up like this:
ID  TimeString           value1  value2  StampDif
3   2021-02-10 17:39:40  0.7     5.2     0 days 00:03:00
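On the error itself: a likely cause is the column mask. `dfnanpv.columns != ["TimeString", "StampDif"]` compares an Index of 25 column names element-wise against a 2-item list, and the two cannot line up, which matches the (25,) versus (2,) shapes in the message. Below is a minimal sketch of a mask that avoids this, assuming a small stand-in frame. The data and the names `value_mask` and `value_columns` are made up for illustration.

```python
import pandas as pd

# Hypothetical two-row stand-in for the frame in the question.
df = pd.DataFrame(
    {
        "TimeString": pd.to_datetime(["2021-02-10 17:39:40", "2021-02-10 17:42:40"]),
        "value1": [0.6, 0.8],
        "value2": [5.4, 5.0],
        "StampDif": pd.to_timedelta(["0 days 00:03:00", "0 days 00:00:01"]),
    }
)

# isin() tests each column name for membership, so the mask has one entry
# per column no matter how many names are excluded.
value_mask = ~df.columns.isin(["TimeString", "StampDif"])
value_columns = df.columns[value_mask]

# Average rows 0 and 1 over the value columns only and write the result into row 0.
df.loc[0, value_columns] = df.loc[0:1, value_columns].mean().to_numpy()
```

This only addresses the mask. Looping over rows still gets awkward when several 1-second rows follow each other, which is where a groupby is easier.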
Try this:
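exclude_columns = ['StampDif']  # columns to leave out of the averaged result
# Rows with StampDif <= 1 second inherit the previous row's group label.
group_id = (df['StampDif'] > pd.Timedelta(1, 'second')).cumsum()
new_df = (
    df.groupby(group_id)
    .agg({col: 'mean' for col in df.columns.difference(exclude_columns)})
    .reset_index(drop=True)
)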
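To see why the grouping key in the snippet above merges the 1-second row into its predecessor, it can help to look at the labels it produces. This is a small sketch using only the five StampDif values from the question. The names `stamp_dif`, `starts_new_group`, and `group_id` are illustrative.

```python
import pandas as pd

# The StampDif column from the question, rebuilt by hand.
stamp_dif = pd.Series(
    pd.to_timedelta([pd.NaT, "00:03:00", "00:03:01", "00:03:00", "00:00:01"])
)

# True where a row is more than 1 second after the previous one.
# NaT compares as False, so the first row falls into group 0.
starts_new_group = stamp_dif > pd.Timedelta(seconds=1)

# The running count of True values gives each row a group label; a False row
# repeats the label of the row before it.
group_id = starts_new_group.cumsum()
print(group_id.tolist())
```

Rows 3 and 4 share a label, so `groupby` averages them into a single row.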
Output:
>>> new_df
TimeString value1 value2
0 2021-02-10 17:30:39 0.5 5.2
1 2021-02-10 17:33:39 0.7 5.5
2 2021-02-10 17:36:40 0.9 5.5
3 2021-02-10 17:41:10 0.7 5.2
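Note that averaging every column also averages TimeString (17:41:10 above), whereas the desired row in the question keeps the earlier timestamp and its StampDif. If that is what is wanted, one possible variant is to average only the value columns and take the first entry of the others. This is a sketch that assumes the sample data from the question. The `group` key name and the `merged` variable are illustrative.

```python
import pandas as pd

# Sample data from the question, rebuilt by hand.
df = pd.DataFrame(
    {
        "TimeString": pd.to_datetime(
            [
                "2021-02-10 17:30:39",
                "2021-02-10 17:33:39",
                "2021-02-10 17:36:40",
                "2021-02-10 17:39:40",
                "2021-02-10 17:42:40",
            ]
        ),
        "value1": [0.5, 0.7, 0.9, 0.6, 0.8],
        "value2": [5.2, 5.5, 5.5, 5.4, 5.0],
    }
)
df["StampDif"] = pd.to_timedelta(
    [pd.NaT, "00:03:00", "00:03:01", "00:03:00", "00:00:01"]
)

# Same grouping idea as in the answer: a row starts a new group only when it is
# more than 1 second after the previous one.
group_id = (df["StampDif"] > pd.Timedelta(seconds=1)).cumsum().rename("group")

# Mean for the measurements, first entry for the timestamp columns.
merged = (
    df.groupby(group_id)
    .agg(
        {
            "TimeString": "first",
            "value1": "mean",
            "value2": "mean",
            "StampDif": "first",
        }
    )
    .reset_index(drop=True)
)
print(merged)
```

With the sample data, row 3 of `merged` keeps 17:39:40 and a StampDif of 3 minutes, with the two value columns averaged.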