Adding rows after groupby condition is met

Question

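I am trying to find runs of 20 or more consecutive rows with negative values in a column of my data frame. Once the rows are grouped into chunks of 20 or more, I want to append to each chunk the 30 rows of the original data frame that follow it.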

Here is my attempt (with help from a question posted here):

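import pandas as pd

n = df['Slope'].lt(0)  # True where Slope is negative
mask = n.ne(n.shift()).cumsum()[n]  # run label for each negative row
# keep only runs of 20 or more consecutive negative rows
dfL = [g for i, g in df.groupby(mask) if len(g[g['Slope'] < 0]) >= 20]
df_cn = pd.concat(dfL)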

This gives me the chunks of consecutive negative values, but I don't know how to append the corresponding 30 rows after each chunk.

Answer 1

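Next time, please try to provide a minimal reproducible example and a small sample of the desired output.

I created a random df, and your dfL code works fine on it: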

n = df['Slope'].lt(0)
mask = n.ne(n.shift()).cumsum()[n]
dfL = [g for i, g  in df.groupby(mask) if (len(g[g['Slope'] < 0]) >= 20)]
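To see how the run labelling works, here is a minimal sketch on a toy column. The data and the threshold `min_len = 3` (standing in for the 20 used above) are assumptions chosen only to keep the example small.

```python
import pandas as pd

# Toy data (assumed): negative runs of length 3, 1 and 4.
df = pd.DataFrame({'Slope': [-1, -2, -3, 5, -1, 2, -4, -5, -6, -7, 1]})

n = df['Slope'].lt(0)              # True where Slope is negative
run_id = n.ne(n.shift()).cumsum()  # counter that increases whenever the flag changes
mask = run_id[n]                   # keep the labels of negative rows only

# Rows that are missing from mask get a NaN key, and groupby drops them.
min_len = 3  # stands in for the 20 used in the question
chunks = [g for _, g in df.groupby(mask) if len(g) >= min_len]

for g in chunks:
    print(g.index.min(), g.index.max(), len(g))
# 0 2 3
# 6 9 4
```

The single negative value at index 4 forms its own run of length 1 and is filtered out by the length check.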

Building on that, I wrote the following code:

extended = {}  # one DataFrame per chunk, stored under a suffixed key
if not dfL:  # dfL is a list of DataFrames, one per qualifying chunk
    print('no chunks in the df found')
for x, df_cn in enumerate(dfL):
    print(f'Chunk: df_cn_{x} created')  # feedback for testing
    idx = df_cn.index  # index labels of the chunk
    print(f'Chunk from {idx.min()} to {idx.max()} total {len(df_cn)} indexes in the chunk')
    # Next 30 rows of the original df after the last index of the chunk.
    # .loc slices include both endpoints, so +1 to +30 is exactly 30 rows
    # (this assumes df has the default integer index).
    df_rest = df.loc[idx.max() + 1:idx.max() + 30]
    df_cn_ext = pd.concat([df_cn, df_rest])  # stack the chunk and the following rows
    extended[f'df_cn_ext_{x}'] = df_cn_ext  # a dict entry instead of a variable created with exec
    print(f'Dataframe df_cn_ext_{x} created from index {idx.min()} to {idx.max() + 30}')

Please note:

1- Each chunk plus its 30 following rows is kept as a separate DataFrame under a suffixed name (df_cn_ext_<suffix>), here as a key of the dict extended.

2- If a chunk ends close to the end of df, fewer than 30 rows remain after it; in that case the code appends as many rows as are available.

Here is some output from my code:

Chunk: df_cn_0 created
Chunk from 3 to 39 total 37 indexes in the chunk
Dataframe df_cn_ext_0 created from index 3 to 69
Chunk: df_cn_1 created
Chunk from 41 to 66 total 26 indexes in the chunk
Dataframe df_cn_ext_1 created from index 41 to 96
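
The loop above does arithmetic on index labels, so it needs an integer index without gaps. If that does not hold for your data, a position-based variant might look like the sketch below. The function name `extend_negative_runs` and its parameters `col`, `min_len` and `n_after` are hypothetical names introduced for this example, not part of pandas.

```python
import numpy as np
import pandas as pd

def extend_negative_runs(df, col='Slope', min_len=20, n_after=30):
    """Return one DataFrame per qualifying run: the run plus up to n_after following rows."""
    neg = df[col].lt(0)
    run_id = neg.ne(neg.shift()).cumsum().to_numpy()  # label for each run of equal flags
    neg = neg.to_numpy()
    out = []
    for rid in np.unique(run_id[neg]):           # labels of the negative runs only
        pos = np.flatnonzero(run_id == rid)      # row positions of this run
        if len(pos) >= min_len:
            # iloc slices exclude the stop position; clip at the end of df
            stop = min(pos[-1] + 1 + n_after, len(df))
            out.append(df.iloc[pos[0]:stop])
    return out

# Small example with a non-integer index (values assumed for illustration).
values = [1] + [-1] * 5 + [2] * 3 + [-1] * 4 + [3]
demo = pd.DataFrame({'Slope': values}, index=list('abcdefghijklmn'))
parts = extend_negative_runs(demo, min_len=4, n_after=2)
print([len(p) for p in parts])
# [7, 5]
```

The second run ends one row before the end of `demo`, so only one following row is available and the result has 5 rows instead of 6.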

Tags: dataframe, pandas, pandas-groupby