Pandas - changing rows where fewer than n subsequent values are equal

I have the following DataFrame:

import pandas as pd
df = pd.DataFrame({"col":[0,0,1,1,1,1,0,0,1,1,0,0,1,1,1,0,1,1,1,1,0,0,0]})

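Now I would like to set to zero every row that belongs to a run of fewer than four consecutive 1s, i.e. I would like to obtain the following DataFrame: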

df = pd.DataFrame({"col":[0,0,1,1,1,1,0,0,0,0,0,0,0,0,0,0,1,1,1,1,0,0,0]})

I could not find a clean way to achieve this...

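Try groupby with where. Label each run of consecutive equal values, sum each run (for a 0/1 column this counts the 1s in it), and keep only the rows whose run sums to at least 4: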

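# A new run starts wherever the value differs from the previous row
streaks = df.groupby(df["col"].ne(df["col"].shift()).cumsum()).transform("sum")
# Keep rows in runs with at least four 1s; set everything else to 0
output = df.where(streaks.ge(4), 0)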

>>> output
    col
0     0
1     0
2     1
3     1
4     1
5     1
6     0
7     0
8     0
9     0
10    0
11    0
12    0
13    0
14    0
15    0
16    1
17    1
18    1
19    1
20    0
21    0
22    0
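
To see why this works, it may help to look at the intermediate run labels. The following is only a small sketch on a shortened column; the names run_id and run_len are illustrative and do not appear in the answer above:

```python
import pandas as pd

df = pd.DataFrame({"col": [0, 1, 1, 0, 1, 1, 1, 1, 0]})

# A new run starts wherever the value differs from the previous row,
# so the cumulative sum of those "change" flags gives one label per run.
run_id = df["col"].ne(df["col"].shift()).cumsum()

# Sum of each run, broadcast back to every row of that run.
# For a 0/1 column this equals the number of 1s in the run.
run_len = df["col"].groupby(run_id).transform("sum")

print(pd.DataFrame({"col": df["col"], "run_id": run_id, "run_len": run_len}))
```

Runs of 0s sum to 0, so they fail the ge(4) test as well, but replacing them with 0 leaves them unchanged.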

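We can also group on the cumulative count of zeros, so that each group is one 0 followed by the 1s after it; a group with fewer than 5 rows then contains fewer than four 1s: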

df.loc[df.groupby(df.col.eq(0).cumsum()).transform('count')['col']<5,'col'] = 0
df
Out[77]: 
    col
0     0
1     0
2     1
3     1
4     1
5     1
6     0
7     0
8     0
9     0
10    0
11    0
12    0
13    0
14    0
15    0
16    1
17    1
18    1
19    1
20    0
21    0
22    0
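
One caveat with counting rows per zero-started group: if the column begins with 1s, that first group has no leading 0, so a run of exactly four 1s would have a count of 4 and be zeroed. A possible way around this, sketched here as a hypothetical helper zero_short_runs (the name and the parameter n are assumptions, not part of either answer), is to measure each run of equal values directly:

```python
import pandas as pd

def zero_short_runs(s: pd.Series, n: int) -> pd.Series:
    """Return a copy of a 0/1 Series with runs of fewer than n 1s set to 0."""
    # Label runs of consecutive equal values.
    run_id = s.ne(s.shift()).cumsum()
    # Length of the run each row belongs to (no NaNs assumed in s).
    run_size = s.groupby(run_id).transform("count")
    # Keep 0s as they are; keep 1s only when their run has at least n rows.
    return s.where(s.eq(0) | run_size.ge(n), 0)

df = pd.DataFrame({"col": [0,0,1,1,1,1,0,0,1,1,0,0,1,1,1,0,1,1,1,1,0,0,0]})
df["col"] = zero_short_runs(df["col"], 4)
print(df["col"].tolist())
```

On the question's data this should give the same result as the answers above, and it does not depend on whether the column starts with a 0 or a 1.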