How to do forward and backward rolling at the same time in pandas?
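I need to check whether the current value is higher than both the next two values and the previous three values. If it is not, the row should carry forward the previous value.
For example, from: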
date | pos |
---|---|
2007-01-01 | 3.0 |
2007-01-02 | 5.0 |
2007-01-03 | 6.0 |
2007-01-10 | 11.0 |
2007-01-11 | 9.0 |
2007-01-20 | 8.0 |
2007-01-21 | 10.0 |
2007-01-22 | 13.0 |
2007-01-23 | 4.0 |
2007-01-27 | 2.0 |
2007-01-28 | 1.0 |
2007-01-29 | 2.0 |
to:
date | pos |
---|---|
2007-01-01 | NA |
2007-01-02 | NA |
2007-01-03 | NA |
2007-01-10 | 11.0 |
2007-01-11 | 11.0 |
2007-01-20 | 11.0 |
2007-01-21 | 11.0 |
2007-01-22 | 13.0 |
2007-01-23 | 13.0 |
2007-01-27 | 13.0 |
2007-01-28 | 13.0 |
2007-01-29 | 13.0 |
I know how to do backward rolling and forward rolling separately, but I can't figure out how to do both at the same time.
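You can use `np.where` to replace the values that do not meet your conditions with `NaN`, then use `ffill` to carry the previous value forward.
In this case, we use an expanding window with a minimum period of 4 for the first condition; for the second condition, we reverse the data and roll with a window of 2.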
    import pandas as pd
    import numpy as np

    df = pd.DataFrame({
        'date': ['2007-01-01', '2007-01-02', '2007-01-03', '2007-01-10',
                 '2007-01-11', '2007-01-20', '2007-01-21', '2007-01-22',
                 '2007-01-23', '2007-01-27', '2007-01-28', '2007-01-29'],
        'pos': [3.0, 5.0, 6.0, 11.0, 9.0, 8.0, 10.0, 13.0, 4.0, 2.0, 1.0, 2.0],
    })

    # First condition: expanding max, defined only once 4 rows are available.
    back_max = df.pos.rolling(len(df), min_periods=4).max()
    # Second condition: rolling over the reversed frame looks ahead;
    # .ge() realigns the result on the original index.
    fwd_max = df.iloc[::-1].pos.rolling(2).max()

    # Keep values that meet both conditions, set the rest to NaN, then forward-fill.
    df['pos'] = np.where(df.pos.ge(back_max) & df.pos.ge(fwd_max), df.pos, np.nan)
    print(df.ffill())
Output:

              date   pos
    0   2007-01-01   NaN
    1   2007-01-02   NaN
    2   2007-01-03   NaN
    3   2007-01-10  11.0
    4   2007-01-11  11.0
    5   2007-01-20  11.0
    6   2007-01-21  11.0
    7   2007-01-22  13.0
    8   2007-01-23  13.0
    9   2007-01-27  13.0
    10  2007-01-28  13.0
    11  2007-01-29  13.0
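
The conditions in the answer reproduce the expected table for this data, but they are somewhat looser than the question's wording: the expanding window compares each row against every earlier row rather than only the previous three, and the reversed window of 2 looks just one row ahead. Below is a minimal sketch of a stricter variant, assuming "previous three" and "next two" are meant literally. The names `n_back`, `n_fwd`, and `is_peak` are hypothetical, and `min_periods=1` on the forward window is an assumption so that rows near the end are judged on whatever rows remain.

```python
import pandas as pd

pos = pd.Series([3.0, 5.0, 6.0, 11.0, 9.0, 8.0, 10.0, 13.0, 4.0, 2.0, 1.0, 2.0])

# Hypothetical parameter names, chosen for this illustration only.
n_back, n_fwd = 3, 2

# Max over the current row and the n_back rows before it
# (NaN until enough history exists, so the first rows never pass).
back_max = pos.rolling(n_back + 1).max()

# Max over the current row and the n_fwd rows after it: roll over the
# reversed series, then reverse again to restore the original order.
# min_periods=1 (an assumption) lets the last rows use a shorter window.
fwd_max = pos[::-1].rolling(n_fwd + 1, min_periods=1).max()[::-1]

# A row is kept when it is the maximum of both windows (ties count as passing).
is_peak = pos.ge(back_max) & pos.ge(fwd_max)

# Blank out the rows that fail, then carry the last kept value forward.
result = pos.where(is_peak).ffill()
print(result)
```

On the sample data only the rows holding 11.0 and 13.0 pass both checks, so the forward-filled result matches the expected table. Because each window includes the current row, `ge` treats a tie with a neighbour as passing; a strictly-greater test would need the windows shifted to exclude the current row.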