Fill zero values in a Pandas column based on the last non-zero value if a criterion is fulfilled
Consider a Pandas DataFrame test = pd.DataFrame(data = [0, 0, 1, 0, 0, 0, -1, 0, 0, 0, 1, 0, 0], columns = ['holding'])
Output:
+----------+
| holding  |
+----------+
| 0 |
| 0 |
| 1 |
| 0 |
| 0 |
| 0 |
| -1 |
| 0 |
| 0 |
| 0 |
| 1 |
| 0 |
| 0 |
+----------+
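If the last non-zero value is 1, I want to replace all the zeros that follow it with that value. If the last non-zero value is -1, the zeros should not be replaced and stay 0.
I tried test['position_holding'] = test['holding'].replace(to_replace=0, method='ffill')
which results in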
+------------------+
| position_holding |
+------------------+
| 0 |
| 0 |
| 1 |
| 1 |
| 1 |
| 1 |
| -1 |
| -1 |
| -1 |
| -1 |
| 1 |
| 1 |
| 1 |
+------------------+
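The only thing I need to fix in the table above is that the zeros after -1 were filled with -1, which violates the second condition. How can I achieve that?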
Desired Output:
+------------------+
| position_holding |
+------------------+
| 0 |
| 0 |
| 1 |
| 1 |
| 1 |
| 1 |
| -1 |
| 0 |
| 0 |
| 0 |
| 1 |
| 1 |
| 1 |
+------------------+
My approach:
after = test.holding.eq(1)
before = test.holding.eq(-1)
test['pos_holding'] = test.holding.mask(test.holding.where(after|before).ffill()==1,1)
Equivalent code, slightly shorter:
mask = test.holding.where(test.holding != 0).ffill()
test['pos_holding'] = test.holding.mask(mask==1, 1)
Output:
holding pos_holding
0 0 0
1 0 0
2 1 1
3 0 1
4 0 1
5 0 1
6 -1 -1
7 0 0
8 0 0
9 0 0
10 1 1
11 0 1
12 0 1
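For reference, the shorter variant can be wrapped in a small helper and checked against the desired output from the question. This is only a sketch: the function name fill_after_positive is hypothetical, and it assumes the column contains only -1, 0 and 1.

```python
import pandas as pd


def fill_after_positive(holding: pd.Series) -> pd.Series:
    """Replace zeros with 1 while the most recent non-zero value is 1."""
    # Hide the zeros as NaN, then carry the last non-zero value forward.
    # Rows before the first non-zero value stay NaN.
    last_nonzero = holding.where(holding != 0).ffill()
    # Overwrite with 1 wherever the carried value is 1; keep everything else.
    return holding.mask(last_nonzero == 1, 1)


test = pd.DataFrame({"holding": [0, 0, 1, 0, 0, 0, -1, 0, 0, 0, 1, 0, 0]})
test["pos_holding"] = fill_after_positive(test["holding"])
print(test["pos_holding"].tolist())
```

Because NaN == 1 is False, the leading zeros and the zeros after -1 are left alone, which is the second condition from the question.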
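This doesn't use vectorized pandas or numpy operations, but a simple for loop works as well. Note that it overwrites the holding column in place.
for i in range(1, len(test)):
    # A zero right after a 1 becomes 1, so the fill carries forward row by row
    if test['holding'][i] == 0 and test['holding'][i-1] == 1:
        test.loc[i, 'holding'] = 1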
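The same loop idea can also be written against a plain Python list, which avoids element-by-element assignment into the DataFrame and leaves the input untouched. A minimal sketch, assuming the values are only -1, 0 and 1; the function name fill_zeros_after_one is hypothetical:

```python
def fill_zeros_after_one(values):
    """Return a new list in which zeros that follow a 1 are replaced by 1."""
    result = []
    last_nonzero = 0  # no non-zero value seen yet
    for value in values:
        if value != 0:
            last_nonzero = value
        # Fill only while the most recent non-zero value is 1.
        result.append(1 if value == 0 and last_nonzero == 1 else value)
    return result


holding = [0, 0, 1, 0, 0, 0, -1, 0, 0, 0, 1, 0, 0]
print(fill_zeros_after_one(holding))
```

The result could then be assigned back as a new column, for example test['position_holding'] = fill_zeros_after_one(test['holding'].tolist()).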
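This should work (note that recent pandas versions deprecate the method argument of replace):
import pandas as pd
test = pd.DataFrame(data = [0, 0, 1, 0, 0, 0, -1, 0, 0, 0, 1, 0, 0],
columns = ['holding'])
# Forward-fill every zero with the last non-zero value (1 or -1)
test['position_holding'] = test['holding'].replace(to_replace=0, method='ffill')
# Diff is 1 only where an original 0 was filled with -1
test["Diff"] = test["holding"]-test["position_holding"]
test.loc[test["Diff"]==1, 'position_holding']=0
You can then drop the Diff column, which is no longer needed.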