Pandas Shift to Non-NaN Value and Check if Duplicate Entry
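I would appreciate help reducing the df.Decision column so that each run of consecutive "Buy" or "Sell" decisions keeps only a single instance. For example, if there are 3 "Buy" decisions in a row, regardless of whether NaN values separate them, I only need to keep the first "Buy". The same logic applies to "Sell".
Current data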
Date | ColA | ColB | ColC | Decision |
---|---|---|---|---|
2018-03-21 | 41.6871345068477 | 39.1196017702354 | 39.8100609746974 | |
2018-03-22 | 41.83569164767 | 39.1196017702354 | 39.8100609746974 | Buy |
2018-04-02 | 42.0277334284587 | 39.5353679158337 | 39.8100609746974 | Buy |
2018-04-30 | 41.0131864593112 | 42.1811215382421 | 40.3090368348783 | |
2018-05-01 | 41.0131864593112 | 42.0844982888835 | 40.3090368348783 | |
2018-05-02 | 41.0131864593112 | 41.9045373766682 | 40.3090368348783 | Buy |
2018-09-28 | 54.0546533518404 | 50.7025748467743 | 48.5804868844005 | |
2018-10-01 | 54.1167056669686 | 50.7652351622538 | 48.5804868844005 | |
2018-10-02 | 54.179969640969 | 50.7993057048438 | 48.5804868844005 | Buy |
2018-10-03 | 54.6021709547574 | 50.8035639654775 | 48.5804868844005 | Buy |
2018-10-04 | 54.6021709547574 | 51.1600610997758 | 48.7459608850365 | |
2018-11-01 | 53.4815867079232 | 53.8788384068764 | 50.8680059009101 | |
2018-11-02 | 53.4012843800357 | 53.8545041548076 | 50.8680059009101 | Sell |
2018-11-05 | 52.5179537180688 | 53.9007386980484 | 50.8680059009101 | Sell |
2018-11-06 | 52.5179537180688 | 54.1130540704967 | 50.8680059009101 | Sell |
2018-11-07 | 52.5179537180688 | 54.2608827598324 | 50.9081548909462 | |
2018-11-08 | 52.381683825919 | 54.6830840736208 | 51.3303562047346 | Sell |
2018-11-09 | 51.9022943297893 | 54.6830840736208 | 51.3303562047346 | Sell |
2018-11-12 | 51.312945372196 | 54.869846946646 | 51.3303562047346 | Sell |
2018-11-13 | 51.0272439215888 | 54.873497352104 | 51.3303562047346 | Sell |
2019-02-28 | 40.0868369032957 | 37.9514787484214 | 42.9921818000566 | |
2019-03-01 | 40.0917199269724 | 37.7384198717488 | 42.9921818000566 | |
2019-03-04 | 40.5566646362643 | 37.6938570296322 | 42.9921818000566 | Buy |
2019-04-23 | 48.1070706672322 | 43.6878883048808 | 40.3077255381675 | Buy |
2019-04-24 | 48.1965810367431 | 43.817865832258 | 40.4377030655446 | |
2019-04-25 | 48.1965810367431 | 43.9423243081189 | 40.5112749854225 | |
2019-04-26 | 48.1965810367431 | 44.0116014371635 | 40.7923506041967 | Buy |
2019-04-29 | 48.1965810367431 | 45.2089733480352 | 41.8874654967458 | |
Expected data
Date | ColA | ColB | ColC | Decision |
---|---|---|---|---|
2018-03-21 | 41.6871345068477 | 39.1196017702354 | 39.8100609746974 | |
2018-03-22 | 41.83569164767 | 39.1196017702354 | 39.8100609746974 | Buy |
2018-04-02 | 42.0277334284587 | 39.5353679158337 | 39.8100609746974 | |
2018-04-30 | 41.0131864593112 | 42.1811215382421 | 40.3090368348783 | |
2018-05-01 | 41.0131864593112 | 42.0844982888835 | 40.3090368348783 | |
2018-05-02 | 41.0131864593112 | 41.9045373766682 | 40.3090368348783 | |
2018-09-28 | 54.0546533518404 | 50.7025748467743 | 48.5804868844005 | |
2018-10-01 | 54.1167056669686 | 50.7652351622538 | 48.5804868844005 | |
2018-10-02 | 54.179969640969 | 50.7993057048438 | 48.5804868844005 | |
2018-10-03 | 54.6021709547574 | 50.8035639654775 | 48.5804868844005 | |
2018-10-04 | 54.6021709547574 | 51.1600610997758 | 48.7459608850365 | |
2018-11-01 | 53.4815867079232 | 53.8788384068764 | 50.8680059009101 | |
2018-11-02 | 53.4012843800357 | 53.8545041548076 | 50.8680059009101 | Sell |
2018-11-05 | 52.5179537180688 | 53.9007386980484 | 50.8680059009101 | |
2018-11-06 | 52.5179537180688 | 54.1130540704967 | 50.8680059009101 | |
2018-11-07 | 52.5179537180688 | 54.2608827598324 | 50.9081548909462 | |
2018-11-08 | 52.381683825919 | 54.6830840736208 | 51.3303562047346 | |
2018-11-09 | 51.9022943297893 | 54.6830840736208 | 51.3303562047346 | |
2018-11-12 | 51.312945372196 | 54.869846946646 | 51.3303562047346 | |
2018-11-13 | 51.0272439215888 | 54.873497352104 | 51.3303562047346 | |
2019-02-28 | 40.0868369032957 | 37.9514787484214 | 42.9921818000566 | |
2019-03-01 | 40.0917199269724 | 37.7384198717488 | 42.9921818000566 | |
2019-03-04 | 40.5566646362643 | 37.6938570296322 | 42.9921818000566 | Buy |
2019-04-23 | 48.1070706672322 | 43.6878883048808 | 40.3077255381675 | |
2019-04-24 | 48.1965810367431 | 43.817865832258 | 40.4377030655446 | |
2019-04-25 | 48.1965810367431 | 43.9423243081189 | 40.5112749854225 | |
2019-04-26 | 48.1965810367431 | 44.0116014371635 | 40.7923506041967 | |
2019-04-29 | 48.1965810367431 | 45.2089733480352 | 41.8874654967458 | |
To solve this, I started with the following logic, but I can't get it to work.
df[df.Decision.notnull()].shift().eq('Buy').Decision
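Forward-fill the decisions so the NaN gaps carry the last decision, then compare each row with the previous one. These are the rows where the decision does not change: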
rows = df['Decision'].ffill() == df['Decision'].ffill().shift(1)
Set the Decision label of those rows to NaN (this assumes numpy is imported as np):
df.loc[rows, 'Decision'] = np.nan
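To check the approach end to end, here is a minimal self-contained sketch. The small `Decision` column is made-up sample data that mimics the pattern in the question (repeated labels separated by NaN); the other columns are left out because they do not affect the logic.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: repeated decisions with NaN gaps in between.
df = pd.DataFrame({
    "Decision": [np.nan, "Buy", "Buy", np.nan, "Buy", np.nan,
                 "Sell", "Sell", np.nan, "Sell", "Buy", np.nan, "Buy"]
})

# Carry the last seen decision forward across the NaN gaps.
filled = df["Decision"].ffill()

# True where the (forward-filled) decision equals the previous row's,
# i.e. the row does not start a new Buy/Sell run. Leading NaN rows
# compare as False because NaN never equals NaN.
rows = filled == filled.shift(1)

# Blank out everything except the first label of each run.
df.loc[rows, "Decision"] = np.nan

print(df["Decision"].dropna())
# 1      Buy
# 6     Sell
# 10     Buy
```

Rows that were already NaN can also be flagged in `rows`, but setting them to NaN again changes nothing, so no extra filtering is needed.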