Pandas Shift to Non-NaN Value and Check if Duplicate Entry
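I would appreciate help reducing the df.Decision column to a single "Buy" or "Sell" instance per run. For example, if there are 3 "Buy" decisions in a row, whether or not NaN values separate them, I only need to keep the first "Buy". The same logic applies to "Sell".
Current data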
Date | ColA | ColB | ColC | Decision |
---|---|---|---|---|
2018-03-21 | 41.6871345068477 | 39.1196017702354 | 39.8100609746974 | |
2018-03-22 | 41.83569164767 | 39.1196017702354 | 39.8100609746974 | Buy |
2018-04-02 | 42.0277334284587 | 39.5353679158337 | 39.8100609746974 | Buy |
2018-04-30 | 41.0131864593112 | 42.1811215382421 | 40.3090368348783 | |
2018-05-01 | 41.0131864593112 | 42.0844982888835 | 40.3090368348783 | |
2018-05-02 | 41.0131864593112 | 41.9045373766682 | 40.3090368348783 | Buy |
2018-09-28 | 54.0546533518404 | 50.7025748467743 | 48.5804868844005 | |
2018-10-01 | 54.1167056669686 | 50.7652351622538 | 48.5804868844005 | |
2018-10-02 | 54.179969640969 | 50.7993057048438 | 48.5804868844005 | Buy |
2018-10-03 | 54.6021709547574 | 50.8035639654775 | 48.5804868844005 | Buy |
2018-10-04 | 54.6021709547574 | 51.1600610997758 | 48.7459608850365 | |
2018-11-01 | 53.4815867079232 | 53.8788384068764 | 50.8680059009101 | |
2018-11-02 | 53.4012843800357 | 53.8545041548076 | 50.8680059009101 | Sell |
2018-11-05 | 52.5179537180688 | 53.9007386980484 | 50.8680059009101 | Sell |
2018-11-06 | 52.5179537180688 | 54.1130540704967 | 50.8680059009101 | Sell |
2018-11-07 | 52.5179537180688 | 54.2608827598324 | 50.9081548909462 | |
2018-11-08 | 52.381683825919 | 54.6830840736208 | 51.3303562047346 | Sell |
2018-11-09 | 51.9022943297893 | 54.6830840736208 | 51.3303562047346 | Sell |
2018-11-12 | 51.312945372196 | 54.869846946646 | 51.3303562047346 | Sell |
2018-11-13 | 51.0272439215888 | 54.873497352104 | 51.3303562047346 | Sell |
2019-02-28 | 40.0868369032957 | 37.9514787484214 | 42.9921818000566 | |
2019-03-01 | 40.0917199269724 | 37.7384198717488 | 42.9921818000566 | |
2019-03-04 | 40.5566646362643 | 37.6938570296322 | 42.9921818000566 | Buy |
2019-04-23 | 48.1070706672322 | 43.6878883048808 | 40.3077255381675 | Buy |
2019-04-24 | 48.1965810367431 | 43.817865832258 | 40.4377030655446 | |
2019-04-25 | 48.1965810367431 | 43.9423243081189 | 40.5112749854225 | |
2019-04-26 | 48.1965810367431 | 44.0116014371635 | 40.7923506041967 | Buy |
2019-04-29 | 48.1965810367431 | 45.2089733480352 | 41.8874654967458 | |
Expected data
Date | ColA | ColB | ColC | Decision |
---|---|---|---|---|
2018-03-21 | 41.6871345068477 | 39.1196017702354 | 39.8100609746974 | |
2018-03-22 | 41.83569164767 | 39.1196017702354 | 39.8100609746974 | Buy |
2018-04-02 | 42.0277334284587 | 39.5353679158337 | 39.8100609746974 | |
2018-04-30 | 41.0131864593112 | 42.1811215382421 | 40.3090368348783 | |
2018-05-01 | 41.0131864593112 | 42.0844982888835 | 40.3090368348783 | |
2018-05-02 | 41.0131864593112 | 41.9045373766682 | 40.3090368348783 | |
2018-09-28 | 54.0546533518404 | 50.7025748467743 | 48.5804868844005 | |
2018-10-01 | 54.1167056669686 | 50.7652351622538 | 48.5804868844005 | |
2018-10-02 | 54.179969640969 | 50.7993057048438 | 48.5804868844005 | |
2018-10-03 | 54.6021709547574 | 50.8035639654775 | 48.5804868844005 | |
2018-10-04 | 54.6021709547574 | 51.1600610997758 | 48.7459608850365 | |
2018-11-01 | 53.4815867079232 | 53.8788384068764 | 50.8680059009101 | |
2018-11-02 | 53.4012843800357 | 53.8545041548076 | 50.8680059009101 | Sell |
2018-11-05 | 52.5179537180688 | 53.9007386980484 | 50.8680059009101 | |
2018-11-06 | 52.5179537180688 | 54.1130540704967 | 50.8680059009101 | |
2018-11-07 | 52.5179537180688 | 54.2608827598324 | 50.9081548909462 | |
2018-11-08 | 52.381683825919 | 54.6830840736208 | 51.3303562047346 | |
2018-11-09 | 51.9022943297893 | 54.6830840736208 | 51.3303562047346 | |
2018-11-12 | 51.312945372196 | 54.869846946646 | 51.3303562047346 | |
2018-11-13 | 51.0272439215888 | 54.873497352104 | 51.3303562047346 | |
2019-02-28 | 40.0868369032957 | 37.9514787484214 | 42.9921818000566 | |
2019-03-01 | 40.0917199269724 | 37.7384198717488 | 42.9921818000566 | |
2019-03-04 | 40.5566646362643 | 37.6938570296322 | 42.9921818000566 | Buy |
2019-04-23 | 48.1070706672322 | 43.6878883048808 | 40.3077255381675 | |
2019-04-24 | 48.1965810367431 | 43.817865832258 | 40.4377030655446 | |
2019-04-25 | 48.1965810367431 | 43.9423243081189 | 40.5112749854225 | |
2019-04-26 | 48.1965810367431 | 44.0116014371635 | 40.7923506041967 | |
2019-04-29 | 48.1965810367431 | 45.2089733480352 | 41.8874654967458 | |
To solve this, I started with the following logic, but I could not get it to work.
df[df.Decision.notnull()].shift().eq('Buy').Decision
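These are the rows where the decision does not change from the previous row, once the NaN gaps are forward-filled: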
rows = df['Decision'].ffill() == df['Decision'].ffill().shift(1)
Set the decision labels of those rows to NaN:
df.loc[rows, 'Decision'] = np.nan
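For anyone who wants to try the two steps above end to end, here is a minimal self-contained sketch. The helper name `keep_first_of_each_run` and the small sample frame are made up for illustration and are not from the question's data. It assumes the blank cells in Decision are real NaN values rather than empty strings. It uses `Series.mask` instead of `df.loc[...] = np.nan`, which should give the same result without modifying the input in place.

```python
import numpy as np
import pandas as pd


def keep_first_of_each_run(decision: pd.Series) -> pd.Series:
    """Keep only the first label of each Buy/Sell run; NaN gaps do not break a run."""
    # Carry the last seen label forward across the NaN gaps.
    filled = decision.ffill()
    # True where the carried-forward label equals the previous row's label,
    # i.e. the decision did not change (NaN == NaN is False, so leading NaNs stay False).
    unchanged = filled == filled.shift(1)
    # Blank out every row that merely repeats the current decision.
    return decision.mask(unchanged)


# Hypothetical toy data, assuming missing decisions are NaN.
df = pd.DataFrame(
    {"Decision": [np.nan, "Buy", "Buy", np.nan, "Buy", "Sell", np.nan, "Sell", "Buy"]}
)
df["Decision"] = keep_first_of_each_run(df["Decision"])
print(df)
```

If the blanks are empty strings instead of NaN, something like `df["Decision"].replace("", np.nan)` would be needed first so that `ffill` has gaps to fill.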