Stopping a forward fill based on a different string column changing in the DF

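I'm new to Python, so go easy on me!

I have a dataframe like the one below. I want to forward fill the NaNs in the shares_owned column, but stop whenever the string in df['ticker'] changes, and only start filling again once another number appears in shares_owned.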

date ticker shares_owned price
01/01/2020 EZY NaN £2
02/01/2020 EZY 10 £2.1
03/01/2020 EZY NaN £2.12
04/01/2020 EZY NaN £12.5
01/01/2020 FTSE NaN £11
02/01/2020 FTSE NaN £12
03/01/2020 FTSE 2 £12.5
04/01/2020 FTSE NaN £12.5

For example, the output table would look like this:

date ticker shares_owned price
01/01/2020 EZY NaN £2
02/01/2020 EZY 10 £2.1
03/01/2020 EZY 10 £2.12
04/01/2020 EZY 10 £12.5
01/01/2020 FTSE NaN £11
02/01/2020 FTSE NaN £12
03/01/2020 FTSE 2 £12.5
04/01/2020 FTSE 2 £12.5

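So far I've been trying .fillna(method='ffill'), to no avail.

  • What you have described is a grouping, so use groupby() to group by ticker
  • Within each group, forward fill with ffill() (equivalent to fillna(method="ffill")), applied through transform()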
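import io
import pandas as pd

# Rebuild the sample frame from the question; sep=r"\s+" splits on any run of whitespace (tabs or spaces)
df = pd.read_csv(io.StringIO("""date    ticker  shares_owned    price
01/01/2020  EZY NaN £2
02/01/2020  EZY 10  £2.1
03/01/2020  EZY NaN £2.12
04/01/2020  EZY NaN £12.5
01/01/2020  FTSE    NaN £11
02/01/2020  FTSE    NaN £12
03/01/2020  FTSE    2   £12.5
04/01/2020  FTSE    NaN £12.5"""), sep=r"\s+")

# Forward fill inside each ticker group, so a value never carries over into the next ticker.
# s.ffill() does the same as s.fillna(method="ffill"), which newer pandas versions deprecate.
df["shares_owned"] = df.groupby("ticker")["shares_owned"].transform(lambda s: s.ffill())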

Output

date ticker shares_owned price
0 01/01/2020 EZY nan £2
1 02/01/2020 EZY 10 £2.1
2 03/01/2020 EZY 10 £2.12
3 04/01/2020 EZY 10 £12.5
4 01/01/2020 FTSE nan £11
5 02/01/2020 FTSE nan £12
6 03/01/2020 FTSE 2 £12.5
7 04/01/2020 FTSE 2 £12.5
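
As a side note, here is a minimal sketch of why a plain forward fill did not work and of a slightly shorter spelling of the grouped fill. It assumes the same sample data as in the question; the names raw, plain and grouped are only illustrative. A plain ffill() knows nothing about ticker, so the last EZY value leaks into the first FTSE rows, while calling ffill() directly on the groupby object keeps each ticker separate without needing transform().

```python
import io
import pandas as pd

# Sample data copied from the question (whitespace separated)
raw = """date ticker shares_owned price
01/01/2020 EZY NaN £2
02/01/2020 EZY 10 £2.1
03/01/2020 EZY NaN £2.12
04/01/2020 EZY NaN £12.5
01/01/2020 FTSE NaN £11
02/01/2020 FTSE NaN £12
03/01/2020 FTSE 2 £12.5
04/01/2020 FTSE NaN £12.5
"""
df = pd.read_csv(io.StringIO(raw), sep=r"\s+")

# Plain forward fill: ignores ticker, so EZY's 10 spills into the FTSE rows
plain = df["shares_owned"].ffill()

# Grouped forward fill: restarts at every ticker, leading NaNs stay NaN
grouped = df.groupby("ticker")["shares_owned"].ffill()

# Side-by-side comparison of the two results
print(pd.DataFrame({"ticker": df["ticker"], "plain": plain, "grouped": grouped}))
```

Assigning grouped back with df["shares_owned"] = grouped should give the same table as the transform() version above, since the result keeps the original row index.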