Drop DataFrame Row Based on Existing Condition

Given the following pandas df:

Holding Account        Account Type     Column A  Column B
Rupert 06 (23938996)   Holding Account  1825973   1702598
Rupert 07 (23938996)   Holding Account  1697870   1825973
-                      -                -         -
Caroline 06 (0131465)  Holding Account  11112222  5435450
Caroline 07 (0131465)  Holding Account  7896545   11112222

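I've been trying to find a way to do the following:

1. Check whether each value in Column B also appears in Column A.
2. Where it does, replace that Column B value with the Column B value from the previous row.

This means the pandas df would now look as follows: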

Holding Account        Account Type     Column A  Column B
Rupert 06 (23938996)   Holding Account  1825973   1702598
Rupert 07 (23938996)   Holding Account  1697870   1702598
-                      -                -         -
Caroline 06 (0131465)  Holding Account  11112222  5435450
Caroline 07 (0131465)  Holding Account  7896545   5435450

Code implementation: the code below implements steps 1 and 2:

import numpy as np

# Where a Column B value also appears in Column A, take Column B from the previous row
df['Column B'] = np.where(df['Column B'].isin(df['Column A'].values), df['Column B'].shift(), df['Column B'])
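
As a self-contained check, the sketch below rebuilds the sample data and applies that line. The DataFrame construction is an assumption based on the tables above (the "- - - -" separator row is left out), and the name `matched` is introduced only for readability. Note that `shift()` introduces a NaN in the first row, so Column B comes back as float here.

```python
import numpy as np
import pandas as pd

# Sample data reconstructed from the question (assumption: no separator row).
df = pd.DataFrame({
    "Holding Account": ["Rupert 06 (23938996)", "Rupert 07 (23938996)",
                        "Caroline 06 (0131465)", "Caroline 07 (0131465)"],
    "Account Type": ["Holding Account"] * 4,
    "Column A": [1825973, 1697870, 11112222, 7896545],
    "Column B": [1702598, 1825973, 5435450, 11112222],
})

# Step 1: flag Column B values that also appear somewhere in Column A.
matched = df["Column B"].isin(df["Column A"])

# Step 2: for flagged rows, take Column B from the previous row.
df["Column B"] = np.where(matched, df["Column B"].shift(), df["Column B"])

print(df)
```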

Where I need help: I want to extend the code so that it also drops the rows whose Column A value was matched, leaving:

Holding Account        Account Type     Column A  Column B
Rupert 07 (23938996)   Holding Account  1697870   1702598
Caroline 07 (0131465)  Holding Account  7896545   5435450

Does anyone know how to extend the code accordingly?

Instead of np.where, just compute a couple of boolean masks:

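# Rows whose Column A value appears in Column B: these are dropped at the end
rows_to_remove = df['Column A'].isin(df['Column B'])
# Overwrite each matched Column B value with Column B from the row being dropped (positional, so the order must line up)
df.loc[df['Column B'].isin(df['Column A'].values), 'Column B'] = df.loc[rows_to_remove, 'Column B'].to_numpy()
df = df[~rows_to_remove]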

Output:

>>> df
         Holding Account     Account Type  Column A  Column B
1   Rupert 07 (23938996)  Holding Account   1697870   1702598
3  Caroline 07 (0131465)  Holding Account   7896545   5435450
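
The assignment above pairs rows by position, so it relies on the matched rows and the rows to remove appearing in the same order. If that is not guaranteed, a key-based variant may be safer. The following is only a sketch, not part of the original answer: it swaps the positional assignment for `Series.map`, the names `lookup` and `result` are made up for illustration, and it assumes the Column A values of the rows being removed are unique.

```python
import pandas as pd

# Sample data reconstructed from the question (assumption: no separator row).
df = pd.DataFrame({
    "Holding Account": ["Rupert 06 (23938996)", "Rupert 07 (23938996)",
                        "Caroline 06 (0131465)", "Caroline 07 (0131465)"],
    "Account Type": ["Holding Account"] * 4,
    "Column A": [1825973, 1697870, 11112222, 7896545],
    "Column B": [1702598, 1825973, 5435450, 11112222],
})

# Rows whose Column A value is referenced by some Column B value: dropped at the end.
rows_to_remove = df["Column A"].isin(df["Column B"])

# Lookup: Column A value of a row to remove -> that row's Column B value.
lookup = df.loc[rows_to_remove].set_index("Column A")["Column B"]

# Replace matched Column B values by key; unmatched values map to NaN,
# so fall back to the original value and restore the integer dtype.
df["Column B"] = df["Column B"].map(lookup).fillna(df["Column B"]).astype("int64")

result = df[~rows_to_remove]
print(result)
```

On the sample data this keeps the rows at index 1 and 3, the same rows as in the output above.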