Filter out rows with common field where at least one fulfills a condition
I have data like this:
Task | ID | Status |
---|---|---|
Task1 | 123 | Open |
Task2 | 123 | Closed |
Task3 | 211 | Closed |
Task4 | 211 | Closed |
Task5 | 564 | Closed |
Task6 | 994 | Open |
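I want to remove every row whose ID is shared with a row that has Status 'Open'. In other words, I want to drop all IDs that have at least one 'Open' row.
The final result should look like this: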
Task | ID | Status |
---|---|---|
Task3 | 211 | Closed |
Task4 | 211 | Closed |
Task5 | 564 | Closed |
Data:
{'Task': ['Task1', 'Task2', 'Task3', 'Task4', 'Task5', 'Task6'],
'ID': [123, 123, 211, 211, 564, 994],
'Status': ['Open', 'Closed', 'Closed', 'Closed', 'Closed', 'Open']}
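We can build a boolean filter from the 'Open' status with groupby + cummax.
The idea is that if any row for an ID has an 'Open' status, we mark the rows with that ID as True, and then filter out all of those rows: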
out = df[~df['Status'].eq('Open').groupby(df['ID']).cummax()]
Output:
    Task   ID  Status
2  Task3  211  Closed
3  Task4  211  Closed
4  Task5  564  Closed
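One thing worth keeping in mind: cummax only propagates forward, so it flags the 'Open' row and the rows after it within the same ID. That is enough for the sample data, where each 'Open' row comes first in its group. If an 'Open' row could appear after a 'Closed' row with the same ID, a variant that does not depend on row order may be safer. The following is a minimal sketch, assuming the data is loaded into a DataFrame named df as above; the name has_open is only illustrative. It swaps cummax for transform('any'), which broadcasts "this group contains at least one 'Open'" to every row of the group.

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({
    'Task': ['Task1', 'Task2', 'Task3', 'Task4', 'Task5', 'Task6'],
    'ID': [123, 123, 211, 211, 564, 994],
    'Status': ['Open', 'Closed', 'Closed', 'Closed', 'Closed', 'Open'],
})

# True for every row whose ID group contains at least one 'Open' row,
# no matter where that row sits inside the group.
has_open = df['Status'].eq('Open').groupby(df['ID']).transform('any')

# Keep only the rows whose ID never appears with an 'Open' status.
out = df[~has_open]
print(out)
```

On the sample data this keeps the same three rows as the cummax version (Task3, Task4, Task5), and it gives the same answer if the rows are shuffled.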