Efficient way of writing multiple conditions for filtering data using loc or iloc

I wrote the code below to filter out records based on a column named 'Document type', which holds about 25 categorical values.

salesdf.loc[(salesdf['Document type'] != 'AVC') &
            (salesdf['Document type'] != 'CC') &
            (salesdf['Document type'] != 'CDI') &
            (salesdf['Document type'] != 'BSX') &
            (salesdf['Document type'] != 'BTR') &
            (salesdf['Document type'] != 'FAF')]

I'm just wondering whether there is a more efficient way to write this code that gives the same output?

I think you need isin, with the condition inverted by ~:

salesdf[~salesdf['Document type'].isin(['AVC', 'CC','CDI', 'BSX','BTR','FAF'])]

Sample:

import pandas as pd

salesdf = pd.DataFrame({
    'Document type': ['AVC','CDI','CC','a','b','FAF','BTR','c','BSX']
})
print (salesdf)
  Document type
0           AVC
1           CDI
2            CC
3             a
4             b
5           FAF
6           BTR
7             c
8           BSX

a = salesdf.loc[(salesdf['Document type'] != 'AVC') &
                (salesdf['Document type'] != 'CC') &
                (salesdf['Document type'] != 'CDI') &
                (salesdf['Document type'] != 'BSX') &
                (salesdf['Document type'] != 'BTR') &
                (salesdf['Document type'] != 'FAF')]

print (a)
  Document type
3             a
4             b
7             c

b = salesdf[~salesdf['Document type'].isin(['AVC', 'CC','CDI', 'BSX','BTR','FAF'])]
print (b)
  Document type
3             a
4             b
7             c
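
To see why the two forms give the same rows, it may help to build the mask in steps. The sketch below reuses the sample data from this answer; the intermediate names in_list, keep and result are only illustrative. It also shows that the same mask can be passed to .loc, which is handy if you want to select columns at the same time.

```python
import pandas as pd

salesdf = pd.DataFrame({
    'Document type': ['AVC', 'CDI', 'CC', 'a', 'b', 'FAF', 'BTR', 'c', 'BSX']
})

# Boolean Series: True where the value is one of the listed types
in_list = salesdf['Document type'].isin(['AVC', 'CC', 'CDI', 'BSX', 'BTR', 'FAF'])

# ~ flips the mask, so True now marks the rows to keep
keep = ~in_list

# .loc accepts the same boolean mask and can also pick columns
result = salesdf.loc[keep, ['Document type']]
print(result)
```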

I would use .isin() with a negation:

toIgnore = ['AVC', 'CC', 'CDI', 'BSX', 'BTR', 'FAF']
salesdf[~salesdf['Document type'].isin(toIgnore)]
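
If the same kind of filter is needed in several places, one option is to wrap it in a small function. This is just a sketch: exclude_values is a hypothetical helper name, not part of pandas, and the sample frame is made up for illustration. The last lines compare the result with the chained != version from the question.

```python
import pandas as pd

def exclude_values(df, column, values):
    """Return the rows of df whose value in `column` is not in `values`."""
    return df[~df[column].isin(values)]

# Made-up sample data for illustration
salesdf = pd.DataFrame({
    'Document type': ['AVC', 'CDI', 'CC', 'a', 'b', 'FAF', 'BTR', 'c', 'BSX']
})
toIgnore = ['AVC', 'CC', 'CDI', 'BSX', 'BTR', 'FAF']

short = exclude_values(salesdf, 'Document type', toIgnore)

# The chained form from the question, for comparison
chained = salesdf.loc[(salesdf['Document type'] != 'AVC') &
                      (salesdf['Document type'] != 'CC') &
                      (salesdf['Document type'] != 'CDI') &
                      (salesdf['Document type'] != 'BSX') &
                      (salesdf['Document type'] != 'BTR') &
                      (salesdf['Document type'] != 'FAF')]

# True when both frames have the same rows, index and values
same = short.equals(chained)
print(same)
```

Keeping the excluded values in a list also means that adding a new type is a one-word change instead of another comparison line.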