What is the pandas ix or iloc multiple condition syntax?

I am working with the government's Office of Foreign Assets Control (OFAC) list:

https://www.treasury.gov/ofac/downloads/sdn.csv

Column 2 (counting from 0) indicates whether a row describes an individual, a business (-0-), an aircraft, or a vessel.

If I want the rows where that column equals either 'individual' or '-0-', what is the correct syntax? The code below only works for the single value 'individual':

import pandas as pd

name_orig = pd.read_csv('http://www.treasury.gov/ofac/downloads/sdn.csv', sep=',', header=None)

name_orig.rename(columns={0: 'id', 1: 'names', 2: 'individual', 11: 'sdn_info'}, inplace=True)

names = name_orig.ix[name_orig.individual == 'individual', ['id', 'names', 'individual', 'sdn_info']]

This does not seem to work:

names = name_orig.ix[name_orig.individual == 'individual' | name_orig.individual == '-0-' , ['id', 'names', 'individual', 'sdn_info']]

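The mask is missing parentheses. The | operator binds more tightly than ==, so each comparison has to be wrapped in its own parentheses: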

names = name_orig.ix[(name_orig.individual == 'individual') | (name_orig.individual == '-0-'), ['id', 'names', 'individual', 'sdn_info']]

Or, in recent versions of pandas, where .ix is deprecated, use .loc:

names = name_orig.loc[(name_orig.individual == 'individual') | (name_orig.individual == '-0-'), ['id', 'names', 'individual', 'sdn_info']]
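As a minimal sketch of the same idea on a small table: the rows below are invented for illustration and are not taken from the OFAC file; only the column names follow the question. It shows the parenthesized mask with .loc, plus Series.isin as an alternative way to test for several values without using | at all.

```python
import pandas as pd

# Toy stand-in for the OFAC data; these rows are made up for illustration.
df = pd.DataFrame({
    "id": [1, 2, 3, 4],
    "names": ["A", "B", "C", "D"],
    "individual": ["individual", "-0-", "vessel", "aircraft"],
})
cols = ["id", "names", "individual"]

# Each comparison gets its own parentheses because | binds tighter than ==.
mask = (df["individual"] == "individual") | (df["individual"] == "-0-")
by_mask = df.loc[mask, cols]

# Series.isin performs the same "any of these values" test in one call.
by_isin = df.loc[df["individual"].isin(["individual", "-0-"]), cols]

print(by_mask)
```

Both selections keep the same two rows; isin tends to read better once there are more than two values to match.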

There is also loc/iloc, which gives you the result you want:

names = name_orig[['id', 'names', 'individual', 'sdn_info']].loc[(name_orig['individual'] == 'individual') | (name_orig['individual'] == '-0-')]

Apparently, if you run

name_orig.individual.unique()

the output is:

array(['-0- ', 'individual', 'vessel', 'aircraft', nan], dtype=object)

The -0- value has a trailing space. I think this will work:

names = name_orig.ix[((name_orig.individual == 'individual') | (name_orig.individual == '-0- ')), ['id', 'names', 'individual', 'sdn_info']]
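Rather than hard-coding the trailing space, one option is to strip whitespace from the column before comparing. The sketch below assumes the values look like the unique() output above; the DataFrame itself is a made-up stand-in, not the real file.

```python
import pandas as pd

# Invented rows that mimic the values reported by unique(), including
# the trailing space on '-0- ' and a missing value.
df = pd.DataFrame({
    "id": [1, 2, 3, 4, 5],
    "individual": ["-0- ", "individual", "vessel", "aircraft", None],
})

# Strip surrounding whitespace so '-0- ' compares equal to '-0-'.
kind = df["individual"].str.strip()

# Missing values do not match any entry, so they drop out of the result.
wanted = df.loc[kind.isin(["individual", "-0-"])]

print(wanted)
```

The original column is left untouched here; assign the stripped Series back to df["individual"] if the cleaned values should persist.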