Delete rows in a DataFrame based on a condition
Say I have a DataFrame:
import pandas as pd

first_df = pd.DataFrame({"company" : ['abc','def','xyz','lmn','def','xyz'],
                         "art_type": ['300x240','100x600','400x600','300x240','100x600','400x600'],
                         "metrics" : ['imp','rev','cpm','imp','rev','cpm'],
                         "value": [1234,23,0.5,1234,23,0.5]})
# DataFrame.append was removed in pandas 2.0; pd.concat duplicates the rows instead
first_df = pd.concat([first_df, first_df])
I want to remove every row whose company value is in the list ['lmn','xyz'] and store those rows in another DataFrame.
company_list = ['lmn', 'xyz']
I tried:
deleted_data = first_df[first_df['company'] in company_list]
This obviously does not work, because it is a list in a list. Is a for loop the way to go, or is there a better way to do it?
The for loop code:
parts = []
for x in company_list:
    # collect the rows matching each company, then combine them once
    parts.append(first_df[first_df['company'] == x])
deleted_data = pd.concat(parts)
You can filter with isin():
deleted_data = first_df.loc[first_df['company'].isin(company_list)]
>>> deleted_data
  art_type company metrics   value
2  400x600     xyz     cpm     0.5
3  300x240     lmn     imp  1234.0
5  400x600     xyz     cpm     0.5
2  400x600     xyz     cpm     0.5
3  300x240     lmn     imp  1234.0
5  400x600     xyz     cpm     0.5
retained_data = first_df.loc[~first_df['company'].isin(company_list)]
>>> retained_data
  art_type company metrics  value
0  300x240     abc     imp   1234
1  100x600     def     rev     23
4  100x600     def     rev     23
0  300x240     abc     imp   1234
1  100x600     def     rev     23
4  100x600     def     rev     23
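Since both frames come from the same condition, the mask can be computed once and reused. The following is a minimal, self-contained sketch of that idea, not the only way to do it. It assumes a recent pandas, where `pd.concat` is used to duplicate the rows, and the variable name `mask` is chosen here purely for illustration:

```python
import pandas as pd

first_df = pd.DataFrame({
    "company": ["abc", "def", "xyz", "lmn", "def", "xyz"],
    "art_type": ["300x240", "100x600", "400x600", "300x240", "100x600", "400x600"],
    "metrics": ["imp", "rev", "cpm", "imp", "rev", "cpm"],
    "value": [1234, 23, 0.5, 1234, 23, 0.5],
})
# Duplicate the rows, as in the question (pd.concat stands in for the old append).
first_df = pd.concat([first_df, first_df])

company_list = ["lmn", "xyz"]

# One boolean Series: True where the row's company is in company_list.
mask = first_df["company"].isin(company_list)

deleted_data = first_df[mask]    # rows whose company is in the list
retained_data = first_df[~mask]  # every other row
```

Because `~mask` is the exact complement of `mask`, every row of `first_df` ends up in exactly one of the two frames.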