Assigning a new value by creating a new column when a match is found
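I have a survey df, and I want to assign a new value, "existing customer" or "new customer", based on each respondent's answers. For example, if someone gave 3 answers and one of them matches "coca cola", I want to label them as an existing customer.
Here is the dataframe: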
ID      Question                                        Answer
101005  what brands did you purchase the past 5 months  Coca-Cola or Pepsi or vitamin water
026458  what brands did you purchase the past 5 months  None
045987  what brands did you purchase the past 5 months  Coca-Cola
This is the table I want:
ID      Question                                        Answer                 Buyer_Type
101005  what brands did you purchase the past 5 months  Coca-Cola,Pepsi,fanta  Existing Users
026458  what brands did you purchase the past 5 months  None                   New Buyer
045987  what brands did you purchase the past 5 months  Coca-Cola              Existing Users
I tried this code, but for some reason it shows 101005 as a new buyer, even though this ID says they purchased Coca-Cola in the past:
deux['Buyer_Type'] = deux['answer'].apply(lambda x:'existing buyer' if x == 'Coca-Cola' else 'new buyer')
For some reason, it does not recognize 101005 as an existing user.
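To add to @Quang Hoang's comment: adding case=False and two conditions, one for coca and one for cola, makes the solution more flexible for the different kinds of values in the Answer column, as this example shows: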
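import numpy as np
import pandas as pd
df = pd.DataFrame({'ID': [1, 2, 3, 4], 'Answer': ['Coca-Cola', None, 'coca-cola', 'cocaCola']})
# na=False makes missing answers count as non-matches instead of propagating NaN
df['Buyer_Type'] = np.where(df['Answer'].str.contains('coca', case=False, na=False) & df['Answer'].str.contains('cola', case=False, na=False),
                            "Existing user", "New buyer")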
Output:
   ID     Answer     Buyer_Type
0   1  Coca-Cola  Existing user
1   2       None      New buyer
2   3  coca-cola  Existing user
3   4   cocaCola  Existing user
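
To tie this back to the question's data, here is a minimal sketch of why the `x == 'Coca-Cola'` comparison misses ID 101005 while a substring check catches it. The frame name `survey` and the variables `exact` and `contains` are illustrative, not from the original post. The IDs are stored as strings on the assumption that the leading zeros matter.

```python
import numpy as np
import pandas as pd

# Data copied from the question; IDs kept as strings to preserve leading zeros.
survey = pd.DataFrame({
    "ID": ["101005", "026458", "045987"],
    "Answer": ["Coca-Cola or Pepsi or vitamin water", None, "Coca-Cola"],
})

# Exact equality is True only when the whole cell is exactly "Coca-Cola",
# so the multi-brand answer for 101005 does not match.
exact = survey["Answer"] == "Coca-Cola"

# Substring match: True when the brand appears anywhere in the cell.
# regex=False treats the pattern literally; na=False maps missing answers to False.
contains = survey["Answer"].str.contains("coca-cola", case=False, regex=False, na=False)

survey["Buyer_Type"] = np.where(contains, "Existing Users", "New Buyer")
print(survey)
```

Here `exact` is True only for 045987, whereas `contains` is True for both 101005 and 045987. A single literal pattern like this is stricter than the two-condition version above: it would not match a spelling such as `cocaCola`, so which one to use depends on how messy the answers are.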