Assigning a new value in a new column when a match is found

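I have a survey df, and I want to assign a new value, "existing customer" or "new customer", based on each respondent's answer. For example, if someone gave 3 answers and one of them matches "coca cola", I want to label them as an existing customer. This is the dataframe: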

ID       Question                                         Answer
101005   what brands did you purchase the past 5 months   Coca-Cola or Pepsi or vitamin water
026458   what brands did you purchase the past 5 months   None
045987   what brands did you purchase the past 5 months   Coca-Cola

This is the table I want:

ID       Question                                         Answer                  Buyer_Type
101005   what brands did you purchase the past 5 months   Coca-Cola,Pepsi,fanta   Existing Users
026458   what brands did you purchase the past 5 months   None                    New Buyer
045987   what brands did you purchase the past 5 months   Coca-Cola               Existing Users

I tried this code, but for some reason it shows 101005 as a new buyer, even though this ID says they purchased Coca-Cola in the past:

deux['Buyer_Type'] = deux['answer'].apply(lambda x:'existing buyer' if x == 'Coca-Cola' else 'new buyer') 

For some reason, it does not recognize 101005 as an existing user.

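To add to @Quang Hoang's comment: passing case=False and checking two conditions, one for 'coca' and one for 'cola', makes the solution more flexible for the different kinds of values that can appear in the Answer column, as this example shows: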

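import numpy as np
import pandas as pd
df = pd.DataFrame({'ID':[1,2,3,4],'Answer':['Coca-Cola',None,'coca-cola','cocaCola']})
# na=False treats missing answers as non-matches, so the mask is purely boolean
df['Buyer_Type'] = np.where(df['Answer'].str.contains('coca',case=False,na=False) & df['Answer'].str.contains('cola',case=False,na=False),
                            "Existing user","New buyer")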

Output:

   ID     Answer     Buyer_Type
0   1  Coca-Cola  Existing user
1   2       None      New buyer
2   3  coca-cola  Existing user
3   4   cocaCola  Existing user
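
As a rough sketch of how the same idea might apply to the data in the question, the two conditions could be folded into a single case-insensitive regular expression. The frame name `survey` and the pattern below are illustrative assumptions, not part of the original post, and the IDs are stored as strings only so that the leading zeros survive. The original attempt compared the whole answer with `==`, which only matches when the answer is exactly 'Coca-Cola'; a substring match also catches answers that list several brands.

```python
import numpy as np
import pandas as pd

# Sample rows copied from the question; IDs are strings to keep leading zeros.
survey = pd.DataFrame({
    "ID": ["101005", "026458", "045987"],
    "Answer": ["Coca-Cola or Pepsi or vitamin water", "None", "Coca-Cola"],
})

# Assumed pattern: "coca", then an optional space or hyphen, then "cola".
pattern = r"coca[\s-]?cola"

# case=False ignores capitalisation; na=False counts missing answers as non-matches.
is_existing = survey["Answer"].str.contains(pattern, case=False, na=False)

# Label each row from the boolean mask.
survey["Buyer_Type"] = np.where(is_existing, "Existing Users", "New Buyer")
print(survey)
```

A single pattern is a little stricter than two separate `contains` checks, since it requires 'coca' and 'cola' to appear next to each other rather than anywhere in the answer; which behaviour is preferable depends on how messy the real answers are.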