pandas if-else conditions for multiple columns using a DataFrame

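I have a DataFrame, and I want to use apply or a lambda function on its string column values to apply if-else conditions to the columns. I have tried iterating with a for loop.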

      Input Dataframe
      text1                                        output_column
     ['bread','bread','bread']                      ['bread'] --> [if a value occurs 2 or more times, select it]
     ['bread','butter','jam']                       ['butter'] --> [if all 3 values are unique, select the value at index 1 as output]
     ['bread','jam','jam']                          ['jam'] --> [if a value occurs 2 or more times, select it]
     ['unknown']                                    ['unknown'] --> [if any value is blank or null, mark it as 'unknown']
     

         ################## I tried the lines of code below #########

         output_column = []
         df_value = df[['text_col1','text_col2','text_col3']].values.tolist()
         if np.all(df_value <= 1):
             output_column.append(df_value[1])
         else:
             output_column.append(max_count[np.argmax(df_value)])


      Output Dataframe
      text1                                        output_column
     ['bread','bread','bread']                      ['bread'] 
     ['bread','butter','jam']                       ['butter']
     ['bread','jam','jam']                          ['jam']
     ['unknown']                                    ['unknown']
import pandas as pd

df = pd.DataFrame({'text1': [['bread', 'bread', 'bread'],
                             ['bread', 'butter', 'jam'],
                             ['bread', 'jam', 'jam'],
                             ['unknown']]})
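
The attempt in the question reads from three separate columns rather than from a single list column. If your data looks like that, here is a minimal sketch of building the `text1` list column from them. The column names `text_col1` to `text_col3` are taken from the question's code, and the frame `wide` and its sample values are made up for illustration:

```python
import pandas as pd

# Hypothetical wide layout; column names follow the question's attempt.
wide = pd.DataFrame({
    'text_col1': ['bread', 'bread', 'bread'],
    'text_col2': ['bread', 'butter', 'jam'],
    'text_col3': ['bread', 'jam', 'jam'],
})

cols = ['text_col1', 'text_col2', 'text_col3']

# Collect the three cells of every row into one Python list per row.
wide['text1'] = wide[cols].values.tolist()

print(wide['text1'])
```

From there, the steps below apply unchanged to `wide`.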

List-valued cells are awkward to work with, so let's explode them:

df = df.explode('text1')

>>> df.head()
     text1
0    bread
0    bread
0    bread
1    bread
1   butter

Now you can use groupby to apply a function to each document (by grouping on index level 0).
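
To see what grouping on index level 0 gives you, here is a small sketch that rebuilds the exploded frame (under the name `exploded`, chosen here for illustration) and summarizes each original row by its number of values and number of distinct values:

```python
import pandas as pd

exploded = pd.DataFrame({'text1': [['bread', 'bread', 'bread'],
                                   ['bread', 'butter', 'jam'],
                                   ['bread', 'jam', 'jam'],
                                   ['unknown']]}).explode('text1')

# explode() repeats the original row label, so level 0 identifies the source row.
per_row = exploded.groupby(level=0)['text1'].agg(['size', 'nunique'])

print(per_row)
```

Each group is a Series holding the values of one original list, which is what the function below receives.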

The details of the heuristic are up to you, but here is something to get started:

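def get_values(s):
    # Count how often each value occurs within one original row.
    counts = s.value_counts()

    # `in` checks the index of `counts`, i.e. the distinct values.
    if "unknown" in counts:
        return "unknown"

    # All values are distinct: take the element at position 1.
    if counts.eq(1).all():
        return s.iloc[1]

    # Otherwise some value repeats: return the most frequent one.
    if counts.max() >= 2:
        return counts.idxmax()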

Apply it to each group:

>>> df.groupby(level=0).text1.apply(get_values)
0      bread
1     butter
2        jam
3    unknown
Name: text1, dtype: object
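
The question also asks for blank or null values to be marked as 'unknown', which the function above does not cover. A possible variant, not part of the approach above, works directly on the list cells with `collections.Counter` and skips the explode step. The name `pick_value` and the extra sample rows are made up for illustration, and treating an empty list or a single-element list this way is an assumption about what you want:

```python
from collections import Counter

import pandas as pd


def pick_value(values):
    # Assumption: a missing or empty cell counts as unknown.
    if not isinstance(values, list) or len(values) == 0:
        return "unknown"

    cleaned = []
    for v in values:
        # Null, blank, or an explicit "unknown" marks the whole row as unknown.
        if pd.isna(v) or str(v).strip() in ("", "unknown"):
            return "unknown"
        cleaned.append(str(v).strip())

    # Most frequent value and how often it occurs.
    value, count = Counter(cleaned).most_common(1)[0]
    if count >= 2:
        return value

    # All values are distinct: take position 1, falling back to position 0
    # when the list has a single element (an assumption).
    return cleaned[1] if len(cleaned) > 1 else cleaned[0]


df = pd.DataFrame({'text1': [['bread', 'bread', 'bread'],
                             ['bread', 'butter', 'jam'],
                             ['bread', 'jam', 'jam'],
                             ['unknown'],
                             ['bread', None, 'jam'],
                             ['bread', '', 'jam']]})

result = df['text1'].apply(pick_value)
print(result)
```

Because this runs a Python function per row, it will likely be slower than a vectorized approach on large frames, but the rules stay easy to read and adjust.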