If columns are not null, add the column name to a new column

I have a df:

c1   c2   c3 
 A   None  A
 B   A    None
 C   None None
 None None None

I am trying to add a column that contains the names of the columns 'c1' to 'c3' whose values are not null.

So the resulting df should be:

c1   c2   c3   c4
 A   None  A    c1|c3
 B   A    None  c1|c2
 C   None None  c1
 None None None None

Constructor:

data = {'c1': ['A', 'B', 'C', None],
        'c2': [None, 'A', None, None],
        'c3': ['A', None, None, None]}

df = pd.DataFrame(data)

I think this does what you want:

import pandas as pd
data = {'c1': ['A', 'B', 'C', None],
        'c2': [None, 'A', None, None],
        'c3':['A',None,None,None]}

df = pd.DataFrame(data)

df['c4'] = df.apply(lambda x: [df.columns[i] for i in range(len(df.columns)) if x[df.columns[i]] is not None], axis=1).str.join('|')
df.loc[df['c4'] == '', 'c4'] = None
print(df)

Output:

     c1    c2    c3     c4
0     A  None     A  c1|c3
1     B     A  None  c1|c2
2     C  None  None     c1
3  None  None  None   None

So in your case, use dot:

df['new'] = df.notna().dot(df.columns+'|').str[:-1]
Out[151]: 
0     c1|c3
1     c1|c2
2        c1
3          
dtype: object
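
The dot approach works because multiplying a string by True keeps it and multiplying by False gives an empty string, so the product of the boolean mask with the column labels concatenates the names of the non-null columns. Note that the all-null row yields an empty string rather than None. A minimal sketch, assuming the df from the question; the names `cols`, `mask` and `labels` are my own, and restricting to a fixed column list and converting the empty string to None are additions to match the requested output:

```python
import pandas as pd

df = pd.DataFrame({'c1': ['A', 'B', 'C', None],
                   'c2': [None, 'A', None, None],
                   'c3': ['A', None, None, None]})

cols = ['c1', 'c2', 'c3']   # assumed: the columns to inspect
mask = df[cols].notna()     # boolean frame, True where a value is present

# True * 'c1|' == 'c1|' and False * 'c1|' == '', so dot() concatenates
# the labels of the non-null columns; .str[:-1] drops the trailing '|'
labels = mask.dot(pd.Index(cols) + '|').str[:-1]

df['c4'] = labels
# rows with no non-null value end up as '', so mark them as missing
df.loc[df['c4'] == '', 'c4'] = None
print(df)
```

Selecting `df[cols]` first also keeps the result stable if df already has extra columns, such as a 'c4' added by an earlier attempt.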

Not sure if this is the most elegant way, but it will work:

import re

def new_col(row):
    # row is one row of df; collect the names of the columns that are not None
    col_value = ''
    if row['c1'] is not None:
        col_value = df.columns[0]
    if row['c2'] is not None:
        col_value = col_value + '|' + df.columns[1]
    if row['c3'] is not None:
        col_value = col_value + '|' + df.columns[2]
    # strip the leading '|' left behind when c1 is None
    col_value = re.sub(r'^\|', '', col_value)
    return col_value

df['c4'] = df.apply(new_col, axis=1)
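
The same row-wise idea can be written without hard-coding each column or cleaning up with a regex. This is only a sketch, assuming the df from the question; the helper name `non_null_columns` is hypothetical, and it returns None for an all-null row to match the requested output:

```python
import pandas as pd

df = pd.DataFrame({'c1': ['A', 'B', 'C', None],
                   'c2': [None, 'A', None, None],
                   'c3': ['A', None, None, None]})

def non_null_columns(row):
    # hypothetical helper: join the labels of the non-null entries in this row
    names = [col for col, value in row.items() if pd.notna(value)]
    return '|'.join(names) if names else None

# apply only to the columns of interest so a pre-existing 'c4' is ignored
df['c4'] = df[['c1', 'c2', 'c3']].apply(non_null_columns, axis=1)
print(df)
```

Joining a list avoids the leading '|' problem entirely, so no regex cleanup is needed.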