If columns are not null, add the column names to a new column
I have a df:
c1 c2 c3
A None A
B A None
C None None
None None None
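I am trying to add a new column that lists, for each row, the names of the columns 'c1' to 'c3' whose value is not null.
So the resulting df should be: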
c1 c2 c3 c4
A None A c1|c3
B A None c1|c2
C None None c1
None None None None
Constructor:
import pandas as pd
data = {'c1': ['A', 'B', 'C', None],
        'c2': [None, 'A', None, None],
        'c3': ['A', None, None, None]}
df = pd.DataFrame(data)
I think this does what you want:
import pandas as pd
data = {'c1': ['A', 'B', 'C', None],
'c2': [None, 'A', None, None],
'c3':['A',None,None,None]}
df = pd.DataFrame(data)
df['c4'] = df.apply(lambda x: [df.columns[i] for i in range(len(df.columns)) if x[df.columns[i]] is not None], axis=1).str.join('|')
df.loc[df['c4'] == '', 'c4'] = None
print(df)
Output:
c1 c2 c3 c4
0 A None A c1|c3
1 B A None c1|c2
2 C None None c1
3 None None None None
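The same row-wise idea can be written without positional indexing. The sketch below is a variant, not the answer's own code: it assumes the `data` from the question (with the `'c3'` key spelled without the stray colon) and uses `dropna()`, so `NaN` counts as missing too, whereas `is not None` only catches a literal `None`.

```python
import pandas as pd

data = {'c1': ['A', 'B', 'C', None],
        'c2': [None, 'A', None, None],
        'c3': ['A', None, None, None]}
df = pd.DataFrame(data)

# Only look at the original columns, so re-running does not pick up 'c4' itself
cols = ['c1', 'c2', 'c3']

# For each row, keep the labels of the non-missing cells and join them with '|';
# an empty join ('' is falsy) is turned into None
df['c4'] = df[cols].apply(lambda row: '|'.join(row.dropna().index) or None, axis=1)
print(df)
```

Restricting the call to `cols` keeps the result stable if the frame later gains unrelated columns.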
So in your case, use dot:
df['new'] = df.notna().dot(df.columns+'|').str[:-1]
Out[151]:
0 c1|c3
1 c1|c2
2 c1
3
dtype: object
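Why this works: `df.notna()` is a boolean frame, and in Python `True * 'c1|'` is `'c1|'` while `False * 'c1|'` is `''`, so the matrix product concatenates the names of the non-null columns for each row. Note that the last row comes out as an empty string rather than `None`. A minimal sketch that also handles that case, assuming the `data` from the question with the `'c3'` key corrected (the names `mask`, `cols` and `names` are only illustrative):

```python
import pandas as pd

data = {'c1': ['A', 'B', 'C', None],
        'c2': [None, 'A', None, None],
        'c3': ['A', None, None, None]}
df = pd.DataFrame(data)

cols = ['c1', 'c2', 'c3']
mask = df[cols].notna()                        # True where a value is present

# Boolean-by-string "matrix product": each row sums name+'|' for its True cells
names = mask.dot(mask.columns + '|').str[:-1]  # drop the trailing '|'

# Rows with no non-null column yield ''; store None for them instead
df['c4'] = [s if s else None for s in names]
print(df)
```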
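Not sure this is the most elegant way, but it works:
import re

def new_col(row):
    # Build a '|'-separated string of the columns that are not None in this row
    col_value = ''
    if row['c1'] is not None:
        col_value = row.index[0]
    if row['c2'] is not None:
        col_value = col_value + '|' + row.index[1]
    if row['c3'] is not None:
        col_value = col_value + '|' + row.index[2]
    # Strip the leading '|' that is left when c1 is None
    col_value = re.sub(r'^\|', '', col_value)
    return col_value

df['c4'] = df.apply(new_col, axis=1)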