How can I return the value of a column in a new column based on conditions with Python

Question

I have a dataframe with three columns

 a b c
[1,0,2] 
[0,3,2] 
[0,0,2]

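and I need to create a fourth column based on a hierarchy, as follows:

If column a has a value, then column d = column a

If column a has no value but column b does, then column d = column b

If columns a and b have no value but column c does, then column d = column c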

 a b c d
[1,0,2,1] 
[0,3,2,3] 
[0,0,2,2]

I am new to Python and have no idea where to start.

Edit: I tried the following approaches, but none of them returns a value in column d when column a is empty or None

df['d'] = df['a']
df.loc[df['a'] == 0, 'd'] = df['b']
df.loc[~df['a'].astype('bool') &  ~df['b'].astype('bool'), 'd'] = df['c']

df['d'] = df['a']
df.loc[df['a'] == None, 'd'] = df['b']
df.loc[~df['a'].astype('bool') &  ~df['b'].astype('bool'), 'd'] = df['c']

df['d'] = np.where(df.a != 0, df.a,
                   np.where(df.b != 0, df.b, df.c))

Answer 1

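Try this (df is your dataframe)

# Combine the masks element-wise with & and notna(); Python's "and" / "is not None" raise an error on a Series
df['d'] = np.where((df.a != 0) & df.a.notna(), df.a, np.where((df.b != 0) & df.b.notna(), df.b, df.c))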

>>> print(df)
   a  b  c  d
0  1  0  2  1
1  0  3  2  3
2  0  0  2  2
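
If the columns can hold missing values (None/NaN) as well as zeros, as the edit to the question describes, one option is to build an explicit "has a value" mask per column and pass the masks to np.select. This is only a sketch: the sample data and the names has_a and has_b are made up for illustration, and it assumes that "no value" means either zero or missing. Note that on a float column, df['a'] == None is False for every row, which is why the masks use notna() instead.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: 'a' and 'b' contain both zeros and missing values.
df = pd.DataFrame({"a": [1, np.nan, 0], "b": [0, 3, np.nan], "c": [2, 2, 2]})

# A cell "has a value" when it is neither missing nor zero (an assumption).
has_a = df["a"].notna() & (df["a"] != 0)
has_b = df["b"].notna() & (df["b"] != 0)

# np.select takes the first condition that is True in each row;
# rows where neither condition holds fall back to column 'c'.
df["d"] = np.select([has_a, has_b], [df["a"], df["b"]], default=df["c"])
print(df)
```

Because 'a' and 'b' hold NaN, column d comes out as float here; append .astype(int) if integers are needed and no row is left without a value.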

Answer 2

import numpy as np
import pandas as pd

df = pd.DataFrame([[1,0,2], [0,3,2], [0,0,2]], columns = ('a','b','c'))
print(df)

df['d'] = df['a']
df.loc[df['a'] == 0, 'd'] = df['b']
df.loc[~df['a'].astype('bool') &  ~df['b'].astype('bool'), 'd'] = df['c']
print(df)

Answer 3

A simple one-liner would be,

df['d'] = df.replace(0, np.nan).bfill(axis=1)['a'].astype(int)

Step-by-step visualization

Convert the zero values to NaN

     a    b  c
0  1.0  NaN  2
1  NaN  3.0  2
2  NaN  NaN  2

Now backfill the values along the rows

     a    b    c
0  1.0  2.0  2.0
1  3.0  3.0  2.0
2  2.0  2.0  2.0

Now select the required column, i.e. 'a', and create a new column 'd'

Output

   a  b  c  d
0  1  0  2  1
1  0  3  2  3
2  0  0  2  2
