Creating new column based on NaNs in multi-index data frame (Python)

Question

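I have the following multi-index data frame, and I want to create a new column that tells me whether a company still exists. id and Year are part of the multi-index.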

 id  Year  Profit/Loss Total Sales  
 0   2008  300.        2000.        
 0   2009  400.        2000.       
 0   2010  500.        2000.       
 0   2011  NaN         NaN       
 0   2012  NaN         NaN   
 1   2008  300.        2000.       
 1   2009  300.        2000.

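I need a function that checks whether both columns (Total Sales and Profit/Loss) are NaN for a given year and, if so, returns 0 in the Solvency column. If either or both of them have a value, it should return 1.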

Desired output:

 id  Year  Profit/Loss Total Sales  Solvency
 0   2008  300.        2000.        1
 0   2009  400.        2000.        1
 0   2010  500.        2000.        1
 0   2011  NaN         NaN          0
 0   2012  NaN         NaN          0
 1   2008  300.        2000.        1
 1   2009  300.        2000.        1

Answer 1

You can use notna to flag the non-NaN cells, use any along axis=1 to get a boolean per row that is True if at least one of the two columns has a value, and convert it to an integer with astype:

df['Solvency'] = df[['Profit/Loss', 'Total Sales']].notna().any(axis=1).astype(int)

Output:

   id  Year  Profit/Loss  Total Sales  Solvency
0   0  2008        300.0       2000.0         1
1   0  2009        400.0       2000.0         1
2   0  2010        500.0       2000.0         1
3   0  2011          NaN          NaN         0
4   0  2012          NaN          NaN         0
5   1  2008        300.0       2000.0         1
6   1  2009        300.0       2000.0         1
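
For reference, here is a minimal self-contained sketch of the same idea applied to a frame where id and Year really are index levels, as described in the question. The way the sample frame is built (pd.MultiIndex.from_tuples) and the names cols and alt are assumptions made for illustration; the row-wise check itself does not depend on the index, so it should behave the same whether id and Year are index levels or ordinary columns.

```python
import numpy as np
import pandas as pd

# Rebuild the sample data from the question with (id, Year) as a MultiIndex.
# This construction is an assumption; the original frame was not shown as code.
index = pd.MultiIndex.from_tuples(
    [(0, 2008), (0, 2009), (0, 2010), (0, 2011), (0, 2012), (1, 2008), (1, 2009)],
    names=["id", "Year"],
)
df = pd.DataFrame(
    {
        "Profit/Loss": [300.0, 400.0, 500.0, np.nan, np.nan, 300.0, 300.0],
        "Total Sales": [2000.0, 2000.0, 2000.0, np.nan, np.nan, 2000.0, 2000.0],
    },
    index=index,
)

cols = ["Profit/Loss", "Total Sales"]

# 1 if at least one of the two columns has a value in that row, else 0.
df["Solvency"] = df[cols].notna().any(axis=1).astype(int)

# Equivalent formulation: a row gets 0 only when every column is NaN.
alt = (~df[cols].isna().all(axis=1)).astype(int)

print(df)
```

Passing axis=1 as a keyword is used here because recent pandas versions no longer accept it positionally in any.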

python

multi-index

pandas