Creating new column based on NaNs in multi-index data frame (Python)
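I have the following multi-index data frame, and I want to create a new column that tells me whether a company is still in existence. id and Year are part of the multi-index.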
id Year Profit/Loss Total Sales
0 2008 300. 2000.
0 2009 400. 2000.
0 2010 500. 2000.
0 2011 NaN NaN
0 2012 NaN NaN
1 2008 300. 2000.
1 2009 300. 2000.
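I need a function that checks whether both columns (Total Sales and Profit/Loss) are NaN for a given year and, if they are, returns 0 in the Solvency column. If one or both columns have a value, it should return 1.
Expected output: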
id Year Profit/Loss Total Sales Solvency
0 2008 300. 2000. 1
0 2009 400. 2000. 1
0 2010 500. 2000. 1
0 2011 NaN NaN 0
0 2012 NaN NaN 0
1 2008 300. 2000. 1
1 2009 300. 2000. 1
You can use notna to identify the non-NaN values, aggregate each row to a boolean with any (True if at least one value in the row is non-NaN), and convert the result to integer with astype:
df['Solvency'] = df[['Profit/Loss', 'Total Sales']].notna().any(axis=1).astype(int)
Output:
id Year Profit/Loss Total Sales Solvency
0 0 2008 300.0 2000.0 1
1 0 2009 400.0 2000.0 1
2 0 2010 500.0 2000.0 1
3 0 2011 NaN NaN 0
4 0 2012 NaN NaN 0
5 1 2008 300.0 2000.0 1
6 1 2009 300.0 2000.0 1
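For reference, here is a minimal self-contained sketch of the same approach. It assumes the frame is built by hand with id and Year as a real MultiIndex (the construction code is not part of the original question, only the values are), and it passes axis=1 by keyword, since recent pandas versions no longer accept the axis positionally in any.

```python
import numpy as np
import pandas as pd

# Rebuild the sample data from the question; the construction itself is an
# assumption, only the values come from the post.
index = pd.MultiIndex.from_tuples(
    [(0, 2008), (0, 2009), (0, 2010), (0, 2011), (0, 2012), (1, 2008), (1, 2009)],
    names=["id", "Year"],
)
df = pd.DataFrame(
    {
        "Profit/Loss": [300.0, 400.0, 500.0, np.nan, np.nan, 300.0, 300.0],
        "Total Sales": [2000.0, 2000.0, 2000.0, np.nan, np.nan, 2000.0, 2000.0],
    },
    index=index,
)

# notna() marks non-NaN cells, any(axis=1) is True when at least one of the
# two columns has a value in that row, astype(int) turns True/False into 1/0.
df["Solvency"] = df[["Profit/Loss", "Total Sales"]].notna().any(axis=1).astype(int)

print(df)
```

Because the expression works row by row on the two value columns, it behaves the same whether id and Year are index levels or ordinary columns.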