Combining multiple columns into one
I have the following DataFrame:
Column1 Column2 Column3 Column4 Column5
0 value1 x1 y1 na na
1 value2 x2 y2 na na
2 value3 x3 na z1 na
3 value4 x4 na z2 na
4 value5 x5 na na w1
I want the following:
Column1 Column2 Column
0 value1 x1 y1
1 value2 x2 y2
2 value3 x3 z1
3 value4 x4 z2
4 value5 x5 w1
How can I achieve this? stack() does not seem to fit this task.
Any help would be greatly appreciated.
Set the initial columns as the index, then backfill along axis 1 and select the first column:
cols = ['Column1','Column2']
out = df.mask(df.eq('na')).set_index(cols).bfill(axis=1).iloc[:, 0].reset_index()
print(out)
Column1 Column2 Column3
0 value1 x1 y1
1 value2 x2 y2
2 value3 x3 z1
3 value4 x4 z2
4 value5 x5 w1
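For anyone who wants to run the answer above end to end, here is a minimal self-contained sketch. It assumes the missing entries are the literal string 'na', as the `df.eq('na')` call suggests; the DataFrame construction is my own reconstruction of the question's data, not something the asker posted.

```python
import pandas as pd

# Reconstruction of the question's data, assuming 'na' is a literal string.
df = pd.DataFrame({
    "Column1": ["value1", "value2", "value3", "value4", "value5"],
    "Column2": ["x1", "x2", "x3", "x4", "x5"],
    "Column3": ["y1", "y2", "na", "na", "na"],
    "Column4": ["na", "na", "z1", "z2", "na"],
    "Column5": ["na", "na", "na", "na", "w1"],
})

cols = ["Column1", "Column2"]

out = (
    df.mask(df.eq("na"))   # turn the 'na' strings into real missing values
      .set_index(cols)     # keep the key columns out of the fill
      .bfill(axis=1)       # pull each row's first valid value to the left
      .iloc[:, 0]          # the leftmost column now holds that value
      .reset_index()       # bring Column1 and Column2 back as columns
)
print(out)
```

The result keeps the name of the leftmost filled column, which is why the output header reads Column3.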
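import pandas as pd

# Assumes the missing entries are real NaN; if they are the string 'na',
# run df = df.mask(df.eq('na')) first, since dropna() ignores strings.
cols = ["Column3", "Column4", "Column5"]
# Each row has exactly one non-missing value, so the pieces have disjoint indexes.
new_column = pd.concat([df[col].dropna() for col in cols])
df = df.drop(columns=cols)
df["Column3"] = new_column  # assignment aligns on the index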
>>> df
Column1 Column2 Column3
0 value1 x1 y1
1 value2 x2 y2
2 value3 x3 z1
3 value4 x4 z2
4 value5 x5 w1
One option is to use coalesce from pyjanitor to abstract the process (under the hood, it is just bfill/ffill):
# pip install pyjanitor
import pandas as pd
import janitor
df.coalesce('Column3', 'Column4', 'Column5').dropna(axis=1)
Column1 Column2 Column3
0 value1 x1 y1
1 value2 x2 y2
2 value3 x3 z1
3 value4 x4 z2
4 value5 x5 w1
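If you would rather not add a dependency, the same idea can be sketched in plain pandas. This is only an illustration of the bfill approach mentioned above, not pyjanitor's actual implementation; `coalesce_columns` is a hypothetical helper name, and the sample data assumes the missing entries are real NaN.

```python
import numpy as np
import pandas as pd


def coalesce_columns(df, columns, target):
    """Hypothetical helper: first non-missing value per row across `columns`."""
    # Backfill across the selected columns, then keep the leftmost one.
    first_valid = df[columns].bfill(axis=1).iloc[:, 0]
    # Drop the source columns and attach the combined column under `target`.
    return df.drop(columns=columns).assign(**{target: first_valid})


# Reconstruction of the question's data, assuming real NaN for missing entries.
df = pd.DataFrame({
    "Column1": ["value1", "value2", "value3", "value4", "value5"],
    "Column2": ["x1", "x2", "x3", "x4", "x5"],
    "Column3": ["y1", "y2", np.nan, np.nan, np.nan],
    "Column4": [np.nan, np.nan, "z1", "z2", np.nan],
    "Column5": [np.nan, np.nan, np.nan, np.nan, "w1"],
})

result = coalesce_columns(df, ["Column3", "Column4", "Column5"], "Column3")
print(result)
```

Unlike the dropna(axis=1) step above, this drops the source columns by name, so an unrelated column that happens to contain NaN is left alone.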