Reshape DataFrame Pandas - some variables to long, others to wide
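I need to reshape a DataFrame so that some variables (Diag1, Diag2, Diag3) go long while another (Period) goes wide. Basically, they need to swap places.
I have recreated the original DataFrame in the example below. As the example shows, I have tried pivot and melt separately, to no avail.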
import pandas as pd

df = pd.DataFrame({
'ID':[1,1,2,2,3,3,4,4,5,5,6,6,7,7,8,8,9,9,10,10,11,11,],
'Period':['0 Month','3 Month','0 Month','3 Month','0 Month',
'3 Month','0 Month','3 Month','0 Month','3 Month','0 Month',
'3 Month','0 Month','3 Month','0 Month','3 Month','0 Month',
'3 Month','0 Month','3 Month','0 Month','3 Month',],
'Diag1':[0,1,1,0,0,0,0,1,0,0,1,0,0,0,0,0,0,0,1,0,0,0,],
'Diag2':[0,0,1,0,0,0,0,1,0,0,1,0,0,0,0,0,0,0,1,0,0,0,],
'Diag3':[0,0,1,0,0,0,1,1,0,0,1,0,0,0,1,0,0,0,1,0,1,0,]
})
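# Attempt 1: pivot alone widens Period, but Diag1..Diag3 stay as a column level
dfp = df.pivot(index=["ID",], columns='Period',).reset_index()
print(dfp)
# Attempt 2: melt alone stacks only the Period labels and drops the Diag values
dfm = df.melt(id_vars=["ID",],value_vars=['Period'])
print(dfm)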
The desired result is:
ID Diagnosis 0_Month 3_Month
1 Diag1 0 1
1 Diag2 0 0
1 Diag3 0 0
2 Diag1 0 1
2 Diag2 1 0
2 Diag3 1 0
3 Diag1
3 Diag2
3 Diag3 etc...
I suspect I need some combination of the two, but I'm struggling to find any examples. My brain has started to melt as a result...
You can melt, then pivot:
# pivot takes keyword-only arguments in pandas 2.0+, so name them explicitly
out = (df.melt(id_vars=['ID', 'Period'], var_name='Diagnosis')
         .pivot(index=['ID', 'Diagnosis'], columns='Period', values='value')
         .reset_index().rename_axis(columns=[None]))
Output:
ID Diagnosis 0 Month 3 Month
0 1 Diag1 0 1
1 1 Diag2 0 0
2 1 Diag3 0 0
3 2 Diag1 1 0
4 2 Diag2 1 0
.. .. ... ... ...
28 10 Diag2 1 0
29 10 Diag3 1 0
30 11 Diag1 0 0
31 11 Diag2 0 0
32 11 Diag3 1 0
[33 rows x 4 columns]
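To see why the combination works where each call alone did not, it may help to split the chain into its two steps and look at the intermediate frame. The following is a minimal sketch on a cut-down frame with the same layout as the question's df; the names long_df and out, and the four-row sample, are illustrative assumptions rather than part of the original answer.

```python
import pandas as pd

# Cut-down sample with the same columns as the question's df (illustrative values).
df = pd.DataFrame({
    'ID': [1, 1, 2, 2],
    'Period': ['0 Month', '3 Month', '0 Month', '3 Month'],
    'Diag1': [0, 1, 1, 0],
    'Diag2': [0, 0, 1, 0],
    'Diag3': [0, 0, 1, 0],
})

# Step 1: melt goes long on the Diag columns. ID and Period are kept as
# identifiers, so each row is one (ID, Period, Diagnosis) observation.
long_df = df.melt(id_vars=['ID', 'Period'], var_name='Diagnosis')
print(long_df.columns.tolist())

# Step 2: pivot goes wide on Period, leaving one row per (ID, Diagnosis).
out = (long_df.pivot(index=['ID', 'Diagnosis'], columns='Period', values='value')
              .reset_index()
              .rename_axis(columns=None))
print(out)
```

The key difference from the attempts in the question is that Period is passed to melt as an identifier (id_vars) rather than as a value column, so it survives the first step and is available to pivot on in the second.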