How do I transform df using pandas
I have the following dataframe snippet:
predictor b z pvalue model ss
age_raw 1.026695 4.678675 2.89E-06 composite_outcome_nonenglish 1
elixsum 1.228125 8.514571 1.67E-17 composite_outcome_nonenglish 1
age_raw 1.087716 0.228507 0.819252 composite_outcome_english 0
elixsum 1.760882 1.68492 0.092004 composite_outcome_english 0
I need to reshape it so that model is the top column level, (b, z, pvalue, ss) is the second column level, and predictor is the row index:
model model
composite_outcome_nonenglish composite_outcome_english
b z pvalue ss b z pvalue ss
age_raw 1.026695 4.678675 2.89E-06 0 1.087716 0.228507 0.819252 1
elixsum 1.228125 8.514571 1.67E-17 0 1.760882 1.68492 0.092004 1
I've tried all sorts of grouping, stacking, and unstacking, but I just can't get it right.
You want pivot and swaplevel:
(df.pivot(index='predictor', columns='model')
.swaplevel(0,1, axis=1)
.sort_index(axis=1)
)
Output:
model composite_outcome_english composite_outcome_nonenglish
b pvalue ss z b pvalue ss z
predictor
age_raw 1.087716 0.819252 0 0.228507 1.026695 2.890000e-06 1 4.678675
elixsum 1.760882 0.092004 0 1.684920 1.228125 1.670000e-17 1 8.514571
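Note that sort_index(axis=1) orders both column levels alphabetically, which is why the output above shows b, pvalue, ss, z rather than b, z, pvalue, ss. If the original order matters, one option is to reindex the columns instead of sorting them. Here is a minimal, self-contained sketch; the frame is rebuilt from the snippet in the question, and the names wide, models, stats, and ordered are just illustrative:

```python
import pandas as pd

# Rebuild the frame from the question's snippet
df = pd.DataFrame({
    "predictor": ["age_raw", "elixsum", "age_raw", "elixsum"],
    "b": [1.026695, 1.228125, 1.087716, 1.760882],
    "z": [4.678675, 8.514571, 0.228507, 1.68492],
    "pvalue": [2.89e-06, 1.67e-17, 0.819252, 0.092004],
    "model": ["composite_outcome_nonenglish"] * 2
             + ["composite_outcome_english"] * 2,
    "ss": [1, 1, 0, 0],
})

# Same pivot + swaplevel as above, without the alphabetical sort
wide = df.pivot(index="predictor", columns="model").swaplevel(0, 1, axis=1)

# Build the desired column order: models in order of appearance,
# statistics in the order they had in the original frame
models = df["model"].unique()
stats = ["b", "z", "pvalue", "ss"]
ordered = pd.MultiIndex.from_product([models, stats], names=["model", None])

wide = wide.reindex(columns=ordered)
print(wide)
```

Because every (model, statistic) pair already exists after the pivot, the reindex only reorders columns and should not introduce any NaN.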
Use set_index, then stack + unstack:
out = df.set_index(['predictor','model']).stack().unstack(level=[1,2])
Out[366]:
model composite_outcome_nonenglish ... composite_outcome_english
b z ... pvalue ss
predictor ...
age_raw 1.026695 4.678675 ... 0.819252 0.0
elixsum 1.228125 8.514571 ... 0.092004 0.0
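One thing to be aware of with this approach: stack() puts b, z, pvalue, and ss into a single column, so the integer ss values get upcast to float, which is why ss shows as 0.0 above. If that matters, the ss columns can be cast back afterwards. A rough, self-contained sketch (the frame is rebuilt from the question's snippet; float_dtype is only there to show the intermediate dtype):

```python
import pandas as pd

# Rebuild the frame from the question's snippet
df = pd.DataFrame({
    "predictor": ["age_raw", "elixsum", "age_raw", "elixsum"],
    "b": [1.026695, 1.228125, 1.087716, 1.760882],
    "z": [4.678675, 8.514571, 0.228507, 1.68492],
    "pvalue": [2.89e-06, 1.67e-17, 0.819252, 0.092004],
    "model": ["composite_outcome_nonenglish"] * 2
             + ["composite_outcome_english"] * 2,
    "ss": [1, 1, 0, 0],
})

# Index levels after stack(): predictor (0), model (1), statistic (2);
# unstacking levels 1 and 2 moves model and statistic into the columns
out = df.set_index(["predictor", "model"]).stack().unstack(level=[1, 2])

# ss was stored alongside the float columns, so it is float here
float_dtype = out[("composite_outcome_english", "ss")].dtype

# Cast every (model, "ss") column back to integers
for col in list(out.columns):
    if col[1] == "ss":
        out[col] = out[col].astype("int64")

print(out)
```

The pivot answer avoids this because each statistic stays in its own column throughout.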