Pivot df with duplicates as new rows

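Evening, I have a dataframe that I want to reshape. Some columns have duplicate id variables, and I'd like the duplicated values to appear as new rows.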

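My data looks like the below. I want the ids as rows, the groups as columns, and the choices as values. If more than one choice was selected for an id within a group, the row should be repeated, as in the second frame below. When I use pivot, I only end up with the mean or sum of the combined values, e.g. 11.5 for id i1, group1. All tips are very welcome, thank you.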

import pandas as pd
import numpy as np

df = pd.DataFrame({'id': ['i1','i1','i1','i2','i2','i2','i2','i2','i3','i3'],
    'group': ['group1','group1','group2','group3','group1','group2','group2','group3','group1','group2'],
    'choice':[12,11,12,14,11,19,9,7,8,9]})

# desired output
pd.DataFrame({'id': ['i1','i1','i2','i2','i3'],
              'group1': [12,11,11,np.nan,8],
              'group2': [12,np.nan,19,9,9],
              'group3':[np.nan,np.nan,14,7,np.nan]})
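
For reference, the averaging described above can be reproduced with a short sketch. It assumes the aggregation came from pivot_table, since plain pivot raises an error on duplicate (id, group) pairs rather than combining them. The names averaged and pivot_failed are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'id': ['i1','i1','i1','i2','i2','i2','i2','i2','i3','i3'],
    'group': ['group1','group1','group2','group3','group1','group2','group2','group3','group1','group2'],
    'choice':[12,11,12,14,11,19,9,7,8,9]})

# pivot_table aggregates duplicate (id, group) pairs; here with the mean
averaged = df.pivot_table(index='id', columns='group', values='choice', aggfunc='mean')
print(averaged.loc['i1', 'group1'])  # 11.5, the mean of 12 and 11

# plain pivot refuses duplicate (id, group) pairs instead of aggregating them
try:
    df.pivot(index='id', columns='group', values='choice')
    pivot_failed = False
except ValueError:
    pivot_failed = True
print(pivot_failed)  # True
```

Either way, the two choices for i1 in group1 cannot both survive unless something else distinguishes them.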

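Use GroupBy.cumcount with Series.unstack and DataFrame.droplevel. The counter numbers the repeated (id, group) pairs, so adding it to the index makes every entry unique and unstack can reshape without aggregating: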

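# number the repeats within each (id, group) pair: 1, 2, ...
g = df.groupby(['id','group']).cumcount().add(1)

df = (df.set_index(['id','group', g])['choice']
        .unstack(level=1)            # move group into the columns
        .droplevel(level=1)          # drop the helper counter from the index
        .rename_axis(None,axis=1)    # remove the 'group' name from the columns
        .reset_index())              # turn id back into a column
print(df)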
   id  group1  group2  group3
0  i1    12.0    12.0     NaN
1  i1    11.0     NaN     NaN
2  i2    11.0    19.0    14.0
3  i2     NaN     9.0     7.0
4  i3     8.0     9.0     NaN
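
To see what the counter contributes, here is a small sketch that prints it and then builds the same table with DataFrame.pivot instead of set_index/unstack. This is an alternative spelling, not part of the answer above. It assumes pandas 1.1 or later (where pivot accepts a list for index), and the helper column name n is made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({'id': ['i1','i1','i1','i2','i2','i2','i2','i2','i3','i3'],
    'group': ['group1','group1','group2','group3','group1','group2','group2','group3','group1','group2'],
    'choice':[12,11,12,14,11,19,9,7,8,9]})

# occurrence number of each row within its (id, group) pair
n = df.groupby(['id', 'group']).cumcount().add(1)
print(n.tolist())  # [1, 2, 1, 1, 1, 1, 2, 2, 1, 1]

# 'n' is a hypothetical helper column; (id, n) is unique per group, so pivot works
result = (df.assign(n=n)
            .pivot(index=['id', 'n'], columns='group', values='choice')
            .droplevel('n')
            .rename_axis(None, axis=1)
            .reset_index())
print(result)
```

The second i1 row exists only because group1 has an occurrence numbered 2 for i1; every group without a second choice gets NaN on that row.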