Pandas dataframe pivot table and grouping

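I have a DataFrame that I turned into a pivot table, but now I want to sort the pivot table so that rows sharing a common value in a particular column line up with each other. For example, I want to order the DataFrame so that each country appears on the same row for every date: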

import pandas as pd

data = {'dt': ['2016-08-22', '2016-08-21', '2016-08-22', '2016-08-21', '2016-08-21'],
        'country': ['uk', 'usa', 'fr', 'fr', 'uk'],
        'number': [10, 21, 20, 10, 12]
        }

df = pd.DataFrame(data)
print(df)

  country          dt  number
0      uk  2016-08-22      10
1     usa  2016-08-21      21
2      fr  2016-08-22      20
3      fr  2016-08-21      10
4      uk  2016-08-21      12


#pivot table by dt:

df['idx'] = df.groupby('dt')['dt'].cumcount()
df_pivot = df.set_index(['idx','dt']).stack().unstack([1,2])
print(df_pivot)
dt       2016-08-22        2016-08-21       
       country number    country number
idx                                    
0           uk     10        usa     21
1           fr     20         fr     10
2          NaN    NaN         uk     12


#what I really want:

        dt    2016-08-22   2016-08-21       
       country number    country number

0           uk     10         uk     12
1           fr     20         fr     10
2          NaN    NaN        usa     21

Or even better:

               2016-08-22   2016-08-21       
       country  number       number

0           uk     10         12
1           fr     20         10
2          usa    NaN         21

Here the uk values from 2016-08-22 and 2016-08-21 are aligned on the same row.

You can use:

df_pivot = df.set_index(['dt','country']).stack().unstack([0,2]).reset_index()
print (df_pivot)
dt country 2016-08-22 2016-08-21
               number     number
0       fr       20.0       10.0
1       uk       10.0       12.0
2      usa        NaN       21.0  

# move the 'country' label from the first to the second level of the column MultiIndex
cols = list(df_pivot.columns)
df_pivot.columns = pd.MultiIndex.from_tuples([('','country')] + cols[1:])
print (df_pivot)
          2016-08-22 2016-08-21
  country     number     number
0      fr       20.0       10.0
1      uk       10.0       12.0
2     usa        NaN       21.0
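To see why this chain lines the countries up, it may help to split it into its intermediate steps. The following is only a sketch on the question's sample data; the names `stacked` and `wide` are introduced here for illustration and are not part of the answer above.

```python
import pandas as pd

data = {
    'dt': ['2016-08-22', '2016-08-21', '2016-08-22', '2016-08-21', '2016-08-21'],
    'country': ['uk', 'usa', 'fr', 'fr', 'uk'],
    'number': [10, 21, 20, 10, 12],
}
df = pd.DataFrame(data)

# Step 1: index by (dt, country), then stack the remaining column ('number').
# The result is a Series whose index has three levels:
# dt, country, and the original column name.
stacked = df.set_index(['dt', 'country']).stack()

# Step 2: move levels 0 (dt) and 2 (column name) into the columns.
# Only 'country' is left in the row index, so each country gets one row.
wide = stacked.unstack([0, 2])

print(stacked)
print(wide)
```

Because `country` is the only level left in the row index after `unstack`, every country ends up on a single row regardless of the date, and missing combinations such as usa on 2016-08-22 show up as NaN.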

Another, simpler solution is pivot:

df_pivot = df.pivot(index='country', columns='dt', values='number')
print (df_pivot)
dt       2016-08-21  2016-08-22
country                        
fr             10.0        20.0
uk             12.0        10.0
usa            21.0         NaN
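If the flat layout from the question is the goal (country as an ordinary column, newest date first), the pivot result can probably be reshaped with a couple more steps. This is a minimal sketch, assuming the dates are ISO-formatted strings so that sorting the column labels in descending order puts the newest date first; `wide` and `flat` are illustrative names.

```python
import pandas as pd

data = {
    'dt': ['2016-08-22', '2016-08-21', '2016-08-22', '2016-08-21', '2016-08-21'],
    'country': ['uk', 'usa', 'fr', 'fr', 'uk'],
    'number': [10, 21, 20, 10, 12],
}
df = pd.DataFrame(data)

# One row per country, one column per date.
wide = df.pivot(index='country', columns='dt', values='number')

# Sort the date columns in descending order (newest first).
wide = wide.sort_index(axis=1, ascending=False)

# Turn 'country' back into a regular column and drop the 'dt' label
# that pivot leaves as the name of the columns axis.
flat = wide.reset_index().rename_axis(None, axis=1)

print(flat)
```

Note that `pivot` raises a ValueError when the same (country, dt) pair occurs more than once; in that case `pivot_table` with an explicit `aggfunc` is the usual alternative.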