Reshape stacked Pandas DataFrame
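I have the following DataFrame df1:
import numpy as np
import pandas as pd

# columns must be a list: a set has no guaranteed order, and recent pandas versions reject it
df1 = pd.DataFrame(np.random.rand(4, 2), columns=["var1", "var2"])
df1["inst"] = ["A", "A", "B", "B"]
df1.set_index("inst", inplace=True)
df1 = df1.stack()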
ipdb> df1
inst
A     var1    0.191094
      var2    0.100821
      var1    0.251331
      var2    0.528787
B     var1    0.806549
      var2    0.638217
      var1    0.233541
      var2    0.905737
I want to reshape df1 so that it looks like this:
ipdb> df1
             A         B
var1  0.191094  0.806549
var2  0.100821  0.638217
var1  0.251331  0.233541
var2  0.528787  0.905737
I tried taking the values of df1 and reshaping them with the reshape function, but that did not give the layout I want:
ipdb> df1.values
array([ 0.19109431,  0.10082081,  0.25133097,  0.52878702,
        0.80654863,  0.63821703,  0.23354052,  0.90573699])
ipdb> df1.values.reshape(4,2)
array([[ 0.19109431,  0.10082081],
       [ 0.25133097,  0.52878702],
       [ 0.80654863,  0.63821703],
       [ 0.23354052,  0.90573699]])
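Use cumcount + set_index, then reshape with unstack and stack. Note that this starts from df1 as first constructed, with inst still a regular column, before the set_index and stack calls in the question:
# number the rows within each inst group: 0, 1, 0, 1
g = df1.groupby('inst').cumcount()
# move inst into the columns, then move var1/var2 back into the index
df1 = df1.set_index(["inst", g]).unstack(0).stack(0).reset_index(level=0, drop=True)
print(df1)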
inst         A         B
var1  0.932293  0.214795
var2  0.503961  0.904046
var1  0.943864  0.232308
var2  0.398277  0.379333
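For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same cumcount approach. It assumes the frame still has inst as a regular column, and it uses the fixed numbers from the question instead of random data so the result can be compared with the desired output. The names wide and result are only illustrative.

```python
import pandas as pd

# Fixed values copied from the question; inst is kept as a regular column
# (assumption: this is the frame before set_index/stack are applied).
wide = pd.DataFrame(
    {
        "var1": [0.191094, 0.251331, 0.806549, 0.233541],
        "var2": [0.100821, 0.528787, 0.638217, 0.905737],
        "inst": ["A", "A", "B", "B"],
    }
)

# Number the rows within each inst group: 0, 1, 0, 1
g = wide.groupby("inst").cumcount()

result = (
    wide.set_index(["inst", g])       # index becomes (inst, counter)
    .unstack(0)                       # inst moves into the columns
    .stack(0)                         # var1/var2 move into the index
    .reset_index(level=0, drop=True)  # drop the helper counter level
)
print(result)
```

The counter from cumcount is what keeps the two rows of each inst apart. Without it the (inst) index would contain duplicates and unstack would have nothing unique to pivot on.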
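Try this:
# df1 is the stacked Series: reshape to one row of 4 values per inst, then transpose so each inst becomes a column
values = np.transpose(df1.values.reshape(2, 4))
df2 = pd.DataFrame(data=values, index=['var1', 'var2', 'var1', 'var2'], columns=['A', 'B'])
print(df2)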
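The same reshape idea can be written without hard-coding the shape and labels. The following is only a sketch under two assumptions: the stacked Series holds the values of each inst contiguously, and every inst has the same number of values. The names stacked, insts, per_inst and row_labels are illustrative, and the numbers are the ones shown in the question.

```python
import pandas as pd

# Stacked Series rebuilt with the values shown in the question
stacked = pd.Series(
    [0.191094, 0.100821, 0.251331, 0.528787,
     0.806549, 0.638217, 0.233541, 0.905737],
    index=pd.MultiIndex.from_arrays(
        [["A"] * 4 + ["B"] * 4, ["var1", "var2"] * 4],
        names=["inst", None],
    ),
)

# Unique inst labels in order of appearance: A, B
insts = stacked.index.get_level_values("inst").unique()
# Number of values per inst (assumes every inst has the same count)
per_inst = len(stacked) // len(insts)

# One row per inst, then transpose so each inst becomes a column
values = stacked.to_numpy().reshape(len(insts), per_inst).T
# Row labels taken from the first inst block: var1, var2, var1, var2
row_labels = stacked.index.get_level_values(1)[:per_inst]

df2 = pd.DataFrame(values, index=row_labels, columns=insts)
print(df2)
```

If the groups differ in size, the reshape call raises an error or misaligns the values, so the cumcount approach is the safer choice in that case.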