将行转换为列 Headers
Convert Rows as Column Headers
我有以下数据框:
---------------------------------------------------------------
| | TORA | PS | Hutan Adat | Tahun | Bulan | Dashboard Name |
---------------------------------------------------------------
|0| 0 | 0 | 0 | 2021 | Jan | Potensi |
|1| 0 | 0 | 0 | 2021 | Jan | Usulan |
|2| 15 | 0 | 3 | 2021 | Jan | Realisasi |
|3| 4 | 6 | 0 | 2021 | Feb | Potensi |
|4| 0 | 0 | 0 | 2021 | Feb | Usulan |
---------------------------------------------------------------
我想要这样。
------------------------------------------------------------
| Potensi | Usulan | Realisasi | Tahun | Bulan |
------------------------------------------------------------
TORA | 0 | 0 | 15 | 2021 | Jan |
PS | 0 | 0 | 0 | 2021 | Jan |
Hutan Adat | 0 | 0 | 3 | 2021 | Jan |
TORA | 4 | 0 | NaN | 2021 | Feb |
PS | 6 | 0 | NaN | 2021 | Feb |
Hutan Adat | 0 | 0 | NaN | 2021 | Feb |
------------------------------------------------------------
我试过:
#1
df = df.set_index('Dashboard Name').T
#2
df = df.pivot_table(['TORA','PS','Hutan Adat'],['Tahun','Bulan'],'Dashboard Name')
#3
df = df.set_index(['Dashboard Name', 'Tahun', 'Bulan'], drop = True).T
但我没有得到我想要的预期输出
首先将 Bulan
列设置为有序分类,以便正确排序:
months = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun', 'Jul', 'Aug', 'Sep',
'Oct', 'Nov', 'Dec']
df.Bulan = pd.Categorical(df.Bulan,
ordered=True,
categories=months)
然后通过旋转 melt
重塑:
df = (df.melt(['Tahun','Bulan','Dashboard Name'])
.pivot(index=['variable','Tahun','Bulan'],
columns='Dashboard Name',
values='value')
.sort_index(level=[1,2])
.reset_index(level=[1,2])
.rename_axis(index=None, columns=None)
)
print (df)
Tahun Bulan Potensi Realisasi Usulan
Hutan Adat 2021 Jan 0.0 3.0 0.0
PS 2021 Jan 0.0 0.0 0.0
TORA 2021 Jan 0.0 15.0 0.0
Hutan Adat 2021 Feb 0.0 NaN 0.0
PS 2021 Feb 6.0 NaN 0.0
TORA 2021 Feb 4.0 NaN 0.0
或:
df = (df.set_index(['Tahun','Bulan','Dashboard Name'])
.stack()
.unstack(level=2)
.reset_index(level=[0,1])
.rename_axis(index=None, columns=None)
)
print (df)
Tahun Bulan Potensi Realisasi Usulan
TORA 2021 Jan 0.0 15.0 0.0
PS 2021 Jan 0.0 0.0 0.0
Hutan Adat 2021 Jan 0.0 3.0 0.0
TORA 2021 Feb 4.0 NaN 0.0
PS 2021 Feb 6.0 NaN 0.0
Hutan Adat 2021 Feb 0.0 NaN 0.0
最后如果最后一列的顺序很重要:
df = df[df.columns[2:].tolist() + df.columns[:2].tolist()]
print (df)
Potensi Realisasi Usulan Tahun Bulan
Hutan Adat 0.0 3.0 0.0 2021 Jan
PS 0.0 0.0 0.0 2021 Jan
TORA 0.0 15.0 0.0 2021 Jan
Hutan Adat 0.0 NaN 0.0 2021 Feb
PS 6.0 NaN 0.0 2021 Feb
TORA 4.0 NaN 0.0 2021 Feb
我有以下数据框:
---------------------------------------------------------------
| | TORA | PS | Hutan Adat | Tahun | Bulan | Dashboard Name |
---------------------------------------------------------------
|0| 0 | 0 | 0 | 2021 | Jan | Potensi |
|1| 0 | 0 | 0 | 2021 | Jan | Usulan |
|2| 15 | 0 | 3 | 2021 | Jan | Realisasi |
|3| 4 | 6 | 0 | 2021 | Feb | Potensi |
|4| 0 | 0 | 0 | 2021 | Feb | Usulan |
---------------------------------------------------------------
我想要这样。
------------------------------------------------------------
| Potensi | Usulan | Realisasi | Tahun | Bulan |
------------------------------------------------------------
TORA | 0 | 0 | 15 | 2021 | Jan |
PS | 0 | 0 | 0 | 2021 | Jan |
Hutan Adat | 0 | 0 | 3 | 2021 | Jan |
TORA | 4 | 0 | NaN | 2021 | Feb |
PS | 6 | 0 | NaN | 2021 | Feb |
Hutan Adat | 0 | 0 | NaN | 2021 | Feb |
------------------------------------------------------------
我试过: #1
df = df.set_index('Dashboard Name').T
#2
df = df.pivot_table(['TORA','PS','Hutan Adat'],['Tahun','Bulan'],'Dashboard Name')
#3
df = df.set_index(['Dashboard Name', 'Tahun', 'Bulan'], drop = True).T
但我没有得到我想要的预期输出
首先将 Bulan
列设置为有序分类,以便正确排序:
months = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun', 'Jul', 'Aug', 'Sep',
'Oct', 'Nov', 'Dec']
df.Bulan = pd.Categorical(df.Bulan,
ordered=True,
categories=months)
然后通过旋转 melt
重塑:
df = (df.melt(['Tahun','Bulan','Dashboard Name'])
.pivot(index=['variable','Tahun','Bulan'],
columns='Dashboard Name',
values='value')
.sort_index(level=[1,2])
.reset_index(level=[1,2])
.rename_axis(index=None, columns=None)
)
print (df)
Tahun Bulan Potensi Realisasi Usulan
Hutan Adat 2021 Jan 0.0 3.0 0.0
PS 2021 Jan 0.0 0.0 0.0
TORA 2021 Jan 0.0 15.0 0.0
Hutan Adat 2021 Feb 0.0 NaN 0.0
PS 2021 Feb 6.0 NaN 0.0
TORA 2021 Feb 4.0 NaN 0.0
或:
df = (df.set_index(['Tahun','Bulan','Dashboard Name'])
.stack()
.unstack(level=2)
.reset_index(level=[0,1])
.rename_axis(index=None, columns=None)
)
print (df)
Tahun Bulan Potensi Realisasi Usulan
TORA 2021 Jan 0.0 15.0 0.0
PS 2021 Jan 0.0 0.0 0.0
Hutan Adat 2021 Jan 0.0 3.0 0.0
TORA 2021 Feb 4.0 NaN 0.0
PS 2021 Feb 6.0 NaN 0.0
Hutan Adat 2021 Feb 0.0 NaN 0.0
最后如果最后一列的顺序很重要:
df = df[df.columns[2:].tolist() + df.columns[:2].tolist()]
print (df)
Potensi Realisasi Usulan Tahun Bulan
Hutan Adat 0.0 3.0 0.0 2021 Jan
PS 0.0 0.0 0.0 2021 Jan
TORA 0.0 15.0 0.0 2021 Jan
Hutan Adat 0.0 NaN 0.0 2021 Feb
PS 6.0 NaN 0.0 2021 Feb
TORA 4.0 NaN 0.0 2021 Feb