How to transpose or pivot a table? Selecting specific columns
Beginner here!
I have a DataFrame similar to this:
df = pd.DataFrame({'Country_Code':['FR','FR','FR','USA','USA','USA','BR','BR','BR'],'Indicator_Name':['GPD','Pop','birth','GPD','Pop','birth','GPD','Pop','birth'],'2005':[14,34,56, 25, 67, 68, 55, 8,99], '2006':[23, 34, 34, 43,34,34, 65, 34,45]})
Index Country_Code Indicator_Name 2005 2006
0 FR GPD 14 23
1 FR Pop 34 34
2 FR birth 56 34
3 USA GPD 25 43
4 USA Pop 67 34
5 USA birth 68 34
6 BR GPD 55 65
7 BR Pop 8 34
8 BR birth 99 45
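I need to pivot or transpose it so that the country code and year stay as columns and each indicator name becomes its own column, like this: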
index Country_Code year GPD Pop birth
0 FR 2005 14 34 56
1 FR 2006 23 34 34
3 USA 2005 25 67 68
4 USA 2006 43 34 34
...
I used the transpose function like this:
df.set_index(['Indicator_Name']).transpose()
The result is close, but the countries end up in a single row, like this:
Indicator_Name GPD Pop birth GPD Pop birth GPD Pop birth
Country_Code FR FR FR USA USA USA BR BR BR
2005 14 34 56 25 67 68 55 8 99
2006 23 34 34 43 34 34 65 34 45
I also tried the "pivot" and "pivot_table" functions, but the results were not what I wanted. Could you give me some advice?
import pandas as pd
df = pd.DataFrame({'Country_Code':['FR','FR','FR','USA','USA','USA','BR','BR','BR'],'Indicator_Name':['GPD','Pop','birth','GPD','Pop','birth','GPD','Pop','birth'],'2005':[14,34,56, 25, 67, 68, 55, 8,99], '2006':[23, 34, 34, 43,34,34, 65, 34,45]})
df
#%% Pivot longer columns `'2005'` and `'2006'` to `'Year'`
df1 = df.melt(id_vars=["Country_Code", "Indicator_Name"],
              var_name="Year",
              value_name="Value")
#%% Pivot wider by values in `'Indicator_Name'`
df2 = (df1.pivot_table(index=['Country_Code', 'Year'],
                       columns=['Indicator_Name'],
                       values=['Value'],
                       aggfunc='first'))
Output:
Value
Indicator_Name GPD Pop birth
Country_Code Year
BR 2005 55 8 99
2006 65 34 45
FR 2005 14 34 56
2006 23 34 34
USA 2005 25 67 68
2006 43 34 34
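The result above keeps Country_Code and Year in a row MultiIndex and has a two-level column header (Value / Indicator_Name), whereas the question asks for plain columns. A minimal sketch of one way to flatten it, assuming it is acceptable to pass `values` and `columns` as plain strings instead of lists (which avoids the extra "Value" level) and to name the melted column `year` as in the question; the variable names `long_df` and `flat` are illustrative only:

```python
import pandas as pd

df = pd.DataFrame({
    'Country_Code': ['FR', 'FR', 'FR', 'USA', 'USA', 'USA', 'BR', 'BR', 'BR'],
    'Indicator_Name': ['GPD', 'Pop', 'birth', 'GPD', 'Pop', 'birth', 'GPD', 'Pop', 'birth'],
    '2005': [14, 34, 56, 25, 67, 68, 55, 8, 99],
    '2006': [23, 34, 34, 43, 34, 34, 65, 34, 45],
})

# Long format: one row per (country, indicator, year)
long_df = df.melt(id_vars=["Country_Code", "Indicator_Name"],
                  var_name="year",
                  value_name="Value")

# Wide format: one column per indicator.
# Scalar `values`/`columns` keep the column header single-level.
flat = (long_df.pivot_table(index=["Country_Code", "year"],
                            columns="Indicator_Name",
                            values="Value",
                            aggfunc="first")
        .reset_index()                 # move Country_Code and year back to columns
        .rename_axis(columns=None))    # drop the leftover 'Indicator_Name' header label

print(flat)
```

The rows come out sorted by Country_Code and year, and `year` holds strings because the values come from the original column labels.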
I think the simplest approach is pivot + stack:
(df.pivot(index='Country_Code', columns='Indicator_Name')
.rename_axis(columns=['year', None]).stack(0).reset_index()
)
Output:
Country_Code year GPD Pop birth
0 BR 2005 55 8 99
1 BR 2006 65 34 45
2 FR 2005 14 34 56
3 FR 2006 23 34 34
4 USA 2005 25 67 68
5 USA 2006 43 34 34
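For comparison, the same layout can probably also be reached without the intermediate two-level column header by stacking the year columns first and then unstacking Indicator_Name. This is only a sketch, not part of the answer above; the variable name `out` is illustrative, and converting `year` to integers is an assumption about what is wanted (the year labels start out as strings):

```python
import pandas as pd

df = pd.DataFrame({
    'Country_Code': ['FR', 'FR', 'FR', 'USA', 'USA', 'USA', 'BR', 'BR', 'BR'],
    'Indicator_Name': ['GPD', 'Pop', 'birth', 'GPD', 'Pop', 'birth', 'GPD', 'Pop', 'birth'],
    '2005': [14, 34, 56, 25, 67, 68, 55, 8, 99],
    '2006': [23, 34, 34, 43, 34, 34, 65, 34, 45],
})

out = (df.set_index(["Country_Code", "Indicator_Name"])
         .rename_axis(columns="year")    # name the column axis so stack() labels the new level
         .stack()                        # year columns -> innermost index level
         .unstack("Indicator_Name")      # indicators -> one column each
         .reset_index()
         .rename_axis(columns=None))     # drop the leftover 'Indicator_Name' header label

# Assumption: integer years are preferred over the original string labels
out["year"] = out["year"].astype(int)

print(out)
```

Both variants rely on each (Country_Code, Indicator_Name) pair appearing only once; with duplicates, `unstack` and `pivot` raise an error, and `pivot_table` with an explicit `aggfunc` is the safer choice.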