How to create new rows in a pandas DataFrame by dividing the values in a specific column between two rows while keeping the other columns intact?
I have a pandas DataFrame df that looks like this:
Country Continent Capital City Year Indicator Value Unit
0 Nepal Asia Kathmandu 2015 Population 3 million
1 Nepal Asia Kathmandu 2020 Population 5 million
2 Germany Europe Berlin 2015 Population 4 million
3 Germany Europe Berlin 2020 Population 6 million
The output of df.to_dict() is:
{'Country': {0: 'Nepal', 1: 'Nepal', 2: 'Germany', 3: 'Germany'},
'Continent': {0: 'Asia', 1: 'Asia', 2: 'Europe', 3: 'Europe'},
'Capital City': {0: 'Kathmandu', 1: 'Kathmandu', 2: 'Berlin', 3: 'Berlin'},
'Year': {0: 2015, 1: 2020, 2: 2015, 3: 2020},
'Indicator': {0: 'Population',
1: 'Population',
2: 'Population',
3: 'Population'},
'Value': {0: 3, 1: 5, 2: 4, 3: 6},
'Unit': {0: 'million', 1: 'million', 2: 'million', 3: 'million'}}
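The DataFrame contains population data for the capital cities of Nepal and Germany in 2015 and 2020.
I want to create two new rows that show the population growth rate between 2015 and 2020 (for example, 5/3 = 1.67 for Nepal and 6/4 = 1.5 for Germany). These rows need to be in the same DataFrame. In the new rows, the Country, Continent and Capital City columns should stay the same for the respective country. The Year value should stay 2020, the Indicator should be "Population growth rate", and the Unit should be "times 2015 value". It should look like this: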
Country Continent Capital City Year Indicator Value Unit
0 Nepal Asia Kathmandu 2015 Population 3 million
1 Nepal Asia Kathmandu 2020 Population 5 million
2 Germany Europe Berlin 2015 Population 4 million
3 Germany Europe Berlin 2020 Population 6 million
4 Nepal Asia Kathmandu 2020 Population growth rate 1.666667 times 2015 value
5 Germany Europe Berlin 2020 Population growth rate 1.5 times 2015 value
How can I create these two new rows by appending the population growth rate to the original DataFrame?
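Use groupby to compute the ratio per country, then append the result to the original DataFrame:
import pandas as pd

# Per country: keep the last Year and divide the last Value by the first.
out = df.groupby(['Country','Continent','Capital City']).agg({'Year':'last','Value':lambda x : x.iloc[-1]/x.iloc[0]}).reset_index()
out['Indicator'] = 'Population growth rate'
# DataFrame.append was removed in pandas 2.0; pd.concat is the replacement.
df = pd.concat([df, out])
df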
Out[16]:
Country Continent Capital City ... Indicator Value Unit
0 Nepal Asia Kathmandu ... Population 3.000000 million
1 Nepal Asia Kathmandu ... Population 5.000000 million
2 Germany Europe Berlin ... Population 4.000000 million
3 Germany Europe Berlin ... Population 6.000000 million
0 Germany Europe Berlin ... Population growth rate 1.500000 NaN
1 Nepal Asia Kathmandu ... Population growth rate 1.666667 NaN
[6 rows x 7 columns]
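The output above still leaves Unit as NaN in the new rows, repeats the index labels 0 and 1, and lists Germany before Nepal because groupby sorts its keys. A minimal sketch of a variant that matches the layout requested in the question could look like the following. It assumes each country has exactly one 2015 row and one 2020 row; the names keys and result are only illustrative, and the DataFrame is rebuilt here from the df.to_dict() values so the snippet runs on its own.

```python
import pandas as pd

# Rebuild the sample data from the question.
df = pd.DataFrame({
    'Country': ['Nepal', 'Nepal', 'Germany', 'Germany'],
    'Continent': ['Asia', 'Asia', 'Europe', 'Europe'],
    'Capital City': ['Kathmandu', 'Kathmandu', 'Berlin', 'Berlin'],
    'Year': [2015, 2020, 2015, 2020],
    'Indicator': ['Population'] * 4,
    'Value': [3, 5, 4, 6],
    'Unit': ['million'] * 4,
})

keys = ['Country', 'Continent', 'Capital City']

# A stable sort by Year makes "first" and "last" within each group mean
# 2015 and 2020; sort=False keeps groups in order of first appearance.
out = (
    df.sort_values('Year', kind='mergesort')
      .groupby(keys, sort=False)
      .agg(Year=('Year', 'last'),
           Value=('Value', lambda s: s.iloc[-1] / s.iloc[0]))
      .reset_index()
)
out['Indicator'] = 'Population growth rate'
out['Unit'] = 'times 2015 value'

# ignore_index=True renumbers the combined rows 0..5.
result = pd.concat([df, out], ignore_index=True)
print(result)
```

With the sample data, rows 4 and 5 of result hold the growth rates for Nepal (about 1.67) and Germany (1.5). Note that the Value column becomes float once the ratios are added.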