按每个组分组,然后进行两个分区的划分(每列当前 yr/latest 年)
Groupby each group and then do division of two divisions (current yr/latest yr for each column)
我想通过在 Col_1 和 Col_2 中分别为每个组除以当前年份除以其最近年份来创建一个新列。然后,划分两个部分。
方法:计算每个组的 (EachYrCol_1/Yr2000Col_1)/(EachYrCol_2/Yr2000Col_2)
参见下面的示例:
Year
Group
Col_1
Col_2
New Column
1995
A
100
11
(100/600)/(11/66)
1996
A
200
22
(200/600)/(22/66)
1997
A
300
33
(300/600)/(33/66)
1998
A
400
44
.............
1999
A
500
55
.............
2000
A
600
66
.............
1995
B
700
77
(700/1200)/(77/399)
1996
B
800
88
(800/1200)/(88/399)
1997
B
900
99
(900/1200)/(99/399)
1998
B
1000
199
.............
1999
B
1100
299
.............
2000
B
1200
399
.............
示例数据集:
import pandas as pd
df = pd.DataFrame({'Year':[1995, 1996, 1997, 1998, 1999, 2000,1995, 1996, 1997, 1998, 1999, 2000],
'Group':['A', 'A', 'A','A', 'A', 'A', 'B', 'B', 'B', 'B', 'B', 'B'],
'Col_1':[100, 200, 300, 400, 500, 600, 700, 800, 900, 1000, 1100, 1200],
'Col_2':[11, 22, 33, 44, 55, 66, 77, 88, 99, 199, 299, 399]})
使用 GroupBy.transform
with GroupBy.last
作为辅助 DataFrame,因此可以划分每一列:
df1 = df.groupby('Group').transform('last')
df['New'] = df['Col_1'].div(df1['Col_1']).div(df['Col_2'].div(df1['Col_2']))
print (df)
Year Group Col_1 Col_2 New
0 1995 A 100 11 1.000000
1 1996 A 200 22 1.000000
2 1997 A 300 33 1.000000
3 1998 A 400 44 1.000000
4 1999 A 500 55 1.000000
5 2000 A 600 66 1.000000
6 1995 B 700 77 3.022727
7 1996 B 800 88 3.022727
8 1997 B 900 99 3.022727
9 1998 B 1000 199 1.670854
10 1999 B 1100 299 1.223244
11 2000 B 1200 399 1.000000
我想通过在 Col_1 和 Col_2 中分别为每个组除以当前年份除以其最近年份来创建一个新列。然后,划分两个部分。
方法:计算每个组的 (EachYrCol_1/Yr2000Col_1)/(EachYrCol_2/Yr2000Col_2)
参见下面的示例:
Year | Group | Col_1 | Col_2 | New Column |
---|---|---|---|---|
1995 | A | 100 | 11 | (100/600)/(11/66) |
1996 | A | 200 | 22 | (200/600)/(22/66) |
1997 | A | 300 | 33 | (300/600)/(33/66) |
1998 | A | 400 | 44 | ............. |
1999 | A | 500 | 55 | ............. |
2000 | A | 600 | 66 | ............. |
1995 | B | 700 | 77 | (700/1200)/(77/399) |
1996 | B | 800 | 88 | (800/1200)/(88/399) |
1997 | B | 900 | 99 | (900/1200)/(99/399) |
1998 | B | 1000 | 199 | ............. |
1999 | B | 1100 | 299 | ............. |
2000 | B | 1200 | 399 | ............. |
示例数据集:
import pandas as pd
df = pd.DataFrame({'Year':[1995, 1996, 1997, 1998, 1999, 2000,1995, 1996, 1997, 1998, 1999, 2000],
'Group':['A', 'A', 'A','A', 'A', 'A', 'B', 'B', 'B', 'B', 'B', 'B'],
'Col_1':[100, 200, 300, 400, 500, 600, 700, 800, 900, 1000, 1100, 1200],
'Col_2':[11, 22, 33, 44, 55, 66, 77, 88, 99, 199, 299, 399]})
使用 GroupBy.transform
with GroupBy.last
作为辅助 DataFrame,因此可以划分每一列:
df1 = df.groupby('Group').transform('last')
df['New'] = df['Col_1'].div(df1['Col_1']).div(df['Col_2'].div(df1['Col_2']))
print (df)
Year Group Col_1 Col_2 New
0 1995 A 100 11 1.000000
1 1996 A 200 22 1.000000
2 1997 A 300 33 1.000000
3 1998 A 400 44 1.000000
4 1999 A 500 55 1.000000
5 2000 A 600 66 1.000000
6 1995 B 700 77 3.022727
7 1996 B 800 88 3.022727
8 1997 B 900 99 3.022727
9 1998 B 1000 199 1.670854
10 1999 B 1100 299 1.223244
11 2000 B 1200 399 1.000000