Pandas cumulative sum depending on another column's value
I have a dataset like this:
Date Runner Group distance [km]
2021-01-01 Joe 1 7
2021-01-02 Jack 1 6
2021-01-03 Jess 1 9
2021-01-01 Paul 2 11
2021-01-02 Peter 2 12
2021-01-02 Sara 3 15
2021-01-03 Sarah 3 10
I want to calculate the cumulative sum of the distance within each group of runners:
Date Runner Group distance [km] cum sum [km]
2021-01-01 Joe 1 7 7
2021-01-02 Jack 1 6 13
2021-01-03 Jess 1 9 22
2021-01-01 Paul 2 11 11
2021-01-02 Peter 2 12 23
2021-01-02 Sara 3 15 15
2021-01-03 Sarah 3 10 25
Unfortunately, I have no idea how to do this, and I couldn't find an answer elsewhere. Can someone give me a hint?
import pandas as pd
import numpy as np
df = pd.DataFrame([['2021-01-01','Joe', 1, 7],
['2021-01-02',"Jack", 1, 6],
['2021-01-03',"Jess", 1, 9],
['2021-01-01',"Paul", 2, 11],
['2021-01-02',"Peter", 2, 12],
['2021-01-02',"Sara", 3, 15],
['2021-01-03',"Sarah", 3, 10]],
columns=['Date','Runner', 'Group', 'distance [km]'])
Try groupby with cumsum:
>>> df['cum sum [km]'] = df.groupby('Group')['distance [km]'].cumsum()
>>> df
Date Runner Group distance [km] cum sum [km]
0 2021-01-01 Joe 1 7 7
1 2021-01-02 Jack 1 6 13
2 2021-01-03 Jess 1 9 22
3 2021-01-01 Paul 2 11 11
4 2021-01-02 Peter 2 12 23
5 2021-01-02 Sara 3 15 15
6 2021-01-03 Sarah 3 10 25
>>>
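One caveat: the running total follows the row order inside each group, so the result above matches the expected table only because the rows are already sorted by date within each group. If that is not guaranteed, a minimal sketch, assuming the dates are ISO-formatted strings (which sort chronologically as plain text), is to sort first. The shuffled sample rows and the name df_sorted are made up for illustration:

```python
import pandas as pd

# Same kind of data as in the question, but with the rows deliberately shuffled
df = pd.DataFrame(
    [
        ["2021-01-03", "Jess", 1, 9],
        ["2021-01-01", "Joe", 1, 7],
        ["2021-01-02", "Peter", 2, 12],
        ["2021-01-02", "Jack", 1, 6],
        ["2021-01-01", "Paul", 2, 11],
    ],
    columns=["Date", "Runner", "Group", "distance [km]"],
)

# Put each group's rows in chronological order before accumulating.
# ISO date strings (YYYY-MM-DD) sort correctly as text; for other formats,
# convert the column with pd.to_datetime first.
df_sorted = df.sort_values(["Group", "Date"]).reset_index(drop=True)

# Running total of the distance, restarted for every group
df_sorted["cum sum [km]"] = df_sorted.groupby("Group")["distance [km]"].cumsum()

print(df_sorted)
```

Because cumsum returns a Series aligned to the original index, assigning it directly to the unsorted df would also work, but the totals would then accumulate in whatever order the rows happen to appear.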