Can I sum a value over different times in Python?
I have a dataframe that looks like this:
WEEK | DELIVERY_BOY_ID | TOTAL_GMV |
---|---|---|
2022-04-04 | 999999999.0 | 470510.11 |
2022-04-11 | 999999999.0 | 557351.02 |
2022-04-18 | 999999999.0 | 454225.78 |
2022-04-25 | 999999999.0 | 527932.46 |
2022-05-02 | 999999999.0 | 556741.18 |
2022-05-09 | 999999999.0 | 524571.93 |
2022-05-16 | 999999999.0 | 547195.66 |
2022-05-23 | 999999999.0 | 112423.49 |
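What I want to do is sum TOTAL_GMV over every 4-week window (the sum of the weeks from 2022-05-02 to 2022-05-23, then from 2022-04-25 to 2022-05-16, and so on) and show each result on the date of the last week in the window.
So the final result should look like this: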
WEEK | DELIVERY_BOY_ID | TOTAL_GMV | EXPLANATION |
---|---|---|---|
2022-04-04 | 999999999.0 | ********* | Sum from 2022-03-14 to 2022-04-04 |
2022-04-11 | 999999999.0 | ********* | Sum from 2022-03-21 to 2022-04-11 |
2022-04-18 | 999999999.0 | ********* | Sum from 2022-03-28 to 2022-04-18 |
2022-04-25 | 999999999.0 | 2.010.018,91 | Sum from 2022-04-04 to 2022-04-25 |
2022-05-02 | 999999999.0 | 2.096.250,44 | Sum from 2022-04-11 to 2022-05-02 |
2022-05-09 | 999999999.0 | 2.063.469,15 | Sum from 2022-04-18 to 2022-05-09 |
2022-05-16 | 999999999.0 | 2.156.441,23 | Sum from 2022-04-25 to 2022-05-16 |
2022-05-23 | 999999999.0 | 1.639.932,26 | Sum from 2022-05-02 to 2022-05-23 |
Any idea how to do this?
Thanks!!
Assuming WEEK is the index, you can do this:
>>> df.TOTAL_GMV.rolling(4).sum()
WEEK
2022-04-04 NaN
2022-04-11 NaN
2022-04-18 NaN
2022-04-25 2010019.37
2022-05-02 2096250.44
2022-05-09 2063471.35
2022-05-16 2156441.23
2022-05-23 1740932.26
Name: TOTAL_GMV, dtype: float64
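To reproduce the output above, here is a minimal, self-contained sketch that rebuilds the sample data from the question. The variable name `rolling_gmv` is only illustrative. It assumes the rows are sorted by week with no missing weeks, because `rolling(4)` counts rows rather than calendar weeks.

```python
import pandas as pd

# Sample data copied from the question; WEEK is parsed as datetime and set as the index.
df = pd.DataFrame(
    {
        "WEEK": pd.to_datetime([
            "2022-04-04", "2022-04-11", "2022-04-18", "2022-04-25",
            "2022-05-02", "2022-05-09", "2022-05-16", "2022-05-23",
        ]),
        "DELIVERY_BOY_ID": [999999999.0] * 8,
        "TOTAL_GMV": [
            470510.11, 557351.02, 454225.78, 527932.46,
            556741.18, 524571.93, 547195.66, 112423.49,
        ],
    }
).set_index("WEEK")

# rolling(4) is row-based: each window is the current row plus the 3 rows before it.
# Assumption: rows are sorted by WEEK and no week is missing, so 4 rows == 4 weeks.
rolling_gmv = df["TOTAL_GMV"].rolling(4).sum()

# The first 3 rows are NaN because their windows hold fewer than 4 rows.
print(rolling_gmv.round(2))
```

If partial sums for the first three weeks are preferred over NaN, `rolling(4, min_periods=1)` would return them instead.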
To add it to df:
df['TOTAL_GMV'] = df.TOTAL_GMV.rolling(4).sum()
(If WEEK is not the index, change this to df.set_index('WEEK').TOTAL_GMV.rolling(4).sum().)
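The question's sample has a single DELIVERY_BOY_ID, but if the real data holds several, a plain rolling sum would mix rows from different IDs. Here is a hedged sketch of one way to handle that and to build the EXPLANATION column from the desired output. It assumes WEEK is a regular datetime column, one row per ID per week, and no missing weeks. The column name `GMV_4W` is made up for illustration, so the original TOTAL_GMV is not overwritten.

```python
import pandas as pd

# Sample data copied from the question; WEEK stays a regular column here.
df = pd.DataFrame(
    {
        "WEEK": pd.to_datetime([
            "2022-04-04", "2022-04-11", "2022-04-18", "2022-04-25",
            "2022-05-02", "2022-05-09", "2022-05-16", "2022-05-23",
        ]),
        "DELIVERY_BOY_ID": [999999999.0] * 8,
        "TOTAL_GMV": [
            470510.11, 557351.02, 454225.78, 527932.46,
            556741.18, 524571.93, 547195.66, 112423.49,
        ],
    }
)

# Sort so that each ID's rows are in week order before taking row-based windows.
df = df.sort_values(["DELIVERY_BOY_ID", "WEEK"]).reset_index(drop=True)

# Rolling 4-row sum computed separately for each DELIVERY_BOY_ID.
# GMV_4W is a hypothetical column name, not one from the question.
df["GMV_4W"] = (
    df.groupby("DELIVERY_BOY_ID")["TOTAL_GMV"]
      .transform(lambda s: s.rolling(4).sum())
)

# A 4-week window ending at WEEK starts 3 weeks earlier.
window_start = df["WEEK"] - pd.Timedelta(weeks=3)
df["EXPLANATION"] = (
    "Sum from " + window_start.dt.strftime("%Y-%m-%d")
    + " to " + df["WEEK"].dt.strftime("%Y-%m-%d")
)

print(df[["WEEK", "DELIVERY_BOY_ID", "GMV_4W", "EXPLANATION"]])
```

Because `rolling(4)` counts rows, it works on a plain column without setting WEEK as the index. If some weeks can be missing, a time-based window on a datetime index would be the safer choice.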