Calculating the range of a column's values per date in Python

I want to calculate the maximum difference in product_mrp for each date. I tried grouping by date, but I can't work out what to do after that.

Input:

+-------------+--------------------+
| product_mrp |     order_date     |
+-------------+--------------------+
|         142 |         01-12-2019 |
|          20 |         01-12-2019 |
|          20 |         01-12-2019 |
|         120 |         01-12-2019 |
|          30 |         03-12-2019 |
|          20 |         03-12-2019 |
|          45 |         03-12-2019 |
|         215 |         03-12-2019 |
|          15 |         03-12-2019 |
|          25 |         07-12-2019 |
|           5 |         07-12-2019 |
+-------------+--------------------+

Expected output:

+-------------+--------------------+
| product_mrp |     order_date     |
+-------------+--------------------+
|         122 |         01-12-2019 |
|         200 |         03-12-2019 |
|          20 |         07-12-2019 |
+-------------+--------------------+

Load the data with pandas, then use groupby to group the rows that share the same index value:

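import pandas as pd

dates = ['01-12-2019'] * 4 + ['03-12-2019'] * 5 + ['07-12-2019'] * 2
data = [142, 20, 20, 120, 30, 20, 45, 215, 15, 25, 5]

# Name the column so it matches the question's table
df = pd.DataFrame({'product_mrp': data})
# Note: these strings are parsed month-first by default (01-12-2019 -> 2019-01-12)
df.index = pd.DatetimeIndex(dates)

# Per-date range: max minus min within each group of equal index values
grouped = df.groupby(df.index).apply(lambda x: x.max() - x.min())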

Output:

            product_mrp
2019-01-12          122
2019-03-12          200
2019-07-12           20
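
If the dates are meant as day-month-year (an assumption; the question does not say), an explicit format avoids the month-first parsing. A minimal sketch of that variant, using the same sample data and the same max-minus-min idea:

```python
import pandas as pd

dates = ['01-12-2019'] * 4 + ['03-12-2019'] * 5 + ['07-12-2019'] * 2
data = [142, 20, 20, 120, 30, 20, 45, 215, 15, 25, 5]

df = pd.DataFrame({'product_mrp': data})
# Assumption: the strings are DD-MM-YYYY, so state the format explicitly
df.index = pd.to_datetime(dates, format='%d-%m-%Y')

# Range per date = max - min within each group of identical index values
ranges = df.groupby(level=0)['product_mrp'].agg(lambda s: s.max() - s.min())
print(ranges)
```

The differences are the same as above (122, 200, 20); only the dates in the index change, to 1, 3 and 7 December 2019.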

As you said, you can use groupby, then combine max, min and reset_index, for example:

# df is the question's DataFrame, with columns 'order_date' and 'product_mrp'
gr = df.groupby('order_date')['product_mrp']
df_ = (gr.max() - gr.min()).reset_index()

print(df_)
   order_date  product_mrp
0  01-12-2019          122
1  03-12-2019          200
2  07-12-2019           20
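
For reference, here is a self-contained sketch of the same idea. It builds the DataFrame from the question's table (how the data is actually loaded is an assumption) and computes max and min in a single agg call instead of two separate reductions. The names stats and result are illustrative only.

```python
import pandas as pd

# Sample data copied from the question's input table
df = pd.DataFrame({
    'product_mrp': [142, 20, 20, 120, 30, 20, 45, 215, 15, 25, 5],
    'order_date': ['01-12-2019'] * 4 + ['03-12-2019'] * 5 + ['07-12-2019'] * 2,
})

# Compute max and min per date in one call, then subtract
stats = df.groupby('order_date')['product_mrp'].agg(['max', 'min'])
result = (stats['max'] - stats['min']).rename('product_mrp').reset_index()

# Put the columns in the same order as the expected output
result = result[['product_mrp', 'order_date']]
print(result)
```

Here order_date stays a plain string column, so the dates are grouped and displayed exactly as written in the input.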