Complicated aggregation of DataFrame in Python Pandas?

I have a DataFrame like the one below:

import pandas as pd

df = pd.DataFrame({"VALUE" : [100, 200, 100, 300, 500],
                   "PRODUCT_ID" : [599, 200, 599, 599, 200],
                   "STATUS" : ["active", "active", "not_active", "unknown", "active"],
                   "CLIENT" : ["1", "1", "2", "2", "1"]})

I need to calculate the mean, median, and maximum of VALUE per PRODUCT_ID and per CLIENT, for rows whose STATUS is "active". I need a result like this:

AVG = 266.67 because (500 + 200 + 100) / 3
MED = 200
MAX = 500 because 500 is the maximum active value aggregated for client 1

Try:

(df.query('STATUS == "active"')
   .groupby('CLIENT')['VALUE']
   .agg(['mean', 'median', 'max'])
   .reindex(df['CLIENT'].unique())
)

Output:

              mean  median    max
CLIENT                           
1       266.666667   200.0  500.0
2              NaN     NaN    NaN
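
If the column labels should match the AVG / MED / MAX names used in the question, named aggregation (available since pandas 0.25) is one possible option. The following is only a sketch under the assumption that df is exactly the frame from the question; the variable names active and stats are illustrative and not part of the original answer:

```python
import pandas as pd

df = pd.DataFrame({"VALUE": [100, 200, 100, 300, 500],
                   "PRODUCT_ID": [599, 200, 599, 599, 200],
                   "STATUS": ["active", "active", "not_active", "unknown", "active"],
                   "CLIENT": ["1", "1", "2", "2", "1"]})

# Keep only the rows whose STATUS is "active"
active = df[df["STATUS"] == "active"]

# Named aggregation: output_column=(input_column, function)
stats = (active.groupby("CLIENT")
               .agg(AVG=("VALUE", "mean"),
                    MED=("VALUE", "median"),
                    MAX=("VALUE", "max"))
               # Bring back clients that have no active rows (filled with NaN)
               .reindex(df["CLIENT"].unique()))
print(stats)
```

As in the answer above, client 2 has no active rows, so its row is kept by reindex but contains only NaN.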

Could you try this:

  df[df['STATUS'] == 'active'].groupby(['PRODUCT_ID', 'CLIENT'])[['VALUE']].agg(['mean','median','max'])

Output:

                   VALUE
                    mean median  max
PRODUCT_ID CLIENT
200        1       350.0  350.0  500
599        1       100.0  100.0  100
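
The result above has a two-level column header (VALUE over mean / median / max) and a two-level index. If a flat table is easier to work with, one possible variant is to select VALUE as a Series before aggregating and then call reset_index. This is a sketch assuming the same df as in the question; the name flat is hypothetical:

```python
import pandas as pd

df = pd.DataFrame({"VALUE": [100, 200, 100, 300, 500],
                   "PRODUCT_ID": [599, 200, 599, 599, 200],
                   "STATUS": ["active", "active", "not_active", "unknown", "active"],
                   "CLIENT": ["1", "1", "2", "2", "1"]})

flat = (df[df["STATUS"] == "active"]             # active rows only
        .groupby(["PRODUCT_ID", "CLIENT"])["VALUE"]  # one group per product/client pair
        .agg(["mean", "median", "max"])          # single-level columns
        .reset_index())                          # group keys become ordinary columns
print(flat)
```

Selecting the single VALUE column also keeps the string column STATUS out of the aggregation, which matters on recent pandas versions where taking the mean of a non-numeric column raises an error.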