Pandas: Group by and conditional sum based on value of current row
My dataframe looks like this:
customer_nr | order_value | year_ordered | payment_successful |
---|---|---|---|
1 | 50 | 1980 | 1 |
1 | 75 | 2017 | 0 |
1 | 10 | 2020 | 1 |
2 | 55 | 2000 | 1 |
2 | 300 | 2007 | 1 |
2 | 15 | 2010 | 0 |
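For each order, I would like to know the total amount the customer had successfully paid in the years before that order.
The expected output is: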
customer_nr | order_value | year_ordered | payment_successful | total_successfully_previously_paid |
---|---|---|---|---|
1 | 50 | 1980 | 1 | 0 |
1 | 75 | 2017 | 0 | 50 |
1 | 10 | 2020 | 1 | 50 |
2 | 55 | 2000 | 1 | 0 |
2 | 300 | 2007 | 1 | 55 |
2 | 15 | 2010 | 0 | 355 |
The closest I have gotten is:
df.groupby(['customer_nr', 'payment_successful'], as_index=False)['order_value'].sum()
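This only gives me the all-time totals of successful and unsuccessful payments per customer. It does not restrict the sum to the orders that came before the current one.
Any help is appreciated!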
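Try this (it assumes the rows are already sorted by year_ordered within each customer, as in your sample):
# payment_successful * order_value keeps only the successfully paid amounts;
# a per-customer cumulative sum shifted down one row then excludes the current order.
df["total_successfully_previously_paid"] = (
    df["payment_successful"].mul(df["order_value"])
    .groupby(df["customer_nr"])
    .transform(lambda x: x.cumsum().shift().fillna(0))
)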
>>> df
customer_nr ... total_successfully_previously_paid
0 1 ... 0.0
1 1 ... 50.0
2 1 ... 50.0
3 2 ... 0.0
4 2 ... 55.0
5 2 ... 355.0
[6 rows x 5 columns]
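If the rows are not guaranteed to be in chronological order, or you would rather keep an integer dtype, a minimal self-contained sketch along the same lines might look like the following. It assumes that "previously" means earlier rows once each customer's orders are sorted by year_ordered; the intermediate name `paid` is only illustrative and not part of the question.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "customer_nr": [1, 1, 1, 2, 2, 2],
    "order_value": [50, 75, 10, 55, 300, 15],
    "year_ordered": [1980, 2017, 2020, 2000, 2007, 2010],
    "payment_successful": [1, 0, 1, 1, 1, 0],
})

# Assumption: "previous" orders are the earlier rows after sorting by year
# within each customer, so sort explicitly instead of relying on input order.
df = df.sort_values(["customer_nr", "year_ordered"]).reset_index(drop=True)

# Amount actually paid per order: order_value where the payment succeeded, else 0.
paid = df["order_value"] * df["payment_successful"]

# Per-customer running total minus the current row's own amount
# leaves only what was paid on earlier rows.
df["total_successfully_previously_paid"] = (
    paid.groupby(df["customer_nr"]).cumsum() - paid
)

print(df)
```

Subtracting the current row from the running total avoids the NaN that shift() introduces, so the column stays integer. Note that if a customer has several orders in the same year, this sketch (like the shift-based version) counts the earlier rows of that year as "previous"; handling ties differently would need an extra step.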