How to group by one column and sum the values in a third column, only where a condition is true in another column, with pandas
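I can't figure out how to do this:
As the title explains, I want to group the DataFrame by the acquired_month column only where another column contains 'Closed Won'. (In my example I created a helper column that is marked True when that condition is met, though I'm not sure that step is necessary.) For the rows that meet the condition, I then want to sum the values of a third column, but I don't know how. This is my code so far: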
us_lead_scoring.loc[us_lead_scoring['Stage'].str.contains('Closed Won'), 'closed_won_binary'] = True
acquired_date = us_lead_scoring.groupby('acquired_month')['closed_won_binary'].sum()
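But this only sums the True/False helper column within each acquired_month group (in effect, it counts the True values) rather than summing the column I actually want to total. Any pointers are appreciated.
Thanks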
If you need to aggregate the column col, use Series.where to replace the values in non-matching rows with 0, then aggregate with sum:
import pandas as pd

us_lead_scoring = pd.DataFrame({'Stage': ['Closed Won1', 'Closed Won2', 'Closed', 'Won'],
                                'col': [1, 3, 5, 6],
                                'acquired_month': [1, 1, 1, 2]})

# Keep col where Stage contains 'Closed Won', use 0 elsewhere, then sum per month
out = (us_lead_scoring['col'].where(us_lead_scoring['Stage']
                                    .str.contains('Closed Won'), 0)
       .groupby(us_lead_scoring['acquired_month'])
       .sum()
       .reset_index(name='SUM'))
print(out)
   acquired_month  SUM
0               1    4
1               2    0
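As a possible alternative, here is a minimal sketch that filters the rows with a boolean mask before grouping, reusing the sample data from the answer above. The names mask, counts and filtered are only illustrative. Unlike the Series.where approach, filtering first drops any month that has no matching rows, so month 2 does not appear in the result. The sketch also shows why summing a boolean column only counts the matching rows.

```python
import pandas as pd

# Same sample data as in the answer above.
us_lead_scoring = pd.DataFrame({
    'Stage': ['Closed Won1', 'Closed Won2', 'Closed', 'Won'],
    'col': [1, 3, 5, 6],
    'acquired_month': [1, 1, 1, 2],
})

# Boolean mask: True where Stage contains 'Closed Won'.
mask = us_lead_scoring['Stage'].str.contains('Closed Won')

# Summing the mask per group only counts the matching rows;
# this is what summing the helper column in the question does.
counts = mask.groupby(us_lead_scoring['acquired_month']).sum()

# Filter first, then group and sum col.
# Months with no matching rows are dropped from the result.
filtered = (us_lead_scoring[mask]
            .groupby('acquired_month')['col']
            .sum()
            .reset_index(name='SUM'))
print(filtered)
```

Which variant fits better probably depends on whether months without any 'Closed Won' rows should still show up with a sum of 0 (Series.where) or be left out entirely (filtering first).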