GroupBy statement not grouping like strings

Question

Evening,

My data:

display(dfRFQ_Breakdown_By_Done_Traded_Away_Grp.sort_values('security_type1',ascending=True))

    state             security_type1  count
0   Done              CORP              239
4   Tied Done         CORP                9
6   Tied Traded Away  CORP                7
9   Traded Away       CORP             1075
1   Done              GOVT               40
5   Tied Done         GOVT                2
7   Tied Traded Away  GOVT               16
10  Traded Away       GOVT              150
2   Done              MTGE                4
8   Tied Traded Away  MTGE                3
11  Traded Away       MTGE                7
3   Done              SUPRA              31
12  Traded Away       SUPRA              88

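For each security_type1, I want to combine all rows whose state is a variant of 'Done' into one row, and all rows whose state is a variant of 'Traded Away' into another: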

state     security_type1    count
Done        CORP             248
Traded Away CORP             1082
Done        GOVT             42
Traded Away GOVT             166
Done        MTGE             4
Traded Away MTGE             10
Done        SUPRA            31
Traded Away SUPRA            88

My code:

# Updating any Tied Done to Done and Tied Traded Away to Traded Away  
mask = (dfRFQ_Breakdown_By_Done_Traded_Away_Grp['state'].str.contains('Tied Done'))       
dfRFQ_Breakdown_By_Done_Traded_Away_Grp.loc[mask, 'state'] = 'Done'

mask = (dfRFQ_Breakdown_By_Done_Traded_Away_Grp['state'].str.contains('Tied Traded Away'))       
dfRFQ_Breakdown_By_Done_Traded_Away_Grp.loc[mask, 'state'] = 'Traded Away'
display(dfRFQ_Breakdown_By_Done_Traded_Away_Grp.sort_values('security_type1',ascending=True))

The updated strings still seem to be grouped separately by pandas:

state   security_type1  count
Done         CORP        239
Done         CORP        9
Traded Away  CORP        7
Traded Away  CORP        1075
Done         GOVT        40
Done         GOVT        2
Traded Away  GOVT        16
Traded Away  GOVT        150
Done         MTGE        4
Traded Away  MTGE        3
Traded Away  MTGE        7
Done         SUPRA       31
Traded Away  SUPRA       88

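Why is pandas not combining the rows that now share the state Done or Traded Away? Do I need to create another copy of the dataframe? It is almost as if pandas keeps a link to the old values from before the update.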

Answer 1

This seems possible with query, groupby and sort_values:

res = df.query('(state == "Done") | (state == "TradedAway")')\
        .groupby(['state', 'security_type1'], as_index=False)['count'].sum()\
        .sort_values(['security_type1', 'state'])

print(res)

        state security_type1  count
0        Done           CORP    239
4  TradedAway           CORP   1075
1        Done           GOVT     40
5  TradedAway           GOVT    150
2        Done           MTGE      4
6  TradedAway           MTGE      7
3        Done          SUPRA     31
7  TradedAway          SUPRA     88
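
Note that the output above keeps only the rows that were already 'Done' or 'TradedAway', so the 'Tied' counts are not included. A likely explanation for the behaviour in the question, assuming the frame shown was itself produced by an earlier groupby, is that relabelling the state column does not aggregate anything again: each original row is still its own row, it just carries a new label. The sketch below rebuilds the question's data by hand (the name df is only for illustration), folds the 'Tied' states into their base state, and then runs groupby and sum a second time:

```python
import pandas as pd

# Sample data copied from the question (one row per original state).
df = pd.DataFrame({
    "state": ["Done", "Done", "Done", "Done", "Tied Done", "Tied Done",
              "Tied Traded Away", "Tied Traded Away", "Tied Traded Away",
              "Traded Away", "Traded Away", "Traded Away", "Traded Away"],
    "security_type1": ["CORP", "GOVT", "MTGE", "SUPRA", "CORP", "GOVT",
                       "CORP", "GOVT", "MTGE",
                       "CORP", "GOVT", "MTGE", "SUPRA"],
    "count": [239, 40, 4, 31, 9, 2, 7, 16, 3, 1075, 150, 7, 88],
})

# Fold the "Tied ..." variants into their base state (exact-match replace).
df["state"] = df["state"].replace(
    {"Tied Done": "Done", "Tied Traded Away": "Traded Away"}
)

# Relabelling alone leaves 13 rows; aggregate again to merge equal labels.
res = df.groupby(["security_type1", "state"], as_index=False)["count"].sum()

print(res)
```

With the question's numbers this should give 8 rows, for example 248 for Done/CORP and 1082 for Traded Away/CORP, matching the desired output in the question. No extra copy of the dataframe is needed.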
