GroupBy statement not grouping like strings
Evening,
My data:
display(dfRFQ_Breakdown_By_Done_Traded_Away_Grp.sort_values('security_type1',ascending=True))
    state             security_type1  count
0   Done              CORP              239
4   Tied Done         CORP                9
6   Tied Traded Away  CORP                7
9   Traded Away       CORP             1075
1   Done              GOVT               40
5   Tied Done         GOVT                2
7   Tied Traded Away  GOVT               16
10  Traded Away       GOVT              150
2   Done              MTGE                4
8   Tied Traded Away  MTGE                3
11  Traded Away       MTGE                7
3   Done              SUPRA              31
12  Traded Away       SUPRA              88
I want to group together all rows whose state is 'Done' or 'Traded Away', per security_type1:
state        security_type1  count
Done         CORP              248
Traded Away  CORP             1082
Done         GOVT               42
Traded Away  GOVT              166
Done         MTGE                4
Traded Away  MTGE               10
Done         SUPRA              31
Traded Away  SUPRA              88
My code:
# Updating any Tied Done to Done and Tied Traded Away to Traded Away
mask = (dfRFQ_Breakdown_By_Done_Traded_Away_Grp['state'].str.contains('Tied Done'))
dfRFQ_Breakdown_By_Done_Traded_Away_Grp.loc[mask, 'state'] = 'Done'
mask = (dfRFQ_Breakdown_By_Done_Traded_Away_Grp['state'].str.contains('Tied Traded Away'))
dfRFQ_Breakdown_By_Done_Traded_Away_Grp.loc[mask, 'state'] = 'Traded Away'
display(dfRFQ_Breakdown_By_Done_Traded_Away_Grp.sort_values('security_type1',ascending=True))
The updated strings still appear to be grouped separately by pandas:
state        security_type1  count
Done         CORP              239
Done         CORP                9
Traded Away  CORP                7
Traded Away  CORP             1075
Done         GOVT               40
Done         GOVT                2
Traded Away  GOVT               16
Traded Away  GOVT              150
Done         MTGE                4
Traded Away  MTGE                3
Traded Away  MTGE                7
Done         SUPRA              31
Traded Away  SUPRA              88
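Why is pandas not combining the Done and Traded Away rows? Do I need to create another copy of the dataframe? It is almost as if pandas keeps a link to the old values from before the update.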
This seems possible with query, groupby and sort_values:
res = df.query('(state == "Done") | (state == "TradedAway")')\
        .groupby(['state', 'security_type1'], as_index=False)['count'].sum()\
        .sort_values(['security_type1', 'state'])
print(res)
   state       security_type1  count
0  Done        CORP              239
4  TradedAway  CORP             1075
1  Done        GOVT               40
5  TradedAway  GOVT              150
2  Done        MTGE                4
6  TradedAway  MTGE                7
3  Done        SUPRA              31
7  TradedAway  SUPRA              88
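Note that the query above keeps only the rows that were already 'Done' or 'TradedAway', so the 'Tied' rows are dropped and the totals stay at 239 and 1075 rather than the 248 and 1082 asked for. The relabeling in the question is not keeping a link to old values. Assigning through .loc only changes the text in the state column, and the rows are merged only when a groupby aggregation is run afterwards. Below is a minimal sketch of that, assuming the data is rebuilt by hand from the table in the question. The frame name `df`, the result name `res` and the regex used to strip the 'Tied ' prefix are illustrative choices, not part of the original code.

```python
import pandas as pd

# Sample data copied from the question's table; `df` is a stand-in name.
df = pd.DataFrame({
    "state": ["Done", "Done", "Done", "Done",
              "Tied Done", "Tied Done",
              "Tied Traded Away", "Tied Traded Away", "Tied Traded Away",
              "Traded Away", "Traded Away", "Traded Away", "Traded Away"],
    "security_type1": ["CORP", "GOVT", "MTGE", "SUPRA",
                       "CORP", "GOVT",
                       "CORP", "GOVT", "MTGE",
                       "CORP", "GOVT", "MTGE", "SUPRA"],
    "count": [239, 40, 4, 31, 9, 2, 7, 16, 3, 1075, 150, 7, 88],
})

# Step 1: relabel. This only rewrites the strings; the frame still has 13 rows.
df["state"] = df["state"].str.replace(r"^Tied\s+", "", regex=True)

# Step 2: aggregate. Rows with the same (security_type1, state) are summed here.
res = (
    df.groupby(["security_type1", "state"], as_index=False)["count"]
      .sum()
      .sort_values(["security_type1", "state"])
      .reset_index(drop=True)
)
print(res)
```

If the frame in the question was itself the result of an earlier groupby, the same idea applies: relabel first, then group and sum again on the relabeled column.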