Groupby and sort multiple columns' values raising an AttributeError: 'DataFrameGroupBy' object has no attribute 'sort_values'
For the toy dataset below, I am trying to group by target_name and sort the values by multiple columns, valid_mse and valid_r2_score, using:
df.groupby('target_name').sort_values(by=['valid_mse', 'valid_r2_score'], ascending=[True, False])
target_name train_mse valid_mse train_r2_score valid_r2_score
0 CPI 1.102079 1.842212 0.947458 -0.624665
1 CPI 1.301734 1.890085 0.928005 -0.777463
2 CPI 0.471222 1.078413 0.990599 0.311849
3 PPI 0.113998 0.135523 0.662532 0.262387
4 PPI 0.095434 0.176431 0.752242 -0.422994
5 PPI 0.097648 0.174544 0.744522 -0.203880
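But it raises an error: AttributeError: 'DataFrameGroupBy' object has no attribute 'sort_values'. I also tried sorting by a single column with df.groupby('target_name').sort_values(by='valid_mse', ascending=True), and it raises the same error.
Does anyone know how I can solve this correctly? Thanks.
Data in dictionary format: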
{'target_name': {0: 'CPI', 1: 'CPI', 2: 'CPI', 3: 'PPI', 4: 'PPI', 5: 'PPI'},
'train_mse': {0: 1.102079409,
1: 1.301734392,
2: 0.471221642,
3: 0.11399796,
4: 0.09543417,
5: 0.097647639},
'valid_mse': {0: 1.842212034,
1: 1.890085418,
2: 1.078413107,
3: 0.135523283,
4: 0.176431247,
5: 0.174543796},
'train_r2_score': {0: 0.947458162,
1: 0.928005473,
2: 0.990599137,
3: 0.662532128,
4: 0.752241595,
5: 0.744522334},
'valid_r2_score': {0: -0.624665246,
1: -0.777462993,
2: 0.311849214,
3: 0.262387135,
4: -0.422993602,
5: -0.203880075}}
Reference link:
How to sort a dataFrame in python pandas by two or more columns?
The object that groupby creates (a DataFrameGroupBy) has no sort_values method.
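To make that concrete, here is a minimal sketch that reproduces the error on a cut-down version of the question's data. The three-row frame and the variable names (grouped, frame_has_sort, grouped_has_sort, error_message) are illustrative assumptions, not part of the original post.

```python
import pandas as pd

# Cut-down, illustrative version of the question's data (assumption).
df = pd.DataFrame({
    "target_name": ["CPI", "CPI", "PPI"],
    "valid_mse": [1.842212, 1.078413, 0.135523],
})

grouped = df.groupby("target_name")

# A DataFrame exposes sort_values; the groupby object does not.
frame_has_sort = hasattr(df, "sort_values")
grouped_has_sort = hasattr(grouped, "sort_values")

# Calling it anyway raises the AttributeError from the question.
try:
    grouped.sort_values(by="valid_mse")
    error_message = None
except AttributeError as exc:
    error_message = str(exc)

print(frame_has_sort, grouped_has_sort)
print(error_message)
```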
Wouldn't a simple three-column sort give you the data you want? Something like:
df.sort_values(by=['target_name', 'valid_mse', 'valid_r2_score'],
ascending=[True, True, False])
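This sorts by target_name first, then by valid_mse (ascending) within each target, and finally by valid_r2_score (descending) to break any ties, so it is arguably what you want: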
target_name train_mse valid_mse train_r2_score valid_r2_score
2 CPI 0.471222 1.078413 0.990599 0.311849
0 CPI 1.102079 1.842212 0.947458 -0.624665
1 CPI 1.301734 1.890085 0.928005 -0.777463
3 PPI 0.113998 0.135523 0.662532 0.262387
5 PPI 0.097648 0.174544 0.744522 -0.203880
4 PPI 0.095434 0.176431 0.752242 -0.422994
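For anyone who wants to run this end to end, the sketch below rebuilds the frame from the values in the question and applies the same three-column sort. The values are written as plain lists rather than the index-keyed dictionary for brevity, and the name sorted_df is an assumption for illustration.

```python
import pandas as pd

# Same values as the dictionary in the question, written as plain lists.
data = {
    "target_name": ["CPI", "CPI", "CPI", "PPI", "PPI", "PPI"],
    "train_mse": [1.102079409, 1.301734392, 0.471221642,
                  0.11399796, 0.09543417, 0.097647639],
    "valid_mse": [1.842212034, 1.890085418, 1.078413107,
                  0.135523283, 0.176431247, 0.174543796],
    "train_r2_score": [0.947458162, 0.928005473, 0.990599137,
                       0.662532128, 0.752241595, 0.744522334],
    "valid_r2_score": [-0.624665246, -0.777462993, 0.311849214,
                       0.262387135, -0.422993602, -0.203880075],
}
df = pd.DataFrame(data)

# Sort by target_name, then valid_mse ascending, then valid_r2_score descending.
sorted_df = df.sort_values(
    by=["target_name", "valid_mse", "valid_r2_score"],
    ascending=[True, True, False],
)

print(sorted_df)
```

sort_values returns a new DataFrame and leaves df unchanged. The original row labels are kept, which is why the output starts at index 2; chaining .reset_index(drop=True) renumbers the rows if that is preferred.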