Python Pandas filtering rows based on column value returns NaN
I have the following dictionary, which contains some hyperparameter settings and the corresponding results:
data_dict = {('Splits',): {0: 0, 1: 0, 2: 0, 3: 0},
('Weights',): {0: 'uniform', 1: 'uniform', 2: 'distance', 3: 'distance'},
('K_neighbors',): {0: 1, 1: 2, 2: 1, 3: 2},
('Accuracy',): {0: 0.69, 1: 0.721, 2: 0.69, 3: 0.713},
('AUROC',): {0: 0.558, 1: 0.524, 2: 0.558, 3: 0.532},
('Prec_0',): {0: 0.77, 1: 0.753, 2: 0.77, 3: 0.756},
('Prec_1',): {0: 0.368, 1: 0.369, 2: 0.368, 3: 0.366},
('Rec_0',): {0: 0.831, 1: 0.929, 2: 0.831, 3: 0.904},
('Rec_1',): {0: 0.285, 1: 0.119, 2: 0.285, 3: 0.159},
('f1_0',): {0: 0.799, 1: 0.832, 2: 0.799, 3: 0.824},
('f1_1',): {0: 0.321, 1: 0.18, 2: 0.321, 3: 0.222}}
I then convert this dictionary to a pandas DataFrame:
import pandas as pd
results = pd.DataFrame(data_dict)
which returns the following:
   Splits   Weights  K_neighbors  Accuracy  AUROC  Prec_0  Prec_1  Rec_0  Rec_1   f1_0   f1_1
0       0   uniform            1     0.690  0.558   0.770   0.368  0.831  0.285  0.799  0.321
1       0   uniform            2     0.721  0.524   0.753   0.369  0.929  0.119  0.832  0.180
2       0  distance            1     0.690  0.558   0.770   0.368  0.831  0.285  0.799  0.321
3       0  distance            2     0.713  0.532   0.756   0.366  0.904  0.159  0.824  0.222
Now I try to keep only the rows where the Weights hyperparameter equals 'uniform':
results[(results['Weights']=='uniform')]
However, the returned DataFrame keeps all four rows and has NaN everywhere except in the cells that match the filter:
   Splits   Weights  K_neighbors  Accuracy  AUROC  Prec_0  Prec_1  Rec_0  Rec_1   f1_0   f1_1
0     NaN   uniform          NaN       NaN    NaN     NaN     NaN    NaN    NaN    NaN    NaN
1     NaN   uniform          NaN       NaN    NaN     NaN     NaN    NaN    NaN    NaN    NaN
2     NaN       NaN          NaN       NaN    NaN     NaN     NaN    NaN    NaN    NaN    NaN
3     NaN       NaN          NaN       NaN    NaN     NaN     NaN    NaN    NaN    NaN    NaN
The output I expect from this code is:
   Splits   Weights  K_neighbors  Accuracy  AUROC  Prec_0  Prec_1  Rec_0  Rec_1   f1_0   f1_1
0       0   uniform            1     0.690  0.558   0.770   0.368  0.831  0.285  0.799  0.321
1       0   uniform            2     0.721  0.524   0.753   0.369  0.929  0.119  0.832  0.180
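The problem is that your columns are a MultiIndex: every key in the dictionary is a one-element tuple such as ('Weights',). With such columns, results['Weights'] == 'uniform' evidently does not produce a plain boolean Series, so the indexing masks individual values instead of selecting rows.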
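One way to see why the result is blanked out rather than filtered: when the key passed to a DataFrame is itself a boolean DataFrame instead of a boolean Series, pandas applies it cell by cell, much like DataFrame.where. The sketch below reproduces that on a small frame with ordinary column names. The frame demo and its values are made up for illustration, and it assumes that results['Weights'] under the tuple-keyed columns comes back as a one-column DataFrame, which the NaN output in the question suggests.

```python
import pandas as pd

# Hypothetical miniature of the results table, with ordinary (flat) column names.
demo = pd.DataFrame({
    "Weights": ["uniform", "uniform", "distance", "distance"],
    "K_neighbors": [1, 2, 1, 2],
})

# Boolean Series as the key: selects whole rows.
row_mask = demo["Weights"] == "uniform"
filtered = demo[row_mask]

# Boolean DataFrame as the key (note the double brackets): masks cell by cell,
# keeping the original shape and turning every cell not marked True into NaN.
cell_mask = demo[["Weights"]] == "uniform"
masked = demo[cell_mask]

print(filtered)
print(masked)
```

Here filtered has two rows, while masked keeps all four rows with NaN outside the matching Weights cells, the same pattern as in the question.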
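Rename the columns so that each one is a plain string, then filter:
results.columns = [col[0] for col in results.columns]
results[results['Weights']=='uniform']
After that, the filter should return only the two 'uniform' rows.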
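To confirm the diagnosis, it can help to inspect the column index before and after flattening. The sketch below is a minimal illustration, not the original data: the frame small is hypothetical, its single-level MultiIndex columns are built explicitly with pd.MultiIndex.from_tuples to stand in for the tuple-keyed dictionary, and get_level_values(0) is used as an alternative to the list comprehension above.

```python
import pandas as pd

# Build columns as a one-level MultiIndex, mimicking keys like ('Weights',).
cols = pd.MultiIndex.from_tuples([("Weights",), ("K_neighbors",)])
small = pd.DataFrame(
    [["uniform", 1], ["uniform", 2], ["distance", 1], ["distance", 2]],
    columns=cols,
)

# Check what kind of column index the frame has before touching it.
was_multiindex = isinstance(small.columns, pd.MultiIndex)
print(was_multiindex)

# Flatten to plain string labels by taking the only level.
small.columns = small.columns.get_level_values(0)

# With flat columns the comparison yields a boolean Series, so rows are selected.
uniform_rows = small[small["Weights"] == "uniform"]
print(uniform_rows)
```

If the column tuples had more than one element, taking only level 0 could produce duplicate names, so joining the levels into a single string would be a safer choice in that case.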