Python Pandas filtering rows based on column value returns NaN

I have the following dictionary, which contains some hyperparameter settings and the corresponding results:

data_dict = {('Splits',): {0: 0, 1: 0, 2: 0, 3: 0},
 ('Weights',): {0: 'uniform', 1: 'uniform', 2: 'distance', 3: 'distance'},
 ('K_neighbors',): {0: 1, 1: 2, 2: 1, 3: 2},
 ('Accuracy',): {0: 0.69, 1: 0.721, 2: 0.69, 3: 0.713},
 ('AUROC',): {0: 0.558, 1: 0.524, 2: 0.558, 3: 0.532},
 ('Prec_0',): {0: 0.77, 1: 0.753, 2: 0.77, 3: 0.756},
 ('Prec_1',): {0: 0.368, 1: 0.369, 2: 0.368, 3: 0.366},
 ('Rec_0',): {0: 0.831, 1: 0.929, 2: 0.831, 3: 0.904},
 ('Rec_1',): {0: 0.285, 1: 0.119, 2: 0.285, 3: 0.159},
 ('f1_0',): {0: 0.799, 1: 0.832, 2: 0.799, 3: 0.824},
 ('f1_1',): {0: 0.321, 1: 0.18, 2: 0.321, 3: 0.222}}

I then convert this dictionary to a pandas DataFrame:

import pandas as pd
results = pd.DataFrame(data_dict)

which returns the following:


   Splits   Weights  K_neighbors  Accuracy  AUROC  Prec_0  Prec_1  Rec_0  Rec_1   f1_0   f1_1
0       0   uniform            1     0.690  0.558   0.770   0.368  0.831  0.285  0.799  0.321
1       0   uniform            2     0.721  0.524   0.753   0.369  0.929  0.119  0.832  0.180
2       0  distance            1     0.690  0.558   0.770   0.368  0.831  0.285  0.799  0.321
3       0  distance            2     0.713  0.532   0.756   0.366  0.904  0.159  0.824  0.222

Now I try to keep only the rows where the hyperparameter Weights equals 'uniform':

results[(results['Weights']=='uniform')]

However, the returned DataFrame contains NaN everywhere except for the values that match the filter:

   Splits  Weights  K_neighbors  Accuracy  AUROC  Prec_0  Prec_1  Rec_0  Rec_1  f1_0  f1_1
0     NaN  uniform          NaN       NaN    NaN     NaN     NaN    NaN    NaN   NaN   NaN
1     NaN  uniform          NaN       NaN    NaN     NaN     NaN    NaN    NaN   NaN   NaN
2     NaN      NaN          NaN       NaN    NaN     NaN     NaN    NaN    NaN   NaN   NaN
3     NaN      NaN          NaN       NaN    NaN     NaN     NaN    NaN    NaN   NaN   NaN

The output I expect from the code is:


   Splits  Weights  K_neighbors  Accuracy  AUROC  Prec_0  Prec_1  Rec_0  Rec_1   f1_0   f1_1
0       0  uniform            1     0.690  0.558   0.770   0.368  0.831  0.285  0.799  0.321
1       0  uniform            2     0.721  0.524   0.753   0.369  0.929  0.119  0.832  0.180

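The cause is that your columns are a MultiIndex: the dictionary keys are one-element tuples such as ('Weights',), so pandas builds MultiIndex columns from them. With such columns, results['Weights'] gives back a DataFrame rather than a Series, and indexing with the resulting boolean DataFrame masks individual cells with NaN instead of selecting rows. Rename the columns to plain strings as follows: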

results.columns = [col[0] for col in results.columns]

results[results['Weights']=='uniform']

Then try the filter again; it should now return the expected rows.
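For reference, here is a minimal sketch that reproduces the situation and applies the rename. It assumes a reasonably recent pandas version. To keep it short it uses only the first four columns of the dictionary from the question. The names mask, flat and filtered are illustrative and do not appear in the original post.

```python
import pandas as pd

# Abbreviated version of the dictionary from the question: every key is a 1-tuple.
data_dict = {
    ('Splits',): {0: 0, 1: 0, 2: 0, 3: 0},
    ('Weights',): {0: 'uniform', 1: 'uniform', 2: 'distance', 3: 'distance'},
    ('K_neighbors',): {0: 1, 1: 2, 2: 1, 3: 2},
    ('Accuracy',): {0: 0.69, 1: 0.721, 2: 0.69, 3: 0.713},
}
results = pd.DataFrame(data_dict)

# Tuple keys lead pandas to build MultiIndex columns; print the type to check.
print(type(results.columns).__name__)

# With such columns, the comparison is expected to produce a boolean DataFrame
# (not a Series), which is why indexing with it masks cells instead of rows.
mask = results['Weights'] == 'uniform'
print(type(mask).__name__)

# Flatten the one-element tuples into plain string labels (work on a copy
# so the original frame stays available for comparison).
flat = results.copy()
flat.columns = [col[0] for col in flat.columns]

# The comparison now yields a boolean Series, so whole rows are selected.
filtered = flat[flat['Weights'] == 'uniform']
print(filtered)
```

If you control how the dictionary is built, using plain strings such as 'Weights' as keys instead of ('Weights',) avoids the MultiIndex in the first place, so no rename is needed.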