Columns greater than a threshold

Question

How can I retrieve the columns in which a value < threshold occurs at least once?

For example:

THRESHOLD = 0

print(df)

Col_1  Col_2  Col_3   Col_4
   1     3      5      -9
   1     3      5      -9
   1    -2      5      -9

print(final_df)

  Col_2    Col_4
     3      -9
     3      -9
    -2      -9

I tried:

df[(df < 0).any(1)]

But that returns the rows, not the columns, in which at least one element < 0 appears.

Answer 1

Use axis=0 together with .loc

df.loc[:,(df < 0).any(0)]
Out[215]: 
   Col_2  Col_4
0      3     -9
1      3     -9
2     -2     -9
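For reference, here is a minimal self-contained sketch of the same selection using the THRESHOLD variable from the question. The DataFrame is rebuilt from the question's sample data, and the name below is only illustrative. It passes axis=0 as a keyword, on the assumption that newer pandas releases may not accept the positional form any(0).

```python
import pandas as pd

THRESHOLD = 0

# Sample data copied from the question.
df = pd.DataFrame({
    "Col_1": [1, 1, 1],
    "Col_2": [3, 3, -2],
    "Col_3": [5, 5, 5],
    "Col_4": [-9, -9, -9],
})

# Boolean Series indexed by column name:
# True where at least one value in that column is below THRESHOLD.
below = (df < THRESHOLD).any(axis=0)

# Keep every row, but only the flagged columns.
final_df = df.loc[:, below]
print(final_df)
```

Changing THRESHOLD changes which columns are kept; the rows are never filtered.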

Or use .iloc with nonzero

df.iloc[:,(df<0).any().nonzero()[0]]
Out[230]: 
   Col_2  Col_4
0      3     -9
1      3     -9
2     -2     -9
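Note that Series.nonzero was removed in pandas 1.0, so the .iloc variant above only runs on older versions. A rough equivalent for current versions, assuming NumPy is available, is to take the positions from the underlying array (the names col_mask, positions and selected are just for illustration):

```python
import numpy as np
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Col_1": [1, 1, 1],
    "Col_2": [3, 3, -2],
    "Col_3": [5, 5, 5],
    "Col_4": [-9, -9, -9],
})

# One boolean per column: does the column contain a negative value?
col_mask = (df < 0).any(axis=0)

# Integer positions of the True entries, taken from the NumPy array.
positions = np.flatnonzero(col_mask.to_numpy())

# Select those columns by position.
selected = df.iloc[:, positions]
print(selected)
```

The .loc version with the boolean mask is usually simpler; the positional form is mainly useful when you need the column indices themselves.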

Answer 2

You can use df.loc[:, (df < 0).any(0)].

>>> df                                                                                                                       
   Col_1  Col_2  Col_3  Col_4
0      1      3      5     -9
1      1      3      5     -9
2      1     -2      5     -9
>>>
>>> df.loc[:, (df < 0).any(0)] 
   Col_2  Col_4
0      3     -9
1      3     -9
2     -2     -9

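Details:

(df < 0).any(0) tells you which columns contain a value less than zero, because any(0) reduces along the rows, producing one True/False result per column.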

>>> df < 0                                                                                                                    
   Col_1  Col_2  Col_3  Col_4
0  False  False  False   True
1  False  False  False   True
2  False   True  False   True
>>>
>>> (df < 0).any(0)                                                                                                            
Col_1    False
Col_2     True
Col_3    False
Col_4     True
dtype: bool

Then df.loc[:, (df < 0).any(0)] uses boolean indexing to select all rows and only those columns for which (df < 0).any(0) is True.
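To see why the attempt in the question returned rows instead, it may help to compare the two axes side by side. This is a small sketch on the question's sample data; the variable names are only for illustration.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Col_1": [1, 1, 1],
    "Col_2": [3, 3, -2],
    "Col_3": [5, 5, 5],
    "Col_4": [-9, -9, -9],
})

negative = df < 0

# axis=0 collapses the rows: one flag per column.
col_mask = negative.any(axis=0)

# axis=1 collapses the columns: one flag per row.
row_mask = negative.any(axis=1)

# A column mask goes in the second slot of .loc.
cols_with_negative = df.loc[:, col_mask]

# A row mask filters rows, which is what df[(df < 0).any(1)] was doing.
rows_with_negative = df[row_mask]
```

With this data every row contains a negative value (Col_4 is -9 throughout), so the row filter hands back the whole frame, while the column mask keeps only Col_2 and Col_4.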
