Filter DataFrame to rows with 2+ True elements

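For example, using

df[(df>1).any(axis=1)]

I can get the rows where any element is greater than 1. But how do I get the rows where at least 2 elements are greater than 1? Thanks.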

Try this:

df[(df>1).sum(1).gt(1)]
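
Why this works: `df>1` produces a boolean DataFrame, and summing along axis 1 counts the True values in each row, because True counts as 1. Keeping the rows where that count is greater than 1 is the same as asking for at least 2 matches. A minimal sketch on a small hand-made frame (the values and the column names `a`, `b`, `c` are made up for illustration):

```python
import pandas as pd

# Hypothetical toy data, chosen so the counts are easy to check by eye.
df = pd.DataFrame({"a": [0.5, 1.5, 2.0],
                   "b": [0.2, 0.3, 3.0],
                   "c": [1.2, 1.1, 0.1]})

mask = df > 1                 # boolean DataFrame, True where the element exceeds 1
counts = mask.sum(axis=1)     # True counts as 1, so this is the per-row count
result = df[counts.gt(1)]     # keep rows with more than one True, i.e. at least 2

print(counts.tolist())        # [1, 2, 2]
print(result.index.tolist())  # [1, 2]
```

Row 0 has only one value above 1 (`c`), so it is dropped. Rows 1 and 2 each have two, so they are kept.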

Demo:

import string
import numpy as np
import pandas as pd

In [118]: df = pd.DataFrame(np.random.rand(10,10)*1.2, columns=list(string.ascii_letters[:10]))

In [119]: df
Out[119]:
          a         b         c         d         e         f         g         h         i         j
0  0.934290  0.426050  0.165846  1.114521  1.101023  0.924071  0.241893  0.890354  1.168406  0.506547
1  0.576869  1.091996  0.272124  0.834070  0.229545  0.585501  1.114688  0.957817  1.151957  0.761277
2  0.016659  1.138262  0.481773  0.186753  0.176585  0.497437  0.321805  0.664140  0.738851  0.177179
3  0.192605  0.395377  0.950169  0.678960  0.525349  0.050877  0.181615  0.105080  0.385672  0.401810
4  1.184054  1.097378  0.197706  0.453395  0.258631  1.088337  0.139201  0.217262  0.369734  1.054716
5  0.246081  0.234748  0.879371  0.198397  0.288288  0.534848  0.561080  0.732490  0.156947  0.662194
6  0.660215  0.221513  0.224576  0.049425  0.339101  0.441393  1.122385  0.057968  1.094025  1.130691
7  0.022977  0.681718  0.314200  0.622263  0.692124  0.803743  0.783381  0.715494  0.434911  0.247724
8  0.815742  0.419933  0.019704  0.764557  0.074530  0.990639  0.801125  0.403838  0.680618  1.043551
9  1.061915  0.229453  0.446562  0.324415  0.121421  0.270542  0.884124  0.926168  0.282650  0.267467

In [120]: df[(df>1).sum(1).gt(1)]
Out[120]:
          a         b         c         d         e         f         g         h         i         j
0  0.934290  0.426050  0.165846  1.114521  1.101023  0.924071  0.241893  0.890354  1.168406  0.506547
1  0.576869  1.091996  0.272124  0.834070  0.229545  0.585501  1.114688  0.957817  1.151957  0.761277
4  1.184054  1.097378  0.197706  0.453395  0.258631  1.088337  0.139201  0.217262  0.369734  1.054716
6  0.660215  0.221513  0.224576  0.049425  0.339101  0.441393  1.122385  0.057968  1.094025  1.130691
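
If the cutoff value or the required count changes often, the same expression can be wrapped in a small function. This is only a sketch: `rows_with_at_least` is a hypothetical name, not something pandas provides, and the seeded random data is just there to make the run repeatable.

```python
import numpy as np
import pandas as pd

def rows_with_at_least(df, threshold, k):
    """Return the rows of df with at least k elements greater than threshold.

    Hypothetical helper for illustration, not part of pandas.
    """
    # Comparisons against NaN are False, so missing values never count.
    return df[(df > threshold).sum(axis=1).ge(k)]

rng = np.random.default_rng(0)  # seeded so the data is the same on every run
df = pd.DataFrame(rng.random((10, 10)) * 1.2, columns=list("abcdefghij"))

at_least_2 = rows_with_at_least(df, 1, 2)  # same selection as .sum(axis=1).gt(1)
at_least_3 = rows_with_at_least(df, 1, 3)  # stricter: 3 or more elements above 1
print(at_least_2.index.tolist())
print(at_least_3.index.tolist())
```

Using `.ge(k)` instead of `.gt(k - 1)` keeps the "at least k" intent visible in the code; for integer counts the two are equivalent.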