Querying a pandas DataFrame to filter rows where a column is not NaN
I'm new to Python and am using pandas.
I want to query a DataFrame and filter the rows where one of the columns is not NaN.
I tried:
a=dictionarydf.label.isnull()
But a is filled with True or False values.
I also tried this:
dictionarydf.query(dictionarydf.label.isnull())
But, as I expected, it raised an error.
Sample data:
reference_word all_matching_words label review
0 account fees - account NaN N
1 account mobile - account NaN N
2 account monthly - account NaN N
3 administration delivery - administration NaN N
4 administration fund - administration NaN N
5 advisor fees - advisor NaN N
6 advisor optimum - advisor NaN N
7 advisor sub - advisor NaN N
8 aichi delivery - aichi NaN N
9 aichi pref - aichi NaN N
10 airport biz - airport travel N
11 airport cfo - airport travel N
12 airport cfomtg - airport travel N
13 airport meeting - airport travel N
14 airport summit - airport travel N
15 airport taxi - airport travel N
16 airport train - airport travel N
17 airport transfer - airport travel N
18 airport trip - airport travel N
19 ais admin - ais NaN N
20 ais alpine - ais NaN N
21 ais fund - ais NaN N
22 allegiance custody - allegiance NaN N
23 allegiance fees - allegiance NaN N
24 alpha late - alpha NaN N
25 alpha meal - alpha NaN N
26 alpha taxi - alpha NaN N
27 alpine admin - alpine NaN N
28 alpine ais - alpine NaN N
29 alpine fund - alpine NaN N
I want to keep only the rows where label is not NaN.
Expected output:
reference_word all_matching_words label review
0 airport biz - airport travel N
1 airport cfo - airport travel N
2 airport cfomtg - airport travel N
3 airport meeting - airport travel N
4 airport summit - airport travel N
5 airport taxi - airport travel N
6 airport train - airport travel N
7 airport transfer - airport travel N
8 airport trip - airport travel N
You can use dropna:
df = df.dropna(subset=['label'])
print(df)
reference_word all_matching_words label review
10 airport biz - airport travel N
11 airport cfo - airport travel N
12 airport cfomtg - airport travel N
13 airport meeting - airport travel N
14 airport summit - airport travel N
15 airport taxi - airport travel N
16 airport train - airport travel N
17 airport transfer - airport travel N
18 airport trip - airport travel N
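Note that the filtered rows keep their original index (10 to 18), while the expected output in the question is numbered from 0. A minimal sketch of one way to get that numbering, assuming a small hypothetical stand-in for the sample data (the four rows below are made up for illustration, not the asker's full frame), is to chain reset_index(drop=True) after dropna:

```python
import numpy as np
import pandas as pd

# Hypothetical miniature of the sample data from the question
df = pd.DataFrame({
    "reference_word": ["account", "airport", "airport", "ais"],
    "all_matching_words": ["fees - account", "biz - airport",
                           "cfo - airport", "admin - ais"],
    "label": [np.nan, "travel", "travel", np.nan],
    "review": ["N", "N", "N", "N"],
})

# Drop rows whose label is NaN, then renumber the index from 0.
# drop=True discards the old index instead of adding it as a column.
filtered = df.dropna(subset=["label"]).reset_index(drop=True)
print(filtered)
```

Leaving out reset_index is fine if the original row positions are still useful, for example to join the result back to the source frame.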
Another solution is boolean indexing with notnull:
df = df[df.label.notnull()]
print(df)
reference_word all_matching_words label review
10 airport biz - airport travel N
11 airport cfo - airport travel N
12 airport cfomtg - airport travel N
13 airport meeting - airport travel N
14 airport summit - airport travel N
15 airport taxi - airport travel N
16 airport train - airport travel N
17 airport transfer - airport travel N
18 airport trip - airport travel N
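As for the query attempt in the question: DataFrame.query takes a string expression, not a boolean Series, which is likely why passing dictionarydf.label.isnull() directly raised an error. The sketch below shows two string expressions that should give the same result as the boolean mask. It uses a small hypothetical frame rather than the asker's data, and it passes engine="python" explicitly as a conservative choice so the example does not depend on numexpr being installed.

```python
import numpy as np
import pandas as pd

# Hypothetical miniature of the sample data from the question
df = pd.DataFrame({
    "reference_word": ["account", "airport", "airport", "ais"],
    "all_matching_words": ["fees - account", "biz - airport",
                           "cfo - airport", "admin - ais"],
    "label": [np.nan, "travel", "travel", np.nan],
    "review": ["N", "N", "N", "N"],
})

# NaN never compares equal to itself, so "label == label"
# is False exactly on the rows where label is NaN.
by_self_compare = df.query("label == label", engine="python")

# Calling a Series method inside the expression string.
by_notnull = df.query("label.notnull()", engine="python")

# Reference result using plain boolean indexing.
by_mask = df[df.label.notnull()]

print(by_self_compare)
```

For a single-column NaN filter, boolean indexing or dropna is usually the more readable choice; query tends to pay off when several conditions are combined in one expression.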