Python Pandas: Drop rows from data frame if list of string value == [none]
My data frame has a column that contains lists of values.
Tags
[marvel, comics, comic, books, nerdy]
[new, snapchat, version, snap, inc]
[none]
[new, york, times, ny, times, nyt, times]
[today, show, today, show, today]
[none]
[mark, wahlberg, marky, mark]
I can't figure out how to remove these [none] lists from the data frame. I tried:
us_videos = us_videos.drop(us_videos.index[us_videos.tags == 'none'])
but this only works if I convert the column to strings. How can I do this?
First, let's write a function that removes 'none' from a list:
print(df)
tags
0 [marvel, comics, comic, books, nerdy]
1 [new, snapchat, version, snap, inc]
2 [none]
3 [new, york, times, ny, times, nyt, times]
4 [today, show, today, show, today, none]
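import numpy as np

def delete_none(element):
    # Collect every value except the placeholder 'none'
    new = []
    for val in element:
        if val != 'none':
            new.append(val)
    # An empty result means the list held only 'none'
    if len(new) == 0:
        return np.nan
    else:
        return new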
Now we apply this function to the tags column:
df.tags.apply(delete_none)
Output:
0 [marvel, comics, comic, books, nerdy]
1 [new, snapchat, version, snap, inc]
2 NaN
3 [new, york, times, ny, times, nyt, times]
4 [today, show, today, show, today]
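The output above still contains the NaN row, so one more step is needed to actually drop it. A minimal sketch, assuming a small sample frame modeled on the question (the data and the `cleaned` name are made up for illustration) and assuming `dropna` is an acceptable follow-up; it is not part of the original answer:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data modeled on the question.
df = pd.DataFrame({
    "tags": [
        ["marvel", "comics", "comic", "books", "nerdy"],
        ["new", "snapchat", "version", "snap", "inc"],
        ["none"],
        ["today", "show", "today", "none"],
    ]
})

def delete_none(element):
    """Return the list without 'none', or NaN if nothing is left."""
    new = [val for val in element if val != "none"]
    return new if new else np.nan

# Clean each list, then drop the rows whose list became NaN.
df["tags"] = df["tags"].apply(delete_none)
cleaned = df.dropna(subset=["tags"])
print(cleaned)
```

The original index labels are kept, so `cleaned.reset_index(drop=True)` can be added if a contiguous index is wanted.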
New answer
OP wants to remove 'none' from the sublists and drop the rows that contain only 'none'.
us_videos.tags.explode().pipe(lambda s: s[s != 'none']).groupby(level=0).agg(list)
0 [marvel, comics, comic, books, nerdy]
1 [new, snapchat, version, snap, inc]
3 [new, york, times, ny, times, nyt, times]
4 [today, show, today, show, today]
6 [mark, wahlberg, marky, mark]
Name: tags, dtype: object
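The one-liner above can be easier to follow when split into steps. A sketch on made-up sample data (the frame contents and the intermediate names `exploded`, `kept`, and `result` are assumptions for illustration):

```python
import pandas as pd

# Hypothetical sample data modeled on the question.
us_videos = pd.DataFrame({
    "tags": [
        ["marvel", "comics"],
        ["none"],
        ["today", "show", "none"],
        ["none"],
        ["mark", "wahlberg"],
    ]
})

# 1. One row per tag; the original row label is repeated in the index.
exploded = us_videos["tags"].explode()

# 2. Drop every 'none' entry. Rows that held only 'none' vanish entirely.
kept = exploded[exploded != "none"]

# 3. Regroup the surviving tags by their original row label.
result = kept.groupby(level=0).agg(list)
print(result)
```

Because rows made up only of 'none' leave no entries after step 2, they never reappear in step 3, which is why no explicit drop is needed.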
A more Pythonic way
dat = {}
# Series.iteritems() was removed in pandas 2.0; items() is the replacement
for k, v in us_videos.tags.items():
    for x in v:
        if x != 'none':
            dat.setdefault(k, []).append(x)
pd.Series(dat, name='tags')
0 [marvel, comics, comic, books, nerdy]
1 [new, snapchat, version, snap, inc]
3 [new, york, times, ny, times, nyt, times]
4 [today, show, today, show, today]
6 [mark, wahlberg, marky, mark]
Name: tags, dtype: object
Comprehension with an assignment expression (Python 3.8+)
pd.Series({
    k: X for k, v in us_videos.tags.items()
    if (X := [*filter('none'.__ne__, v)])
}, name='tags')
0 [marvel, comics, comic, books, nerdy]
1 [new, snapchat, version, snap, inc]
3 [new, york, times, ny, times, nyt, times]
4 [today, show, today, show, today]
6 [mark, wahlberg, marky, mark]
Name: tags, dtype: object
Old answer
explode
# any(level=0) was removed in pandas 2.0; group by the index level instead
us_videos[us_videos.tags.explode().ne('none').groupby(level=0).any()]
tags
0 [marvel, comics, comic, books, nerdy]
1 [new, snapchat, version, snap, inc]
3 [new, york, times, ny, times, nyt, times]
4 [today, show, today, show, today]
6 [mark, wahlberg, marky, mark]
list.__ne__
us_videos[us_videos.tags.map(['none'].__ne__)]
tags
0 [marvel, comics, comic, books, nerdy]
1 [new, snapchat, version, snap, inc]
3 [new, york, times, ny, times, nyt, times]
4 [today, show, today, show, today]
6 [mark, wahlberg, marky, mark]
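The two masks in the old answer are not equivalent, which matters if a list can hold 'none' more than once. A small comparison on made-up data (the frame and the names `exact` and `any_real` are assumptions for illustration):

```python
import pandas as pd

# Hypothetical sample data; the last row holds 'none' twice.
us_videos = pd.DataFrame({
    "tags": [
        ["marvel", "comics"],
        ["none"],
        ["today", "show", "none"],
        ["none", "none"],
    ]
})

# Mask 1: True unless the list is exactly ['none'].
exact = us_videos["tags"].map(["none"].__ne__)

# Mask 2: True if the list has at least one tag other than 'none'.
any_real = us_videos["tags"].explode().ne("none").groupby(level=0).any()

print(us_videos[exact])     # keeps the ['none', 'none'] row
print(us_videos[any_real])  # drops it
```

Both masks only select rows. Neither removes a stray 'none' from inside a list that is kept.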