How can I show specific words in a data set?

Question

I have just started learning Python. I have a question about matching some words in my data set from Excel.

words_list contains some words that I want to find in the data set.

words_list = ('tried','mobile','abc')

df is an extract from Excel, with a single column selected.

df =

0        to make it possible or easier for someone to do ...  
1        unable to acquire a buffer item very likely ...  
2        The organization has tried to make...  
3        Broadway tried a variety of mobile Phone for the..

I would like to get a result like this:

'None',
'None',
'tried',
'tried','mobile'

I tried this in Jupyter:

list = [ ]
for word in df: 
    if any(aa in word for aa in words_list):
        list.append(word) 
    else:
        list.append('None')

print(list)

But the result shows the whole sentence from df:

'None'  
'None'  
'The organization has tried to make...'  
'Broadway tried a variety of mobile Phone for the..'

Can I show only the words from the word list in the result? Sorry for my English, and thank you all.

Answer 1

I suggest operating on the DataFrame itself (this should always be your first thought: use the power of pandas).

import pandas as pd

words_list = {'tried', 'mobile', 'abc'}

df = pd.DataFrame({'col': ['to make it possible or easier for someone to do',
                           'unable to acquire a buffer item very likely',
                           'The organization has tried to make',
                           'Broadway tried a variety of mobile Phone for the']})

df['matches'] = df['col'].str.split().apply(lambda x: set(x) & words_list)
print(df)


                                                col          matches
0   to make it possible or easier for someone to do               {}
1       unable to acquire a buffer item very likely               {}
2                The organization has tried to make          {tried}
3  Broadway tried a variety of mobile Phone for the  {mobile, tried}
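
The question asked for 'None' on rows without a match rather than an empty set. A minimal sketch of one way to get that shape, assuming the same sample data; the helper name find_words is hypothetical and not part of pandas:

```python
import pandas as pd

words_list = ('tried', 'mobile', 'abc')

df = pd.DataFrame({'col': ['to make it possible or easier for someone to do',
                           'unable to acquire a buffer item very likely',
                           'The organization has tried to make',
                           'Broadway tried a variety of mobile Phone for the']})

def find_words(sentence, wanted=words_list):
    """Hypothetical helper: wanted words found in sentence, or 'None'."""
    tokens = set(sentence.split())             # whitespace-separated words
    hits = [w for w in wanted if w in tokens]  # keeps the order of words_list
    return hits if hits else 'None'

df['matches'] = df['col'].apply(find_words)
result = df['matches'].tolist()
print(result)
```

Iterating over words_list instead of intersecting sets keeps the matches in a predictable order, which a set does not guarantee.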

Answer 2

The reason it prints the whole line has to do with this line of yours:

for word in df:

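Your "word" variable actually holds the whole line. The code then checks the whole line to see whether it contains one of your search words. If it does, it basically says, "Yes, I found ____ in this line, so add the line to your list."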

It sounds like what you want to do is split the line into words first and then check.

matches = []  # avoid the name "list", which would shadow the built-in

for line in df:
    words = line.split(" ")  # break the sentence into individual words
    found = False
    for word in words_list:
        if word in words:
            found = True
            matches.append(word)
    # append "None" if no search word was found in this line
    if not found:
        matches.append("None")

print(matches)
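
The loop above collects everything into one flat list. If one entry per row is wanted, as in the question, something like the following sketch might work. It assumes the column has been turned into a plain list of strings (lines here is a stand-in), and the helper words_in_line is a hypothetical name. It also lowercases the text and strips punctuation, which goes beyond the original code:

```python
import re

words_list = ('tried', 'mobile', 'abc')

# Stand-in for the column; with pandas this could be df['col'].tolist()
lines = [
    'to make it possible or easier for someone to do',
    'unable to acquire a buffer item very likely',
    'The organization has tried to make',
    'Broadway tried a variety of mobile Phone for the',
]

def words_in_line(line, wanted):
    # Lowercase, then keep runs of letters and apostrophes,
    # so "Mobile," would still count as "mobile"
    tokens = set(re.findall(r"[a-z']+", line.lower()))
    return [w for w in wanted if w in tokens]

# An empty list is falsy, so "or" substitutes 'None' for rows with no match
per_row = [words_in_line(line, words_list) or 'None' for line in lines]
print(per_row)
```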

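As a side note, when working with lists you may want to use pprint instead of print. It prints lists, dictionaries, and so on in a layout that is easier to read. pprint is part of Python's standard library, so no separate installation should be needed. Usage looks like this: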

from pprint import pprint

dictionary = {'firstkey':'firstval','secondkey':'secondval','thirdkey':'thirdval'}

pprint(dictionary)
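
pprint only changes the layout when the value does not fit on one line. A small sketch with a nested list shaped like the result in the question; the sample data and the width value are assumptions chosen to force wrapping:

```python
from pprint import pformat, pprint

per_row = ['None', 'None', ['tried'], ['tried', 'mobile']]

# A narrow width forces pprint to put each top-level item on its own line
pprint(per_row, width=30)

# pformat returns the same layout as a string instead of printing it
text = pformat(per_row, width=30)
```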
