Search for words inside a csv based on a dictionary

Suppose I have a dictionary csv like this:

fruit   vegetable   meat
banana  broccoli    beef
apple   carrot      chicken
orange  corn        pork
mango   NaN         NaN
coconut NaN         NaN

and another csv like this:

sentences
Today I ate some beef.
Corn is tasty.
I drank some coconut water.

I am trying to match the strings in the sentences csv against the strings in the dictionary in order to classify them:

sentences                   food  
Today I ate some beef.      meat 
Corn is tasty.              vegetable
I drank some coconut water. fruit

What do I need to do to produce that output? Should I remove the NaN values for this to work, or can they be ignored?

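There are many ways to do this; pick whichever you like. Here is one using nested for loops. I'm fairly sure you could also do it recursively or even with a list comprehension.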

fruit = ["banana", "apple", "orange", "mango", "coconut"]
meat = ["beef", "chicken", "pork"]
vegetable = ["broccoli", "carrot", "corn"]  # NaN entries are simply left out
classes = {"fruit": fruit, "meat": meat, "vegetable": vegetable}
# Now you have your 'classes' of strings, keyed by class name.

sentences = ["Today I ate some beef.", "Corn is tasty.", "I drank some coconut water."]  # your list of sentences
output = []
for sentence in sentences:  # for each sentence
    for class_name, foods in classes.items():  # we check each class
        for food in foods:  # we check each food of that class
            if food in sentence:  # case-sensitive substring match
                # pair the sentence with the class name in a tuple
                output.append((sentence, class_name))

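This only matches when the string is exact (it is case-sensitive), so "Corn is tasty." will not match "corn" unless you lowercase both sides first. There are also caveats that can produce the wrong class, for example if you have both "straw" and "strawberry": a plain substring check finds "straw" inside "strawberry".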
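One possible way around both caveats is to match whole words with a regular expression and ignore case. The following is only a sketch under that assumption: the `classify` helper and the `classes` dict layout are names made up for illustration, and the food lists are copied from the example dictionary in the question.

```python
import re

# Class names and foods copied from the example dictionary in the question.
classes = {
    "fruit": ["banana", "apple", "orange", "mango", "coconut"],
    "vegetable": ["broccoli", "carrot", "corn"],
    "meat": ["beef", "chicken", "pork"],
}


def classify(sentence, classes):
    """Return the names of all classes that have a food appearing as a whole word."""
    found = []
    for class_name, foods in classes.items():
        # \b anchors the match at word boundaries, so "straw" would not match
        # inside "strawberry"; re.escape protects foods containing regex
        # metacharacters.
        pattern = r"\b(?:" + "|".join(re.escape(food) for food in foods) + r")\b"
        if re.search(pattern, sentence, flags=re.IGNORECASE):
            found.append(class_name)
    return found


sentences = ["Today I ate some beef.", "Corn is tasty.", "I drank some coconut water."]
results = [(sentence, classify(sentence, classes)) for sentence in sentences]
```

Each sentence is paired with a list of class names rather than a single one, since a sentence may mention foods from more than one class; an empty list means nothing matched. Plural forms such as "carrots" would still be missed by this approach.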

Also check out this link on how to read a csv into lists.
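As a rough illustration of that step, the dictionary csv could be read into one list per column with the standard `csv` module, skipping the empty cells (the ones shown as NaN in the question). This sketch assumes a comma-separated file with a header row; the in-memory string stands in for the real file, and the file name in the comment is hypothetical.

```python
import csv
import io

# Stand-in for the dictionary file. With a real file you would use
# open("dictionary.csv", newline="") instead (the file name is hypothetical).
dictionary_csv = io.StringIO(
    "fruit,vegetable,meat\n"
    "banana,broccoli,beef\n"
    "apple,carrot,chicken\n"
    "orange,corn,pork\n"
    "mango,,\n"
    "coconut,,\n"
)

classes = {}
for row in csv.DictReader(dictionary_csv):
    for class_name, food in row.items():
        # Shorter columns appear as empty cells; skip them, along with a
        # literal "NaN" in case the file was exported with that placeholder.
        if food and food.strip() and food.strip() != "NaN":
            classes.setdefault(class_name, []).append(food.strip())
```

So the NaN cells do not need to be removed from the file beforehand; they only have to be skipped while the lists are built.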