How do I create a new column that counts how many times words from a predefined list appear in a text column of a DataFrame?

Question

I want to build a new column that contains the number of times words from the ai_functional list occur in the text column.

The list is:

> ai_functional = ["natural language processing", "nlp", "A I ", "Aritificial intelligence", "stemming", "lemmatization", "lemmatization", "information extraction", "text mining", "text analytics", "data-mining"]

The result I want looks like this:

> text                                                                     counter
> 
> 1. More details  A I   Artificial Intelligence                             2
> 2. NLP works very well these days                                          1         
> 3. receiving information at the right time                                 1

The code I have been using is:

def func(stringans):
  for x in ai_tech:
    count = stringans.count(x)
  
  return count

df['counter']=df['text'].apply(func)

Can anyone help me with this? I'm really stuck, because every time I apply it I get 0 in the counter column.

Answer 1

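Because you write count = inside the loop, each iteration overwrites the previous value, so the function returns only the count for the last item in the list. You want to add up the counts for all items instead: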

def func(stringans):
    count = 0
    for x in ai_tech:
        count += stringans.count(x)
    return count

# with sum and generator 
def func(stringans):
    return sum(stringans.count(x) for x in ai_tech)
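
To make the difference concrete, here is a minimal sketch that runs both versions on one string. The shortened terms list, the sample text, and the function names last_count_only and total_count are made up for illustration; they are not from the question.

```python
# Hypothetical, shortened term list and sample text, for illustration only.
terms = ["nlp", "stemming", "data-mining"]
text = "nlp and stemming, then more nlp"

def last_count_only(s):
    # Mirrors the loop from the question: every pass overwrites count,
    # so only the count of the last term ("data-mining") survives.
    count = 0
    for term in terms:
        count = s.count(term)
    return count

def total_count(s):
    # Adds up the counts of every term in the list.
    return sum(s.count(term) for term in terms)

print(last_count_only(text))
print(total_count(text))
```

Here the first function reports only how often "data-mining" occurs, while the second adds the two "nlp" hits and the one "stemming" hit.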

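After fixing some typos in ai_tech and lowercasing both sides with .lower(), the counter column comes out as 2, 1, 0. The last row has no term in common with the list.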

import pandas as pd

ai_tech = ["natural language processing", "nlp", "A I ", "Artificial intelligence",
           "stemming", "lemmatization", "information extraction",
           "text mining", "text analytics", "data - mining"]

df = pd.DataFrame([["1. More details  A I   Artificial Intelligence"], ["2. NLP works very well these days"],
                   ["3. receiving information at the right time"]], columns=["text"])

def func(stringans):
    return sum(stringans.lower().count(x.lower()) for x in ai_tech)

df['counter'] = df['text'].apply(func)
print(df)

# ------------------
                                             text  counter
0  1. More details  A I   Artificial Intelligence        2
1               2. NLP works very well these days        1
2      3. receiving information at the right time        0
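
One caveat with str.count is that it matches substrings, so a term such as "a i " would also be counted inside "media I follow". If whole-word matches are wanted, a possible variation (not part of the original answer) is a single case-insensitive regular expression with word boundaries. The sketch below assumes a shortened list and a hypothetical helper name count_terms, and uses only the standard re module. Unlike the sum of per-term counts, findall counts non-overlapping matches.

```python
import re

# Shortened, lowercased list for illustration; trailing spaces are dropped
# because \b already enforces the word boundary.
ai_tech = ["natural language processing", "nlp", "a i", "artificial intelligence"]

# Longest terms first, so the alternation prefers the longer phrase
# when two terms could start at the same position.
alternatives = "|".join(re.escape(t) for t in sorted(ai_tech, key=len, reverse=True))
pattern = re.compile(r"\b(?:" + alternatives + r")\b", flags=re.IGNORECASE)

def count_terms(text):
    # Number of non-overlapping whole-word matches of any term.
    return len(pattern.findall(text))

print(count_terms("1. More details  A I   Artificial Intelligence"))
print(count_terms("Social media I follow"))
```

With the DataFrame above, this could be applied the same way as before: df['counter'] = df['text'].apply(count_terms).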

Tags: python, counter, data-manipulation, dataframe, pandas