How to create a new column that counts how many times words from a predefined list appear in a text column of a DataFrame?
I want to build a new column containing the number of times words from the ai_functional list appear in the text column.
The list is:
> ai_functional = ["natural language processing","nlp","A I ","Aritificial intelligence",
> "stemming","lemmatization","lemmatization","information extraction",
> "text mining","text analytics","data-mining"]
The result I want looks like this:
> text counter
>
> 1. More details A I Artificial Intelligence 2
> 2. NLP works very well these days 1
> 3. receiving information at the right time 1
The code I have been using is:
def func(stringans):
    for x in ai_tech:
        count = stringans.count(x)
    return count

df['counter'] = df['text'].apply(func)
Can someone help me with this? I'm really stuck, because every time I apply it I get 0 in the counter column.
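Because you assign with count = on every iteration, you overwrite the previous value, so only the count for the last item in the list survives. What you want is to sum the individual counts: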
def func(stringans):
    count = 0
    for x in ai_tech:
        count += stringans.count(x)
    return count

# with sum and generator
def func(stringans):
    return sum(stringans.count(x) for x in ai_tech)
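To make the difference concrete, here is a minimal sketch comparing the two behaviours on a single string. The shortened term list, the sample text, and the names last_only and total are hypothetical and chosen only for illustration; they are not part of the original question.

```python
# Hypothetical, shortened term list; the last term does not occur in the text.
ai_tech = ["nlp", "stemming", "data-mining"]
text = "nlp and stemming"

def last_only(s):
    # Mirrors the loop from the question: count is reassigned on every pass,
    # so the returned value is the count for the final term only.
    for x in ai_tech:
        count = s.count(x)
    return count

def total(s):
    # Accumulates the counts of all terms.
    return sum(s.count(x) for x in ai_tech)

last_result = last_only(text)   # count of "data-mining" only
total_result = total(text)      # counts of all three terms added together
print(last_result, total_result)
```

Since the final term never appears in the text, the first version reports 0 even though two other terms match, which is consistent with the column of zeros described in the question.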
After fixing some typos in ai_tech and lowercasing everything with .lower(), the counter column gives 2, 1, 0. The last row has no term in common with the list.
import pandas as pd

ai_tech = ["natural language processing", "nlp", "A I ", "Artificial intelligence",
           "stemming", "lemmatization", "information extraction",
           "text mining", "text analytics", "data - mining"]

df = pd.DataFrame([["1. More details A I Artificial Intelligence"],
                   ["2. NLP works very well these days"],
                   ["3. receiving information at the right time"]], columns=["text"])

def func(stringans):
    return sum(stringans.lower().count(x.lower()) for x in ai_tech)

df['counter'] = df['text'].apply(func)
print(df)
# ------------------
                                          text  counter
0  1. More details A I Artificial Intelligence        2
1            2. NLP works very well these days        1
2   3. receiving information at the right time        0
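One caveat with str.count is that it matches substrings, so a term such as "nlp" would also be counted inside a longer word. If whole-term matching is what you need, a possible variant is sketched below using the standard re module. The helper names build_pattern and count_terms are assumptions introduced here, not part of the answer above, and stripping the trailing space from "A I " is also an assumption about the intended match.

```python
import re

# Term list from the answer above.
ai_tech = ["natural language processing", "nlp", "A I ", "Artificial intelligence",
           "stemming", "lemmatization", "information extraction",
           "text mining", "text analytics", "data - mining"]

def build_pattern(terms):
    # Strip surrounding whitespace, drop duplicates, and try longer terms first.
    cleaned = sorted({t.strip() for t in terms}, key=lambda t: (-len(t), t))
    alternation = "|".join(re.escape(t) for t in cleaned)
    # The lookarounds require that a match is not glued to other word characters.
    return re.compile(r"(?<!\w)(?:" + alternation + r")(?!\w)", re.IGNORECASE)

PATTERN = build_pattern(ai_tech)

def count_terms(text):
    # Number of non-overlapping, case-insensitive whole-term matches.
    return len(PATTERN.findall(text))

texts = ["1. More details A I Artificial Intelligence",
         "2. NLP works very well these days",
         "3. receiving information at the right time"]
counts = [count_terms(t) for t in texts]
print(counts)
```

With a DataFrame like the one above, this could be applied the same way as func, for example df['counter'] = df['text'].apply(count_terms). Note that a single alternation counts non-overlapping matches, so results can differ from summing per-term counts when terms overlap each other.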