How to split every sentence into individual words, compute the average polarity score per sentence, and append it as a new column in a dataframe?

I can successfully split a sentence into individual words and average the polarity scores of the words with this code. It works fine.

import statistics as s
from textblob import TextBlob

a = TextBlob("""Thanks, I'll have a read!""")
print(a)

c = []
for i in a.words:
    # a.sentiment.polarity is the polarity of the whole blob,
    # appended once for every word in it
    c.append(a.sentiment.polarity)
d = s.mean(c)


This gives:

d = 0.25
a.words = WordList(['Thanks', 'I', "'ll", 'have', 'a', 'read'])

How do I turn the code above into a df like this:

df

     text
1    Thanks, I'll have a read!

but take the average of the polarity of each individual word?

The closest I can get is applying the polarity to every sentence in the df:

from textblob import TextBlob

def sentiment_calc(text):
    # polarity of the whole sentence; None if scoring fails (e.g. non-string input)
    try:
        return TextBlob(text).sentiment.polarity
    except Exception:
        return None

df_sentences['sentiment'] = df_sentences['text'].apply(sentiment_calc)

My impression is that sentiment polarity only works on the TextBlob type.
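
That is easy to verify with a minimal check (the variable name txt is only for illustration):

from textblob import TextBlob

txt = "Thanks, I'll have a read!"
print(hasattr(txt, 'sentiment'))         # False: a plain str has no sentiment attribute
print(TextBlob(txt).sentiment.polarity)  # the whole-sentence score (the 0.25 seen above)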

So my idea is to split the text blob into words (using the split function, see the documentation here) and convert each of them into a TextBlob object. This is done in a list comprehension:

[TextBlob(x).sentiment.polarity for x in a.split()]

So the whole thing looks like this:

import statistics as s
from textblob import TextBlob
import pandas as pd

a = TextBlob("""Thanks, I'll have a read!""")

def compute_mean(a):
    # split on whitespace, score each token as its own TextBlob,
    # then average the per-word polarities
    return s.mean([TextBlob(x).sentiment.polarity for x in a.split()])

print(compute_mean("Thanks, I'll have a read!"))

df = pd.DataFrame({'text': ["Thanks, I'll have a read!",
                            "Second sentence",
                            "a bag of apples"]})

df['score'] = df['text'].map(compute_mean)
print(df)
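
If the individual words should also end up in the dataframe, the same pattern gives a second column. A minimal sketch, continuing from the code above and assuming a column name of 'words' plus the same whitespace split that compute_mean uses:

# assumed extension: keep the split words next to the averaged score
df['words'] = df['text'].map(lambda t: t.split())
print(df[['text', 'words', 'score']])

Using TextBlob(t).words instead would give the WordList tokenization shown in the question, which splits "I'll" into 'I' and "'ll" and drops the punctuation.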