Add a sentiment column to a dataset in R

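I have done some basic sentiment analysis in R and would like to know whether there is a way to score the sentiment of a sentence or row and then append that sentiment as a new column. Every analysis I have run so far gives me either an overall sentiment summary or a list of extracted words, with no link back to the original row of data.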

My input will come from BI software and looks like the following, with a case number and some text:

"12345","I am extremely angry with my service"
"23456","I was happy with how everything turned out"
"34567","The rep did a great job helping me"

I would like it returned as the output below:

"12345","I am extremely angry with my service","Anger"
"23456","I was happy with how everything turned out","Positive"
"34567","The rep did a great job helping me","Positive"

Any pointers in the right direction toward a package or resource would be greatly appreciated!

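The problem you run into with sentences is that sentiment lexicons are word-based. If you look at the nrc lexicon, the word "angry" has three sentiment values: anger, disgust, and negative. Which one do you choose? A single sentence may also match several words in the lexicon. Try testing different lexicons against your text to see what happens, for example with tidytext.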
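To make the word-level ambiguity concrete, here is a minimal base R sketch. It assumes a tiny hand-made lexicon: the three "angry" rows follow the nrc values mentioned above, while the "happy" and "great" rows are made up for illustration and are not taken from any real lexicon. The names `lexicon`, `words`, `matches`, and `hits_per_id` are hypothetical.

```r
# Toy lexicon for illustration only; this is NOT the real nrc data.
# The "angry" rows mirror the values described above; the rest are invented.
lexicon <- data.frame(
  word = c("angry", "angry", "angry", "happy", "great"),
  sentiment = c("anger", "disgust", "negative", "positive", "positive"),
  stringsAsFactors = FALSE
)

text <- data.frame(
  id = c("12345", "23456", "34567"),
  sentence = c("I am extremely angry with my service",
               "I was happy with how everything turned out",
               "The rep did a great job helping me"),
  stringsAsFactors = FALSE
)

# Split each sentence into lowercase words, repeating the case id for every word.
word_list <- strsplit(tolower(text$sentence), "[^a-z]+")
words <- data.frame(
  id = rep(text$id, lengths(word_list)),
  word = unlist(word_list),
  stringsAsFactors = FALSE
)

# Inner join: one output row per (word, lexicon entry) match.
matches <- merge(words, lexicon, by = "word")

# Count lexicon hits per case id.
hits_per_id <- table(matches$id)

print(matches[order(matches$id), c("id", "word", "sentiment")])
print(hits_per_id)
```

With this toy lexicon, case 12345 comes back with three candidate labels for a single word, so you still need a rule for collapsing them into one value per row.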

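If you want a package that can analyze sentiment at the sentence level, take a look at sentimentr. You will not get emotion values such as anger back, but a sentiment/polarity score. More information about sentimentr can be found in the package documentation and on the sentimentr GitHub page.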

A small code example:

library(sentimentr)

# Sample data: a case id plus free text, one sentence per row
text <- data.frame(id = c("12345","23456","34567"),
                   sentence = c("I am extremely angry with my service", "I was happy with how everything turned out", "The rep did a great job helping me"),
                   stringsAsFactors = FALSE)

# Score the text; the result has one row per sentence
sentiment(text$sentence)
#>    element_id sentence_id word_count  sentiment
#> 1:          1           1          7 -0.5102520
#> 2:          2           1          8  0.2651650
#> 3:          3           1          8  0.3535534

# add sentiment score to data.frame
# (this lines up because every row holds exactly one sentence; for rows with
# several sentences, sentiment_by() returns one averaged score per row instead)
text$sentiment <- sentiment(text$sentence)$sentiment

text
#>      id                                   sentence  sentiment
#> 1 12345       I am extremely angry with my service -0.5102520
#> 2 23456 I was happy with how everything turned out  0.2651650
#> 3 34567         The rep did a great job helping me  0.3535534
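
If you need a text label rather than a number, one option is to bucket the polarity score yourself. The sketch below is only an illustration in base R: the scores are copied from the output above so it runs without sentimentr, and the `neutral_band` cut-off of 0.05 and the `label` column name are assumptions of mine, not something sentimentr defines.

```r
# Scores copied from the sentimentr output above, so no package is needed here.
text <- data.frame(
  id = c("12345", "23456", "34567"),
  sentence = c("I am extremely angry with my service",
               "I was happy with how everything turned out",
               "The rep did a great job helping me"),
  sentiment = c(-0.5102520, 0.2651650, 0.3535534),
  stringsAsFactors = FALSE
)

# Hypothetical cut-off: scores within +/- 0.05 of zero are treated as "Neutral".
neutral_band <- 0.05

# Map each polarity score to a coarse label.
text$label <- ifelse(text$sentiment > neutral_band, "Positive",
              ifelse(text$sentiment < -neutral_band, "Negative", "Neutral"))

# Print in the quoted, comma-separated layout used in the question.
write.table(text[, c("id", "sentence", "label")],
            sep = ",", quote = TRUE, row.names = FALSE, col.names = FALSE)
```

Note that the first case ends up as "Negative" rather than "Anger", because a polarity score carries no emotion category. Tune the threshold against your own data before relying on the labels.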