How to count the frequency of specific positive/negative words from a list in a .csv file with text and date values in R?

Question

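I'm trying to extract sentiment from a document that contains messages, the users who wrote them, and dates. I have cleaned both files so that the words in them share a standard format, and then tried to count them. I can count words one at a time (after defining the word explicitly), but not by using the word lists.

The format of file.raw is: text, user_id, date. The format of the positive and negative lists is: id, word_cz, polarity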

file.raw <- read.csv("/Users/tomas/Desktop/Repromeda - Repromeda 3.csv", stringsAsFactors = FALSE)
positive <- read.csv("/Users/tomas/Desktop/positive.txt", stringsAsFactors = FALSE)
negative <- read.csv("/Users/tomas/Desktop/negative.txt", stringsAsFactors = FALSE)

I can count a specific word such as "okay" with this function:

library(stringr)  # provides str_count()

getCount <- function(data, keywords)
{
  # count occurrences of the keyword in each element of data
  wordcount <- str_count(data, keywords)
  return(data.frame(data, wordcount))
}
file.raw$count <- getCount(file.raw$text, "okay")

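But I can't seem to find a way to automate this process using the word lists.

The ideal result would be one added column for the positive count and one for the negative count, filled in for every row.

Thanks for your help.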

Answer 1

How about this?

library(stringr)
data <- "yes i had a great time yesterday having fun but your lame actions were disturbing, ok?"
positive <- c("yes" , "ok", "fun", "great")
negative <- c("lame" , "disturbing", "no") 

sapply(positive, function(x) str_count(data,x))
sapply(negative, function(x) str_count(data,x))
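
Note that str_count(data, "yes") also matches the "yes" inside "yesterday", because the pattern is matched as a substring. The following is a minimal sketch of how the same idea might be applied to the data frame from the question, with one column for the positive count and one for the negative count per row. The helper name count_words, the output column names, and the small stand-in data frames are assumptions for illustration only; the column names text and word_cz are taken from the question. It also assumes the lexicon words are lowercase and contain no regex special characters.

```r
library(stringr)

# Hypothetical stand-ins for file.raw and the two lexicon files
file.raw <- data.frame(
  text = c("yes i had a great time yesterday",
           "your lame actions were disturbing, ok?",
           "no fun at all"),
  user_id = c(1, 2, 1),
  date = c("2020-01-01", "2020-01-02", "2020-01-03"),
  stringsAsFactors = FALSE
)
positive <- data.frame(word_cz = c("yes", "ok", "fun", "great"),
                       stringsAsFactors = FALSE)
negative <- data.frame(word_cz = c("lame", "disturbing", "no"),
                       stringsAsFactors = FALSE)

# Count whole-word matches of any word in `words` for each element of `text`.
# \\b restricts matches to word boundaries, so "yes" does not match "yesterday".
count_words <- function(text, words) {
  pattern <- paste0("\\b(", paste(words, collapse = "|"), ")\\b")
  str_count(tolower(text), pattern)
}

# One count per row, stored as new columns
file.raw$positive_count <- count_words(file.raw$text, positive$word_cz)
file.raw$negative_count <- count_words(file.raw$text, negative$word_cz)

print(file.raw[, c("text", "positive_count", "negative_count")])
```

If a per-word breakdown is wanted instead of a row total, sapply(positive$word_cz, function(w) str_count(file.raw$text, w)) returns a matrix with one row per message and one column per word.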

csv

r

list

count

sentiment-analysis