Polarity calculation in Sentiment Analysis using TextBlob
How does TextBlob's PatternAnalyzer calculate the polarity of the words in a sentence?
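TextBlob's NaiveBayesAnalyzer (as opposed to the default PatternAnalyzer) uses a Naive Bayes classifier for sentiment analysis, and that classifier is in turn the one provided by NLTK.
See the TextBlob sentiment analyzer code (the textblob.en.sentiments module):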
    @requires_nltk_corpus
    def train(self):
        """Train the Naive Bayes classifier on the movie review corpus."""
        super(NaiveBayesAnalyzer, self).train()
        neg_ids = nltk.corpus.movie_reviews.fileids('neg')
        pos_ids = nltk.corpus.movie_reviews.fileids('pos')
        neg_feats = [(self.feature_extractor(
            nltk.corpus.movie_reviews.words(fileids=[f])), 'neg') for f in neg_ids]
        pos_feats = [(self.feature_extractor(
            nltk.corpus.movie_reviews.words(fileids=[f])), 'pos') for f in pos_ids]
        train_data = neg_feats + pos_feats
        #### THE CLASSIFIER USED IS NLTK's NAIVE BAYES #####
        self._classifier = nltk.classify.NaiveBayesClassifier.train(train_data)

    def analyze(self, text):
        """Return the sentiment as a named tuple of the form:
        ``Sentiment(classification, p_pos, p_neg)``
        """
        # Lazily train the classifier
        super(NaiveBayesAnalyzer, self).analyze(text)
        tokens = word_tokenize(text, include_punc=False)
        filtered = (t.lower() for t in tokens if len(t) >= 3)
        feats = self.feature_extractor(filtered)
        #### USE PROB_CLASSIFY method of NLTK classifier #####
        prob_dist = self._classifier.prob_classify(feats)
        return self.RETURN_TYPE(
            classification=prob_dist.max(),
            p_pos=prob_dist.prob('pos'),
            p_neg=prob_dist.prob("neg")
        )
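To make the preprocessing in `analyze` concrete, here is a minimal, standalone sketch of the tokenize, filter, and feature-extraction steps. It does not import TextBlob: `simple_tokenize` is a hypothetical regex stand-in for `word_tokenize(text, include_punc=False)`, and the extractor is assumed to be a plain bag of words that maps each word to `True`.

```python
import re


def simple_tokenize(text):
    # Hypothetical stand-in for word_tokenize(text, include_punc=False):
    # keeps runs of letters, digits, and apostrophes, and drops punctuation.
    return re.findall(r"[A-Za-z0-9']+", text)


def extract_features(text):
    tokens = simple_tokenize(text)
    # Same filter as in analyze(): lowercase, keep tokens of 3+ characters.
    filtered = (t.lower() for t in tokens if len(t) >= 3)
    # Assumed bag-of-words extractor: every remaining word becomes a True feature.
    return {word: True for word in filtered}


feats = extract_features("It is a great movie, I loved it!")
print(feats)  # {'great': True, 'movie': True, 'loved': True}
```

Short words such as "it", "is", and "a" are dropped by the length filter, so only the longer words reach the classifier as features.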
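The source of NLTK's Naive Bayes classifier is in the nltk.classify.naivebayes module. Its prob_classify method returns the probability distribution that TextBlob's sentiment analyzer uses for the result it returns: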
def prob_classify(self, featureset):
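As a rough illustration of how a Naive Bayes classifier can turn a feature set into a probability distribution over 'pos' and 'neg', here is a small pure-Python sketch. It is not NLTK's implementation: the function names mirror the ones above only for readability, the add-one smoothing is an assumption, and the four training examples are made up for the demo.

```python
import math
from collections import Counter, defaultdict


def train(labeled_featuresets):
    """Count labels and per-label feature occurrences from (features, label) pairs."""
    label_counts = Counter()
    feature_counts = defaultdict(Counter)
    for feats, label in labeled_featuresets:
        label_counts[label] += 1
        for name in feats:
            feature_counts[label][name] += 1
    return label_counts, feature_counts


def prob_classify(model, feats):
    """Return {label: P(label | feats)} via Bayes' rule with add-one smoothing."""
    label_counts, feature_counts = model
    total = sum(label_counts.values())
    log_scores = {}
    for label, n in label_counts.items():
        score = math.log(n / total)  # log prior P(label)
        for name in feats:
            # Smoothed estimate of P(feature present | label); the smoothing is an assumption.
            score += math.log((feature_counts[label][name] + 1) / (n + 2))
        log_scores[label] = score
    # Normalize in log space so the probabilities sum to 1.
    top = max(log_scores.values())
    exp_scores = {label: math.exp(s - top) for label, s in log_scores.items()}
    z = sum(exp_scores.values())
    return {label: v / z for label, v in exp_scores.items()}


# Made-up toy training data, standing in for the movie review corpus.
train_data = [
    ({"great": True, "movie": True}, "pos"),
    ({"loved": True, "acting": True}, "pos"),
    ({"boring": True, "movie": True}, "neg"),
    ({"terrible": True, "plot": True}, "neg"),
]
model = train(train_data)
dist = prob_classify(model, {"great": True, "movie": True})
classification = max(dist, key=dist.get)
print(classification, dist["pos"], dist["neg"])
```

With this toy data, "great" appears only in a positive example while "movie" appears once in each class, so the positive class ends up with about two thirds of the probability mass. The `classification`, `p_pos`, and `p_neg` fields of the result correspond to the most probable label and the two class probabilities.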