Vader lexicon results don't add up to 1.0
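My data consists of tweets from Stocktwits, and I am trying to run sentiment analysis on them with the Vader library in Python.
The problem is that the pos, neu, and neg fields do not add up to 1.0. Instead, they add up to 2.0: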
{'neg': 0.0, 'neu': 2.0, 'pos': 0.0, 'compound': 0.0}
Is this normal?
Yes, this is normal. The example in the docs shows similar results:
VADER is smart, handsome, and funny.----------------------------- {'pos': 0.746, 'compound': 0.8316, 'neu': 0.254, 'neg': 0.0}
VADER is smart, handsome, and funny!----------------------------- {'pos': 0.752, 'compound': 0.8439, 'neu': 0.248, 'neg': 0.0}
...
VADER is not smart, handsome, nor funny.------------------------- {'pos': 0.0, 'compound': -0.7424, 'neu': 0.354, 'neg': 0.646}
The pos, neu, and neg scores are ratios for proportions of text that fall in each category (so these should all add up to be 1... or close to it with float operation). These are the most useful metrics if you want multidimensional measures of sentiment for a given sentence.
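As a rough way to test the "should add up to 1" statement on your own output, here is a minimal sketch. The helper name `proportions_sum_to_one` and the tolerance are assumptions for illustration and are not part of the Vader library. It works on a plain dict shaped like the ones shown above, so it does not need Vader installed.

```python
import math

def proportions_sum_to_one(scores, tolerance=0.002):
    """Return True if pos + neu + neg is 1.0 within a small tolerance.

    The tolerance is an assumed value: each score is printed with three
    decimals, so a little rounding slack is allowed.
    """
    total = scores["pos"] + scores["neu"] + scores["neg"]
    return math.isclose(total, 1.0, abs_tol=tolerance)

# Two of the documentation examples quoted above.
doc_positive = {"pos": 0.746, "compound": 0.8316, "neu": 0.254, "neg": 0.0}
doc_negative = {"pos": 0.0, "compound": -0.7424, "neu": 0.354, "neg": 0.646}

# The dict posted in the question, whose three fields sum to 2.0.
from_question = {"neg": 0.0, "neu": 2.0, "pos": 0.0, "compound": 0.0}

print(proportions_sum_to_one(doc_positive))
print(proportions_sum_to_one(doc_negative))
print(proportions_sum_to_one(from_question))
```

Running this over every scored tweet is a quick way to see which rows deviate from the documented behaviour.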
You probably want to use the compound score:
The compound score is computed by summing the valence scores of each word in the lexicon, adjusted according to the rules, and then normalized to be between -1 (most extreme negative) and +1 (most extreme positive). This is the most useful metric if you want a single unidimensional measure of sentiment for a given sentence. Calling it a 'normalized, weighted composite score' is accurate.
It is also useful for researchers who would like to set standardized thresholds for classifying sentences as either positive, neutral, or negative.
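To illustrate the thresholding idea, here is a minimal sketch that labels a sentence from its compound score. The function name `classify_compound` is hypothetical, and the ±0.05 cutoff is the commonly cited default from the Vader documentation; treat it as an assumption and tune it for your own data. In practice the score would come from something like `SentimentIntensityAnalyzer().polarity_scores(text)["compound"]`; plain numbers are used here so the sketch runs without the library.

```python
def classify_compound(compound, threshold=0.05):
    """Map a compound score in [-1, 1] to a coarse sentiment label.

    threshold is an assumed cutoff: scores at or above it count as
    positive, scores at or below its negative count as negative,
    and everything in between is treated as neutral.
    """
    if compound >= threshold:
        return "positive"
    if compound <= -threshold:
        return "negative"
    return "neutral"

# Compound scores taken from the outputs shown earlier in this thread.
print(classify_compound(0.8316))   # positive
print(classify_compound(-0.7424))  # negative
print(classify_compound(0.0))      # neutral
```

With this cutoff, the tweet from the question (compound 0.0) would be labelled neutral.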