Difference between Laplace estimate and Expected likelihood estimate?
I'm doing sentiment-analysis research in Python, and at the moment I'm a bit confused by nltk.probability.
What is the difference between the Laplace estimate and the expected likelihood estimate?
Which of the two is the appropriate smoothing technique for sentiment-analysis research?
Here are the definitions from the NLTK documentation:
The Laplace estimate for the probability distribution of the experiment used to generate a frequency distribution. The "Laplace estimate" approximates the probability of a sample with count c from an experiment with N outcomes and B bins as (c+1)/(N+B). This is equivalent to adding one to the count for each bin, and taking the maximum likelihood estimate of the resulting frequency distribution.

The expected likelihood estimate for the probability distribution of the experiment used to generate a frequency distribution. The "expected likelihood estimate" approximates the probability of a sample with count c from an experiment with N outcomes and B bins as (c+0.5)/(N+B/2). This is equivalent to adding 0.5 to the count for each bin, and taking the maximum likelihood estimate of the resulting frequency distribution.
When there is a large number of unseen possible events, the Laplace technique assigns almost all of the probability mass to the unseen data. ELE compensates for this by using a smaller alpha of 0.5, so that less probability mass is assigned to the unseen events.
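The two definitions quoted above are both instances of Lidstone smoothing, with alpha = 1 and alpha = 0.5 respectively. A minimal sketch in plain Python (no nltk needed); the values of N, B, and the word counts below are made-up illustration numbers, not NLTK output:

```python
def laplace(c, N, B):
    """Laplace estimate: add 1 to each bin -> (c + 1) / (N + B)."""
    return (c + 1) / (N + B)

def ele(c, N, B):
    """Expected likelihood estimate: add 0.5 -> (c + 0.5) / (N + B / 2)."""
    return (c + 0.5) / (N + B / 2)

# Hypothetical corpus: N = 50 observed tokens drawn from B = 1000 possible
# word types, of which only 10 distinct types were actually seen.
N, B, seen_types = 50, 1000, 10
unseen = B - seen_types

# Probability of one word that occurred 5 times:
print(laplace(5, N, B))   # 6 / 1050  ~ 0.00571
print(ele(5, N, B))       # 5.5 / 550 = 0.01

# Total probability mass handed to the 990 unseen words:
print(unseen * laplace(0, N, B))  # 990 / 1050 ~ 0.943
print(unseen * ele(0, N, B))      # 495 / 550  = 0.900
```

With many unseen bins, both estimates give most of the mass to unseen words, but ELE keeps noticeably more of it on the words that were actually observed, which is the compensation described above.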
See here for more details.